1 Reply Latest reply on Apr 9, 2009 2:22 PM by salparadise

    node status interval

      (I apologize if this has been addressed)

      Hello,
      We have a simple alert that sends an email when it detects that a node is down. Recently some of our servers have rebooted without triggering this alert. Further investigation shows that NPM never detected the nodes as down. We ran a test capturing ICMP packets from the NPM server and noticed that the gap between consecutive pings was either ~3 minutes or ~7 seconds, which doesn't match the polling intervals we have configured in our NPM settings.

      Based on our settings, we should get an alert in less than a minute when a server stops responding.

      Here is a sample of when the ICMP packets were sent from the SolarWinds server to one node:

      13:01 icmp> SELECT timestamp FROM PING WHERE src = '172.16.130.81' LIMIT 30;
      +-----------+
      | timestamp |
      +-----------+
      | 00:55:25  |
      | 00:55:26  |
      | 00:56:13  |
      | 00:56:24  |
      | 00:59:50  |
      | 00:59:59  |
      | 01:02:46  |
      | 01:02:56  |
      | 01:06:09  |
      | 01:06:26  |
      | 01:09:04  |
      | 01:09:25  |
      | 01:12:13  |
      | 01:12:24  |
      | 01:15:14  |
      | 01:15:26  |
      | 01:18:24  |
      | 01:18:36  |
      | 01:21:18  |
      | 01:21:31  |
      | 01:23:55  |
      | 01:24:23  |
      | 01:27:16  |
      | 01:27:24  |
      | 01:30:33  |
      | 01:30:45  |
      | 01:33:18  |
      | 01:33:27  |
      | 01:36:39  |
      | 01:36:36  |
      +-----------+
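
      In case it helps anyone reading the capture, here is a rough Python sketch of how the spacing between consecutive pings can be worked out from the timestamps above. It only uses the first ten rows, and the function name and the 60-second cutoff between "short" and "long" gaps are my own assumptions for illustration, not anything taken from NPM.

```python
from datetime import datetime

# First ten timestamps copied from the capture above (HH:MM:SS).
timestamps = [
    "00:55:25", "00:55:26", "00:56:13", "00:56:24", "00:59:50",
    "00:59:59", "01:02:46", "01:02:56", "01:06:09", "01:06:26",
]


def gaps_in_seconds(stamps):
    """Return the number of seconds between each consecutive pair of timestamps."""
    parsed = [datetime.strptime(s, "%H:%M:%S") for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]


gaps = gaps_in_seconds(timestamps)

# Assumed cutoff (hypothetical): anything of 60 seconds or more counts as a long gap.
CUTOFF_SECONDS = 60
long_gaps = [g for g in gaps if g >= CUTOFF_SECONDS]
short_gaps = [g for g in gaps if g < CUTOFF_SECONDS]

print("all gaps (s):", gaps)
print("long gaps (s):", long_gaps)
print("short gaps (s):", short_gaps)
```

      For these ten rows the long gaps come out at roughly three minutes each, which is the window in which a reboot could go unnoticed.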

       

      I am monitoring 600 devices, in case that is relevant.

      Attaching my server settings.