8 Replies Latest reply on Mar 6, 2017 11:56 AM by deverts

    Node isn't down

    boballen

      We have a recurring situation where a specific node is reported as down by NPM even though it isn't. The node is one of 2 Checkpoint firewalls recently installed, both located behind another firewall. The problem occurs on only one of the Checkpoints, not the other.

       

      While this doesn't happen every day, when it does happen it occurs around 15 to 20 minutes after 3 PM. On some days it recurs 4 hours later (after 7 PM), and occasionally 4 hours after that (after 11 PM).

       

      We have logs from the firewall that these nodes are behind, and we can see the pings from NPM. They don't seem to match the pattern of pings that we would expect from NPM pinging a down node. The "Node Status Polling" for these nodes is set to 60 seconds.

       

      From the firewall logs, we can observe for yesterday:

      - A ping is being sent every minute to both nodes

      - At 3:17 PM, an extra ping is sent to one node 15 seconds after the last ping

      - The next ping is sent on schedule

      - 30 seconds later another extra ping is sent to the node

      - No more pings are sent until 3 minutes after the last (extra) ping, at which time NPM reports the node as down

      - A minute later another ping is sent and NPM reports the node as up

      - At 7:18 PM, an extra ping is sent again to the node 28 seconds after the last ping

      - The next ping is sent on schedule

      - 33 seconds later another extra ping is sent to the node

      - No more pings are sent until 3 minutes and 27 seconds after the last ping, at which time NPM reports the node as down

      - A minute later another ping is sent and NPM reports the node as up
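
For anyone who wants to check their own firewall log for the same pattern, the gaps between consecutive pings can be labelled with a short script. This is only a minimal sketch in Python: the function name `classify_gaps` and the 5-second tolerance are assumptions made for illustration, and the timestamps are hypothetical, with the exact seconds reconstructed from the 3:17 PM sequence described above rather than copied from the real log.

```python
from datetime import datetime


def classify_gaps(timestamps, interval=60, tolerance=5):
    """Label the gap before each ping as 'normal', 'short', or 'long'.

    interval  -- expected seconds between pings (60 s status polling)
    tolerance -- allowed jitter in seconds (an assumed value)
    """
    results = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        gap = (curr - prev).total_seconds()
        if gap < interval - tolerance:
            label = "short"   # ping arrived earlier than the polling interval
        elif gap > interval + tolerance:
            label = "long"    # one or more expected pings never arrived
        else:
            label = "normal"
        results.append((curr, gap, label))
    return results


# Hypothetical timestamps modelled on the 3:17 PM sequence in the list above.
raw = [
    "15:16:00",  # regular ping
    "15:17:00",  # regular ping
    "15:17:15",  # extra ping, 15 seconds after the last one
    "15:18:00",  # next ping, on schedule
    "15:18:30",  # another extra ping, 30 seconds later
    "15:21:30",  # 3 minutes of silence, node reported down
    "15:22:30",  # a minute later, node reported up
]
pings = [datetime.strptime(t, "%H:%M:%S") for t in raw]

for ts, gap, label in classify_gaps(pings):
    print(ts.strftime("%H:%M:%S"), int(gap), label)
```

A "short" gap marks an extra ping or the scheduled ping that follows one, and a "long" gap marks the stretch where no pings were sent at all. Running this against both Checkpoints' entries would show whether only the affected node has such gaps.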

       

      This node was not down at either of these times, and the other Checkpoint is pinged continuously every minute.

       

      What is going on here?