3 Replies Latest reply on Nov 1, 2013 6:14 AM by svindler

    VMware guests reboot too fast for ICMP polling and alerting

    ahamilton@mcclatchy.com

      We have a mandate from upper management to know when every single node reboots. We had hoped this would be accomplished through standard ICMP polling and alerting. The default polling values seem to work well for most, if not all, of our HP and Dell physical servers, as well as our network devices (switches, routers), but we've noticed that guest machines in our VMware environment can reboot too fast to trigger an alert with the standard polling values:

       

      Default Node Poll Interval -- 120 seconds

      Node Warning Level  -- 120 seconds

       

      We've been discussing lowering these polling values, but in a few tests we ran earlier, VMware guests rebooted within 60 seconds. Changing the ICMP values therefore does not seem appropriate: a shorter interval could still miss a reboot that fast, so all we may be doing is increasing the load on the SolarWinds polling engine and creating false positives.
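
      To put a rough number on that, here is a minimal sketch in Python. It assumes polls are evenly spaced, that the reboot starts at a random point between polls, and that the guest ignores ICMP for the entire reboot window. The function name is made up for illustration, and this only models whether a single poll lands inside the outage, not how Orion moves a node from Warning to Down.

```python
def detection_probability(poll_interval_s: float, downtime_s: float) -> float:
    """Chance that at least one evenly spaced poll falls inside the outage.

    Assumption: the outage starts at a uniformly random offset within a
    poll cycle, so the chance is downtime / interval, capped at 1.0.
    """
    if poll_interval_s <= 0:
        raise ValueError("poll_interval_s must be positive")
    if downtime_s < 0:
        raise ValueError("downtime_s must not be negative")
    return min(1.0, downtime_s / poll_interval_s)


if __name__ == "__main__":
    # Default 120 s interval versus a guest that is unreachable for 60 s.
    print(detection_probability(120, 60))
    # Halving the interval doubles the polling load for the same node count.
    print(detection_probability(60, 60))
```

      Under those assumptions, a 60-second outage is seen by a 120-second poll only about half the time, and a guest that is back in 30 seconds would be seen about a quarter of the time.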

       

      So, to avoid changing the ICMP polling values and still get alerts on reboots, we've also created an advanced alert using the SNMP value for "Last Boot". This again works for most servers and network devices. However, we are seeing cases where VMware guests do not populate the field with the correct date: it can be days or weeks off from the actual last reboot of that server. Any ideas as to why this is happening? I've seen a few similar threads posted here, one specifically about VMware, so I'm not sure whether I should just open a ticket or whether this is a known bug.
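
      For reference, this is roughly the check we are after, sketched in Python. It assumes the uptime comes from an SNMP counter in TimeTicks (hundredths of a second, 32-bit), such as sysUpTime. How Orion actually derives "Last Boot" is not something I have confirmed, and the function names and the 5-second slack are hypothetical.

```python
from datetime import datetime, timedelta, timezone

TICKS_PER_SECOND = 100   # SNMP TimeTicks are hundredths of a second
TICKS_WRAP = 2 ** 32     # TimeTicks is a 32-bit value and wraps around


def last_boot_from_uptime(sample_time: datetime, uptime_ticks: int) -> datetime:
    """Derive a boot time by subtracting the reported uptime from the poll time."""
    return sample_time - timedelta(seconds=uptime_ticks / TICKS_PER_SECOND)


def rebooted_between(prev_ticks: int, curr_ticks: int,
                     elapsed_s: float, slack_s: float = 5.0) -> bool:
    """Return True if the counter did not advance as expected between two polls.

    The expected value is the previous sample plus the elapsed time, taken
    modulo 2**32 so that a normal counter wrap is not reported as a reboot.
    """
    expected = (prev_ticks + round(elapsed_s * TICKS_PER_SECOND)) % TICKS_WRAP
    diff = (curr_ticks - expected) % TICKS_WRAP
    distance = min(diff, TICKS_WRAP - diff)   # shortest distance on the 32-bit ring
    return distance > slack_s * TICKS_PER_SECOND


if __name__ == "__main__":
    polled_at = datetime(2013, 11, 1, 6, 0, 0, tzinfo=timezone.utc)
    print(last_boot_from_uptime(polled_at, 360_000))
    print(rebooted_between(1_000_000, 3_000, elapsed_s=120))
```

      Comparing two consecutive uptime samples this way would catch a reboot even if it finished between ICMP polls, provided the counter being read really resets when the guest reboots.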


      SolarWinds Orion versions: NPM 10.6, NTA 3.11.0, SAM 6.0.0, IPAM 4.0