Open for Voting
over 1 year ago

False reboot alerts due to SNMP service restart & Proper uptime information

Reboot information is tied to the SNMP service/daemon, and it should not work like this. The SNMP service/daemon can crash or be restarted for many reasons (which is what happens most of the time), so it is not a reliable source for tracking reboots. We need another solution for this out of the box (maybe a different OID). There are plenty of threads about this issue; some of them are listed below.

linux snmpd restart during logrotate triggers false reboot alert weekly

Better Method of Calculating Uptime

Re: Misleading SNMP Uptime information
Re: events with "...nodename REBOOTED AT...date time"
Uptime SNMP and reboot messages

is this normal? restart SNMP SERVICE and get a REBOOT event

False Uptime/Reboot readings

  • Got a false reboot alert on Cisco 5K switches. It's strange, and very critical when we get a reboot alert from the core segment.

  • Hi,

    I used the approach below to set up the alert; you can try it too.

    pastedImage_0.png

  • Please, oh please, SolarWinds! These are the voices of engineers, administrators, and users alike, and this would be a gift! I've been fighting this forever. Big upvote from me!

  • There is already an option to customize CPU, memory, and node details. A custom reboot OID should be available in the pollers options too.

  • I doubt that a mere update to net-snmp will change the limitation. The OID for hrSystemUptime is defined as shown below. The value for TimeTicks is defined as a 32-bit integer (see the link to RFC2578 in my post above).

    From /usr/share/snmp/mibs/HOST-RESOURCES-MIB.txt:

    hrSystemUptime OBJECT-TYPE
        SYNTAX     TimeTicks
        MAX-ACCESS read-only
        STATUS     current
        DESCRIPTION
            "The amount of time since this host was last
            initialized.  Note that this is different from
            sysUpTime in the SNMPv2-MIB [RFC1907] because
            sysUpTime is the uptime of the network management
            portion of the system."
        ::= { hrSystem 1 }

    [ This text is from a net-snmp 5.7.2 install ]

    To raise the ceiling of TimeTicks, the type used in the definition would need to be changed to something with a larger range (such as Counter64!).

    You might be better off exploring the "lastboot" SQL posted by stripet @ Nov 18, 2016 12:54 PM. Another alternative might be to have snmpd run a script when a particular OID is queried; see here:

    http://net-snmp.sourceforge.net/wiki/index.php/Tut:Extending_snmpd_using_shell_scripts
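The distinction the MIB text draws between sysUpTime (agent uptime) and hrSystemUptime (host uptime) suggests one way a poller could tell an agent restart from a real reboot: compare both counters across two polls. The following is only a minimal sketch of that idea, not what Orion does; the function name `classify_poll`, the `max_gap_ticks` parameter, and the returned labels are all hypothetical.

```python
TIMETICKS_MODULUS = 2 ** 32  # TimeTicks is a 32-bit value in hundredths of a second


def classify_poll(prev_sys, cur_sys, prev_hr, cur_hr, max_gap_ticks=60 * 60 * 100):
    """Classify what happened between two polls (hypothetical helper).

    prev_sys / cur_sys: sysUpTime samples in TimeTicks (agent uptime).
    prev_hr / cur_hr:   hrSystemUptime samples in TimeTicks (host uptime).
    max_gap_ticks:      assumed upper bound on the time between polls,
                        used to tell a 32-bit wrap from a genuine reset.
    """

    def went_back(prev, cur):
        if cur >= prev:
            return False
        # A counter that was close to the ceiling and is now small
        # has most likely wrapped rather than been reset.
        wrapped = prev + max_gap_ticks >= TIMETICKS_MODULUS and cur < max_gap_ticks
        return not wrapped

    if went_back(prev_hr, cur_hr):
        return "host-reboot"    # host uptime itself was reset
    if went_back(prev_sys, cur_sys):
        return "agent-restart"  # only the SNMP agent's uptime was reset
    return "no-change"
```

With this sketch, a poll where only sysUpTime drops (for example after logrotate restarts snmpd) is labeled "agent-restart" instead of raising a reboot event. It assumes the device actually exposes hrSystemUptime, which not every agent does.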

  • The SNMP agent being cycled is not the explanation here. I tested by restarting the SNMP service, and that is not the cause of the issue. I think something else is going wrong. Can you please advise what Orion checks on an ESXi node to raise the reboot alert?

    thanks

    k

  • You can modify or disable the Alert called "Alert me when a node Lastboot changes" AND Disable EventType 14 (Node Rebooted) events by running the following SQL Query:

     

    UPDATE EventTypes SET Record=0 WHERE EventType = 14

    I tested this, and the query above does indeed stop the pesky alerts. They were confusing my end users because they say the node rebooted when it did not; the SNMP agent was merely cycled.

  • Is this possible with a recent net-snmp version such as 5.7? If you can use that version on a Linux box, it might resolve this issue.

    Please test and let us know whether the problem is resolved.

  • To my knowledge, there is no solution for this. The limitation is programmatic: the integer is limited to 2^32 timeticks.
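To put the 2^32 limit in concrete terms: TimeTicks counts hundredths of a second, so the counter rolls over after roughly 497 days. The arithmetic can be sketched as below; the function names are made up for illustration.

```python
TICKS_PER_SECOND = 100       # TimeTicks are hundredths of a second
TIMETICKS_MODULUS = 2 ** 32  # 32-bit counter ceiling
SECONDS_PER_DAY = 86400


def wrap_period_days():
    """Days until a TimeTicks counter rolls over to zero."""
    return TIMETICKS_MODULUS / TICKS_PER_SECOND / SECONDS_PER_DAY


def reported_uptime_days(real_uptime_days):
    """Uptime a 32-bit TimeTicks value would show for a given real uptime."""
    ticks = int(real_uptime_days * SECONDS_PER_DAY * TICKS_PER_SECOND)
    return (ticks % TIMETICKS_MODULUS) / TICKS_PER_SECOND / SECONDS_PER_DAY


print(f"wrap after about {wrap_period_days():.1f} days")
print(f"a host up 500 days reports about {reported_uptime_days(500):.1f} days")
```

A host that has really been up for 500 days would therefore report only a few days of uptime, which a poller could mistake for a reboot.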

  • Could you please let us know what you are waiting for?