All my Linux machines run logrotate between 4am and 5am on Sundays. During this run, snmpd is restarted to free the lock on the log file. Unfortunately, NPM believes a reboot has occurred whenever snmpd is restarted because sysUpTime.0 is reset. Is there a way to mitigate this false reboot alert?
/sbin/service snmpd condrestart 2> /dev/null > /dev/null || true
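To illustrate why the restart looks like a reboot, here is a minimal sketch that assumes the poller derives "last boot" as poll time minus sysUpTime. That derivation is an assumption made for illustration, not a description of how NPM computes it, and `last_boot_from_uptime` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

TICKS_PER_SECOND = 100  # SNMP TimeTicks are hundredths of a second


def last_boot_from_uptime(poll_time: datetime, uptime_ticks: int) -> datetime:
    """Estimate a 'last boot' time as poll time minus the reported uptime.

    Hypothetical helper: assumes the monitoring tool derives last boot this way.
    """
    return poll_time - timedelta(seconds=uptime_ticks / TICKS_PER_SECOND)


# Saturday poll: the SNMP agent has been running for six days.
saturday = datetime(2024, 1, 6, 12, 0, tzinfo=timezone.utc)
before = last_boot_from_uptime(saturday, 6 * 24 * 3600 * TICKS_PER_SECOND)

# Sunday poll, five minutes after logrotate restarted snmpd at 04:30.
sunday = datetime(2024, 1, 7, 4, 35, tzinfo=timezone.utc)
after = last_boot_from_uptime(sunday, 5 * 60 * TICKS_PER_SECOND)

# The derived last boot jumps forward even though the host never rebooted.
print("last boot before restart:", before)
print("last boot after restart: ", after)
print("looks like a reboot:", after > before)
```

Because sysUpTime.0 counts from when the SNMP agent started rather than from when the host booted, any agent restart moves the derived value forward.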
You should modify or disable the alert called "Alert me when a node Lastboot changes", and to disable EventType 14 (Node Rebooted) events, run the following SQL query:
UPDATE EventTypes SET Record=0 WHERE EventType = 14
I tested this, and the query above does indeed stop the pesky alerts. They were confusing my end users because they said the node had rebooted when it had not; only the SNMP agent had been cycled.
I was getting this false reading because SW uses SNMP to check uptime status. I didn't want to fall back to ICMP only, because SNMP provides so much useful data. I created a group for my Linux boxes by labeling them Linux in the department field, then excluded them from my node down alert by leaving them out of the alert's scope. That lets me use SNMP for monitoring without getting the node down alert. I added my department name to the alert scope on all other jobs to ensure I'm still getting alerts for all my Linux servers. Just to be clear, this didn't resolve the issue but rather went around it. I still don't see the accurate uptime for my Linux servers. I go old school and just manage them like our grandparents did: LOGGING IN.
For Linux boxes, create an UnDP poller and use the hrSystemUptime OID:
It will be a pain to set up because you can't choose what SolarWinds uses to poll uptime (note to SW: make this an option please). What we did was create a custom node property called hrSystemUptime (boolean), then created two alerts: one that uses LastBoot (the default) and one that uses hrSystemUptime. For LastBoot reloads it checks to make sure hrSystemUptime isn't set, and for hrSystemUptime reloads it makes sure it is set.
So you have to manually select which ones you'll monitor with hrSystemUptime and apply the UnDP poller, but it will give an accurate system uptime.
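The two-alert arrangement described above can be sketched roughly as follows. This is only an illustration of the decision logic, not SolarWinds alert syntax: the function names and arguments are hypothetical, and it assumes each counter is compared between two consecutive polls.

```python
def classify_uptime_change(prev_sys: int, cur_sys: int,
                           prev_hr: int, cur_hr: int) -> str:
    """Classify what happened between two polls (all values in TimeTicks).

    prev_sys/cur_sys: sysUpTime, which restarts with the SNMP agent.
    prev_hr/cur_hr:   hrSystemUptime, which restarts only with the host.
    Hypothetical helper for illustration only.
    """
    if cur_hr < prev_hr:
        return "host rebooted"
    if cur_sys < prev_sys:
        return "agent restarted"
    return "no change"


def should_alert(uses_hr_system_uptime: bool,
                 last_boot_changed: bool,
                 hr_uptime_dropped: bool) -> bool:
    """Mirror the two alerts: each node is covered by exactly one of them.

    uses_hr_system_uptime stands in for the boolean custom node property.
    """
    if uses_hr_system_uptime:
        # hrSystemUptime alert: ignores LastBoot changes on flagged nodes.
        return hr_uptime_dropped
    # Default LastBoot alert: applies only when the property is not set.
    return last_boot_changed


# snmpd restarted by logrotate: sysUpTime drops, hrSystemUptime keeps counting.
print(classify_uptime_change(60_000_000, 30_000, 60_000_000, 60_030_000))
# Flagged Linux node whose LastBoot changed but whose host uptime did not drop.
print(should_alert(True, last_boot_changed=True, hr_uptime_dropped=False))
```

One caveat: TimeTicks is a 32-bit counter in hundredths of a second, so it wraps after roughly 497 days, and a simple "value dropped" check like this one would read that wrap as a reboot.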
If anyone has found any better ideas please let us all know 🙂
I'd be willing to bet this was why I used to get reboot messages from ShoreTel switches every night. We ultimately turned off the alert because we knew they weren't reboots but couldn't figure out what was causing the misinterpretation.