Got an alert on one of our DNS/DHCP Servers (InfoBlox). When I went to look at it everything was up. However, I never got an up email message. Anyone know why?
No, the "lastReboot" value changed (not necessarily because the device rebooted, most likely a counter rollover) which triggered the alert due to "last boot has changed".
I don't think your device actually went down, or if it did, it was too quick for NPM to notice.
But yes, clear the alert, and reconfigure as mentioned above.
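As a rough illustration of the counter rollover mentioned above: this sketch assumes the last boot time is derived from the SNMP sysUpTime value, a 32-bit counter in hundredths of a second. That derivation is an assumption for illustration, and the function names are hypothetical rather than anything from NPM itself.

```python
from datetime import datetime, timedelta

TIMETICKS_PER_SECOND = 100   # SNMP TimeTicks are hundredths of a second
COUNTER_MAX = 2 ** 32        # a 32-bit counter wraps back to 0 at this value


def rollover_days():
    """Days of uptime before a 32-bit TimeTicks counter wraps."""
    return COUNTER_MAX / TIMETICKS_PER_SECOND / 86400


def inferred_last_boot(poll_time, sys_uptime_ticks):
    """Hypothetical helper: last boot = poll time minus reported uptime."""
    return poll_time - timedelta(seconds=sys_uptime_ticks / TIMETICKS_PER_SECOND)


# Two polls, two seconds apart, on either side of the wrap.
poll = datetime(2024, 1, 1, 12, 0, 0)
before_wrap = inferred_last_boot(poll, COUNTER_MAX - 100)
after_wrap = inferred_last_boot(poll + timedelta(seconds=2), 100)

print(round(rollover_days(), 1))   # roughly 497 days
print(before_wrap != after_wrap)   # the inferred boot time jumps without a reboot
```

If the value is calculated this way, a device that has been up for about 497 days would appear to have a new last boot time even though it never restarted.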
Please share screenshots of your configuration for the alert you received.
Here is the alarm status and the alarm rule reset
Are you sure that the LastBoot value had not changed? This can happen even though the node didn't reboot, for example if the SNMP service restarts.
It looks from your screenshot like the alert is still triggered (even though the node is up) - this would be why you haven't received a reset email.
It's generally a good idea to keep each alert to as few criteria as possible, so I would suggest splitting the up/down alert and the reboot alert into two different alert configs:
Alert one:
Trigger
Node status is equal to down
Reset
Node status is equal to up (or slightly more useful - Node status is not equal to down)
and the reboot alert (alert two):
Lastboot has changed
(no reset for reboot alert)
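To show how the two separate alerts behave, here is a minimal sketch in Python. It is not NPM's alert engine; the Poll class and the function names are made up for illustration, and it simply applies the trigger and reset conditions listed above to a sequence of polls.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    status: str      # "up" or "down", as seen by the poller
    last_boot: str   # last boot value reported at this poll


def evaluate(polls):
    """Apply both alerts to consecutive polls and return the events raised."""
    events = []
    down_active = False
    for prev, cur in zip(polls, polls[1:]):
        # Alert one: trigger on down, reset on anything other than down.
        if not down_active and cur.status == "down":
            down_active = True
            events.append("down alert triggered")
        elif down_active and cur.status != "down":
            down_active = False
            events.append("down alert reset")
        # Alert two: last boot has changed, with no reset condition.
        if prev.last_boot != cur.last_boot:
            events.append("reboot alert triggered")
    return events


# Last boot changes while the node is never seen down (the case in this thread).
print(evaluate([Poll("up", "A"), Poll("up", "B")]))
# The node is seen down, then comes back with a new last boot value.
print(evaluate([Poll("up", "A"), Poll("down", "A"), Poll("up", "B")]))
```

In the first case only the reboot alert fires, so there is no up/down alert left waiting for a reset that never arrives.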
Hope this helps
Maybe it's me being stupid, but the issue is you never got an up email?
Well you wouldn't....
You currently have it set to send the up alert only when the node is back up and the last boot changes. The last boot will change when the node goes down, say, but when the node comes back up it won't have changed again since the down email. Therefore it wouldn't trigger the reset condition, as it technically hasn't detected another change in the last reboot date.
That's my personal interpretation of that reset condition. I might be wrong.
Close, but the reset condition is an OR operator.
That being said, my theory is that the alert triggered due to last boot change, with the node up, so reset would have to be another last boot change, therefore you are correct - the condition is misconfigured.
So if I am understanding this right, a reboot happened and an alert went out. The status went down but came up again before the next poll occurred, so there was nothing to flip and I should just clear the alert. Yes?