I have an alert set up with the following settings:
- For nodes where custom property x is equal to y
- If status equals 'Node Down' then do stuff
- Must exist for more than 5 minutes
I'm not seeing any flapping in response times; packet loss has been flatlined for hours (due to ICMP). I swapped polling over to resp/avail over SNMP and the status correctly changed to 'Node Up'. I then swapped it back to ICMP ping and the status changed to down again, as expected. The alert and its associated triggers didn't fire.
Does your system use automated dependency mapping?
The power of dependencies becomes evident when considering alerts. If you have an alert configured to trigger when a monitored object is down, you only want that alert to trigger if the monitored object is positively down. In other words, you do not want a down-object alert to trigger for an object that is not actually down. Without dependencies, all monitored objects on a monitored node that is unresponsive to ICMP queries will also report as down.
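To make the idea concrete, here is a rough sketch of that logic in Python. It is not how the product implements dependencies; the node names, the `parent` map, and the 'Unreachable' label are assumptions for illustration only.

```python
from typing import Dict, Optional


def effective_status(
    node: str,
    responds: Dict[str, bool],
    parent: Dict[str, Optional[str]],
) -> str:
    """Return 'Up', 'Down', or 'Unreachable' for a node.

    A node that does not respond is only 'Down' if every ancestor
    in the (hypothetical) parent map still responds. Otherwise we
    cannot tell, so it is reported as 'Unreachable'.
    """
    if responds[node]:
        return "Up"
    ancestor = parent.get(node)
    while ancestor is not None:
        if not responds[ancestor]:
            return "Unreachable"
        ancestor = parent.get(ancestor)
    return "Down"


def should_alert(
    node: str,
    responds: Dict[str, bool],
    parent: Dict[str, Optional[str]],
) -> bool:
    """A 'Node Down' alert fires only for nodes that are positively down."""
    return effective_status(node, responds, parent) == "Down"


# Hypothetical topology: server -> switch -> router
parent = {
    "core-router": None,
    "branch-switch": "core-router",
    "branch-server": "branch-switch",
}
```

With the router unresponsive, only the router would alert; the switch and server behind it would be treated as unreachable rather than down.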
http://www.gmigem.com ships with dependency mapping on by default, but I'm not sure how the default install treats it.
This may not be related to your issue.
Found out what was going on. The 'SolarWinds Alerting Service V2' had stopped. As soon as I started it, email alerts started flowing again.
I'll put in a request that important service issues such as this be flagged to admins in some way.
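In the meantime, a small watchdog could catch a stopped service. Below is a minimal sketch in Python that runs the Windows `sc query` command and checks the STATE line. Note that `sc query` expects the service key name, not the display name shown in the Services console; I don't know the key name for the alerting service, so `SERVICE_KEY_NAME` is a placeholder you would need to look up.

```python
import re
import subprocess

# Placeholder: the real key name of the alerting service is an assumption
# left for the reader to look up (it is NOT the display name).
SERVICE_KEY_NAME = "HypotheticalAlertingServiceKey"

# Matches lines such as "        STATE              : 4  RUNNING"
_STATE_RE = re.compile(r"^\s*STATE\s*:\s*\d+\s+(\w+)", re.MULTILINE)


def parse_service_state(sc_output: str) -> str:
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` output.

    Returns 'UNKNOWN' if no STATE line is found, for example when the
    service name does not exist.
    """
    match = _STATE_RE.search(sc_output)
    return match.group(1) if match else "UNKNOWN"


def service_is_running(service_name: str) -> bool:
    """Run `sc query <name>` (Windows only) and report whether it is RUNNING."""
    result = subprocess.run(
        ["sc", "query", service_name],
        capture_output=True,
        text=True,
    )
    return parse_service_state(result.stdout) == "RUNNING"


if __name__ == "__main__":
    if not service_is_running(SERVICE_KEY_NAME):
        print(f"WARNING: service {SERVICE_KEY_NAME} is not running")
```

Run from Task Scheduler every few minutes, something like this could send a notification through a channel that does not depend on the alerting service itself.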