I think the key element in your story is the Unreachable status. This implies that you have dependencies set up. When a parent in a dependency becomes unavailable, SolarWinds does not attempt to poll the downstream objects at all. So in this case the device is reporting that it was up 98% of the times that SolarWinds tried to poll it, and during that long outage I suspect it was only polled once or twice before the dependency kicked in and suppressed further polls. This is one of the tricky gotchas with child objects in dependencies: they skew the way the application normally calculates availability. Unmanaging a node has a similar effect. In cases like this I have had to build custom availability reports that use the events table to find the timestamps when down/unreachable/unmanage events began and ended, and then subtract that number of minutes from the total for the month/day/etc. It tends to be pretty SQL heavy.
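To make the arithmetic behind that kind of report concrete, here is a rough Python sketch of the idea. It is only an illustration: the real report would be SQL against the events table, and the `(timestamp, kind)` tuples, the kind names, and the function names below are all made up for the example rather than taken from the SolarWinds schema.

```python
from datetime import datetime, timedelta

# Hypothetical event kinds; the real event types and table layout will differ.
OUTAGE_START = {"down", "unreachable", "unmanaged"}
OUTAGE_END = {"up", "remanaged"}


def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between interval a and interval b."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return max(end - start, timedelta(0))


def downtime_minutes(events, period_start, period_end):
    """Sum the minutes between each outage-start event and the next recovery.

    `events` is a list of (timestamp, kind) tuples for one node. Outages are
    clipped to the reporting period, and an outage that is still open at the
    end of the period counts up to period_end.
    """
    total = timedelta(0)
    outage_began = None
    for timestamp, kind in sorted(events):
        if kind in OUTAGE_START and outage_began is None:
            outage_began = timestamp  # first event of this outage
        elif kind in OUTAGE_END and outage_began is not None:
            total += overlap(outage_began, timestamp, period_start, period_end)
            outage_began = None
    if outage_began is not None:
        total += overlap(outage_began, period_end, period_start, period_end)
    return total.total_seconds() / 60


def availability_percent(events, period_start, period_end):
    """Availability over wall-clock time, not over the polls that happened."""
    total = (period_end - period_start).total_seconds() / 60
    down = downtime_minutes(events, period_start, period_end)
    return 100.0 * (total - down) / total


# Example: a 30-day month with one 24-hour site outage.
period_start = datetime(2023, 6, 1)
period_end = datetime(2023, 7, 1)
events = [
    (datetime(2023, 6, 10, 8, 0), "down"),
    (datetime(2023, 6, 10, 8, 2), "unreachable"),  # same outage, ignored
    (datetime(2023, 6, 11, 8, 0), "up"),
]
example_downtime = downtime_minutes(events, period_start, period_end)
example_availability = availability_percent(events, period_start, period_end)
```

In this example the outage accounts for 1,440 of the month's 43,200 minutes, so availability comes out at about 96.67%, regardless of how few polls actually ran while the node was unreachable.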
One of the reasons that people create dependencies is to cut down on the tidal wave of alerts when a site goes down. A different way to trim the noise would be to disable dependencies and change your node down alert to use the complex condition check box on the alert trigger screen. As an example, you could set it to alert when 1 or more nodes are down; then, if you set up the action correctly, it would trigger a single email that lists all the nodes that are down at the time it fires.
Loop1 Systems: SolarWinds Training and Professional Services
That sounds like our issue. If an entire site went down, we would expect all the servers, switches, etc. to report as down and reflect lost availability for that time, not just the parents of that site.
And that is correct, we are using dependencies to stem alerts. Is there a tech article on disabling the dependencies (should be able to do on my own) and then setting up the alerts the way you describe?