2 Replies Latest reply on Nov 16, 2016 2:15 PM by jmp99

    Question about availability numbers

    jmp99

      We had a node go down due to a power failure yesterday for half a day. NPM's web view is reporting 98% availability for the day. Since this was a router, I checked our ISP's SolarWinds instance, and it correctly showed availability at 50%. I viewed the 'Chart Data' for the node on our SolarWinds and then did the same on our ISP's SolarWinds. The difference appears to be that our SolarWinds is not recording 0% for the times when the node is down. Our provider records 0% where ours records a null value, and so the provider gets a correct availability percentage.
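To make the null-versus-0% difference concrete, here is a minimal sketch of the arithmetic. It assumes availability is a plain average over the recorded samples; the function name and the hourly data are hypothetical and are not taken from SolarWinds itself.

```python
from typing import Optional, Sequence


def average_availability(samples: Sequence[Optional[float]],
                         treat_null_as_down: bool) -> float:
    """Average availability in percent; None marks an interval with no recorded poll."""
    if treat_null_as_down:
        # Count every unrecorded interval as 0% available.
        values = [0.0 if s is None else s for s in samples]
    else:
        # Skip unrecorded intervals, so they do not lower the average.
        values = [s for s in samples if s is not None]
    if not values:
        raise ValueError("no samples to average")
    return sum(values) / len(values)


# Hypothetical day: 12 hourly samples at 100%, then 12 hours with nothing recorded.
hourly = [100.0] * 12 + [None] * 12

print(average_availability(hourly, treat_null_as_down=False))  # 100.0
print(average_availability(hourly, treat_null_as_down=True))   # 50.0
```

With nulls skipped, the outage simply disappears from the average. A handful of failed polls recorded before polling stopped would pull the first figure slightly below 100%, which is consistent with seeing something like 98% instead of 50%.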

       

      Does anyone know if there is a setting somewhere that is preventing our SolarWinds from recording 0% when a node is unreachable? Or is there a different issue? Any help is appreciated.

        • Re: Question about availability numbers
          mesverrum

          I think the key element in your story is the Unreachable status.  This implies that you have dependencies set up.  When a parent in a dependency becomes unavailable, SolarWinds does not attempt to poll the downstream objects at all.  So in this case the device is reporting that it was up 98% of the times that SolarWinds tried to poll it, and during that long outage I suspect you only polled once or twice before the dependency kicked in and suppressed further polls.  This is one of the tricky gotchas with child objects in dependencies, as they skew the way the application normally calculates availability; unmanaging a node has a similar effect.  In cases like this I have had to build custom availability reports that use the events table to find the timestamps when down/unreachable/unmanage events began and ended, then subtract that number of minutes from the total for the month/day/etc.  It tends to be pretty SQL heavy.
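As a rough illustration of the events-based calculation described above, here is a minimal Python sketch rather than the SQL report itself. It assumes the events have already been pulled out as (timestamp, kind) pairs; the kind names, the function name, and the sample dates are all hypothetical and do not reflect the actual events table schema.

```python
from datetime import datetime
from typing import Iterable, Tuple

# Hypothetical event kinds, not the real SolarWinds event type names or IDs.
OUTAGE_START = {"down", "unreachable", "unmanaged"}
OUTAGE_END = {"up", "remanaged"}


def availability_from_events(events: Iterable[Tuple[datetime, str]],
                             period_start: datetime,
                             period_end: datetime) -> float:
    """Percent of the period not covered by an outage interval."""
    total = (period_end - period_start).total_seconds()
    if total <= 0:
        raise ValueError("period_end must be after period_start")
    downtime = 0.0
    outage_began = None
    for ts, kind in sorted(events):
        # Clamp timestamps so only downtime inside the period is counted.
        ts = min(max(ts, period_start), period_end)
        if kind in OUTAGE_START and outage_began is None:
            outage_began = ts  # first event that opens an outage
        elif kind in OUTAGE_END and outage_began is not None:
            downtime += (ts - outage_began).total_seconds()
            outage_began = None
    if outage_began is not None:
        # The outage was still open when the period ended.
        downtime += (period_end - outage_began).total_seconds()
    return 100.0 * (total - downtime) / total


# Hypothetical 12-hour outage within a single day.
day_start = datetime(2016, 11, 15, 0, 0)
day_end = datetime(2016, 11, 16, 0, 0)
events = [
    (datetime(2016, 11, 15, 6, 0), "down"),
    (datetime(2016, 11, 15, 6, 5), "unreachable"),
    (datetime(2016, 11, 15, 18, 0), "up"),
]
print(availability_from_events(events, day_start, day_end))  # 50.0
```

The point of working from event timestamps is that the result no longer depends on how many polls were actually recorded during the outage, so suppressed polling on a child node does not inflate the number.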

           

          One of the reasons people create dependencies is to cut down on the tidal wave of alerts when a site goes down.  A different way to trim the noise would be to disable dependencies and change your node down alert to use the complex condition check box on the alert trigger screen.  As an example, you could set it to alert when 1 or more nodes are down; then, if you set up the action correctly, it can trigger a single email that lists all the nodes that are down at the time it fires.

           

          -Marc Netterfield

              Loop1 Systems: SolarWinds Training and Professional Services

            • Re: Question about availability numbers
              jmp99

              That sounds like our issue.  If an entire site went down, we would expect all the servers, switches, etc. to report as down and reflect lost availability for that time, not just the parents of that site.

               

              And that is correct, we are using dependencies to stem alerts.  Is there a tech article on disabling the dependencies (I should be able to do that on my own) and then setting up the alerts the way you describe?

              Jason Palmer

              8001 Belfort Parkway Suite 120

              Jacksonville, FL  32256

              (O) 904-680-3457