2 Replies Latest reply on Sep 27, 2018 12:37 PM by javip2086

    Groups and Availability

    pflanz

      I understand that availability for Groups was a contentious topic before the recent releases.

       

      Now we have two metrics:  Group Availability and Group Member Availability.

       

      Supposedly Group Availability is a legacy holdover and is actually a faulty calculation.

       

      Group Member Availability seems to be a misnomer, as it is actually Group Availability calculated from the Group Members.

       

      In my case I have a Test Group that contains one server.  

       

      The server (node) availability for last month is 100%, however it did have some unmanaged time for planned maintenance.

       

      The Group Member Availability for last month shows less than 100%.

       

      This leads me to conclude that Group Member Availability counts unmanaged time as downtime (unavailable).

       

      Could this be true?
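To make the hypothesis concrete, here is a minimal arithmetic sketch. It is not SolarWinds' actual calculation; the function name `availability`, its parameters, and the 4-hour maintenance window are all hypothetical, chosen only to show how counting unmanaged time as downtime would pull a 100% node below 100%.

```python
def availability(up_s, down_s, unmanaged_s, unmanaged_counts_as_down):
    """Percent availability for a period split into up, down, and unmanaged seconds.

    Hypothetical illustration only: the two branches model the two possible
    treatments of unmanaged time discussed in this thread.
    """
    if unmanaged_counts_as_down:
        # Unmanaged time stays in the denominator, so it behaves like downtime.
        total = up_s + down_s + unmanaged_s
    else:
        # Unmanaged time is excluded from the period entirely.
        total = up_s + down_s
    if total == 0:
        return None  # nothing was measured
    return 100.0 * up_s / total


# A 30-day month with an assumed 4 hours of unmanaged maintenance and no real downtime.
month_s = 30 * 24 * 3600
unmanaged_s = 4 * 3600
up_s = month_s - unmanaged_s

node_style = availability(up_s, 0, unmanaged_s, unmanaged_counts_as_down=False)
group_style = availability(up_s, 0, unmanaged_s, unmanaged_counts_as_down=True)
print(node_style, round(group_style, 2))
```

Under these assumed numbers the first treatment gives 100% and the second gives a little under 100%, which matches the pattern described above, but it does not prove that this is what the product does.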

       

      What is the effect of Group Status on Group Member Availability? Or is Group Status simply for display purposes?

        • Re: Groups and Availability
          bbuhler

          I'm seeing the same result. This is making it very difficult to provide accurate numbers to upper management.

          • Re: Groups and Availability
            javip2086

            Hi pflanz, good morning.

             

            I'm actually looking into the same thing and found this; maybe it will help you understand how availability works.

             

            It may differ somewhat depending on the NPM version, but I think these are the basics.

             

            Regards.

             

            Availability table: 100% availability in report but also 100% packet loss - SolarWinds Worldwide, LLC. Help and Support

            Overview

            This article provides information on how the Availability table works.

             

            Environment

            NPM 11.5.3

             

            Detail

            By default, we poll every 120 seconds for availability.

            This is done with ICMP (Ping).

             

            If the node responds

            • It is marked as 100% available and the response time is stored.

            If the node does not respond, a fast ping is sent.

            • This repeats according to the Response Time Retry Count value for your polling engine.
            • This setting designates the number of times Orion retries ICMP pings on a monitored device before packet loss is reported.
            • If the node responds to the fast ping, it is marked as 100% available and 100% packet loss to represent that it responded, but not to the main ICMP poll. The response time is not stored.

            If the node does not respond to any of the above:

            • It will be marked as 100% loss and 0% available for that poll.

             

            As such it is possible to have 100% packet loss and 100% availability.

            Because ICMP is low-priority traffic, a node may be too busy to respond to the poll even though it otherwise appears to be operating fine.
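The polling steps described in the article can be sketched roughly as follows. This is only an illustration of the decision logic as written above, not Orion's implementation: `poll_node`, `PollResult`, and the `ping` callable are hypothetical names, and recording 0% packet loss on a first-try reply is an assumption the article does not state explicitly.

```python
from typing import Callable, NamedTuple, Optional


class PollResult(NamedTuple):
    availability: int                  # percent for this poll: 0 or 100
    packet_loss: int                   # percent for this poll: 0 or 100
    response_time_ms: Optional[float]  # stored only when the main poll is answered


def poll_node(ping: Callable[[], Optional[float]], retry_count: int) -> PollResult:
    """Apply the article's rules to one polling cycle.

    `ping` is a hypothetical stand-in for an ICMP echo: it returns the round-trip
    time in milliseconds, or None when there is no reply.
    """
    rtt = ping()
    if rtt is not None:
        # Main poll answered: available, response time stored (0% loss is assumed).
        return PollResult(100, 0, rtt)
    # Main poll missed: send fast pings, up to the Response Time Retry Count.
    for _ in range(retry_count):
        if ping() is not None:
            # Answered a fast ping only: available, but 100% loss and no response time.
            return PollResult(100, 100, None)
    # No reply to anything: down for this poll.
    return PollResult(0, 100, None)


# A node that misses the main poll and the first fast ping, then answers.
replies = iter([None, None, 12.0])
print(poll_node(lambda: next(replies), retry_count=3))
```

The middle branch is the case the article is about: the node answered a retry, so the poll counts as 100% available while still being recorded as 100% packet loss.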