3 Replies Latest reply on Jan 17, 2019 8:39 AM by rschroeder

    Discrepancy in availability numbers between "Node details" page and SQL query

    cmorales

      Hi All,

       

      I am creating a custom report for availability of some of our nodes but I'm finding different results from what I get using below query and what I get from the front end. Any idea on what could be causing this? or, which one do I trust?

       

      SELECT  AARD.ApplicationName, AVG(CONVERT(FLOAT,ADCA.PercentAvailability*1.000))

         FROM  APM_AlertsAndReportsData AARD, APM_DailyComponentAvailability ADCA

         where AARD.ApplicationID = ADCA.ApplicationID

         and (datetime >= DATEADD(month, -1, getdate()) AND datetime < getdate())

         AND AARD.ApplicationID = 55--57,28,59,55,48,29)

         GROUP BY AARD.ApplicationName ORDER BY AARD.ApplicationName ASC;

       

      This is with the query, calculating the availability for this month so far gives me a result of 99.94%

      However from the front-end it always looks 100%

       

        • Re: Discrepancy in availability numbers between "Node details" page and SQL query
          zackm

          your screenshot is from the Node Details page, but your SQL query is looking at the Application Availability. If you want the Node availability, look for the "ResponseTime" tables and/or views.

          • Re: Discrepancy in availability numbers between "Node details" page and SQL query
            rschroeder

            I recently went through a similar exercise where I wanted to know the uptime of all nodes, and be able to sort on them in descending order.  The goal was to see which nodes had not been rebooted or upgraded or had a power "event" in the last 18 months, which sometimes is associated with Cisco IOS bugs.

             

            discovered the snmp OID resets after it reaches a certain number of seconds (~490,000) and the data immediately becomes skewed / inaccurate if you rely on that method.  An affected node might have been up for 19 months but would show up in the report as up for a single month.

             

            I found a lovely alternative that is accurate even if a node has been up for seven years or longer.  PM me if that would be of use to your tasks.

             

            Swift Packets!

             

            Rick Schroeder

            • Re: Discrepancy in availability numbers between "Node details" page and SQL query
              rschroeder

              I opted to go for a Report format instead of putting this info into a Resource in my NPM pages, but the process should be similar if you want a resource that shows ALL Nodes' uptime.  Here's how I did mine, through help from those on Thwack who provided the SQL script and advice:

               

              First create a New Report:

               

              Next edit Datasource 1 as follows for SQL script from Thwack:

              The actual SQL script is:

              SELECT

                    NodeID

                    ,Caption

                    ,VendorIcon

                    ,Ip_Address

                    ,DateDiff(hour,Nodes.LastBoot,GetDate()) AS HoursUp

                    ,CONVERT(VARCHAR(40), DATEDIFF(minute, Nodes.LastBoot, GETDATE())/(24*60))

                          + ' days, '

                          + CONVERT(VARCHAR(40), DATEDIFF(minute, Nodes.LastBoot, GETDATE())%(24*60)/60)

                 + ' hours, and '

                 + CONVERT(VARCHAR(40), DATEDIFF(minute, Nodes.LastBoot, GETDATE())%60)

                 + ' minutes.' AS 'Time Up'

              FROM [Nodes]

              ORDER BY 5 DESC

               

              Click "Add to Layout"

               

              Then click the Edit Resource button to choose your columns, drag them to whatever order you like, and hide any you wish hidden.  I usually add NODEID and then hide it.  SW can use it for building the report; users don't need to be bothered with this info.

               

              I Grouped the report's output by Time Up, and then I included an IP_Address column to make it easy to see what address to SSH to for validation/verification of the data.

               

              I sorted by Descending so I see the nodes with the greatest up time at the top of the report.   The devices with the greatest up time may be the ones with the greatest likelihood of having problems--IF they are Cisco switches or routers.  If they are UPS's, I love to see high up time.  You may decide to exclude certain vendors or machine types so you only see Cisco gear, or only see specific hardware.

               

              You can hide or delete the HoursUP column if it makes things simpler for your users.

               

               

              You could choose to Schedule this report to run at regular intervals, or just rely on opening up at need.

               

              I suspect you might also be able to add this as a Resource/Widget to an NPM view.  If you choose that option then you may be wise to limit the display to the Top 10 items so your NPM page isn't slow to load, or isn't extremely length.

               

              Here's a sample/preview of what my report looks like:

               

               

              Here's hoping you can leverage this to fit your needs!

               

              Swift Packets,

               

              Rick Schroeder