6 Replies Latest reply on Apr 26, 2018 9:31 PM by ebradford

    Standard Deviation in collected data?

    jbiggley

      I was talking through the data collection (default) intervals with one of our VMware architects earlier this week.  This is what I told him:

       

      VMAN polls every 10 minutes via the vSphere API

      Data is collected in 10 second blocks for the entire period (60 blocks by default)

      Data presented via the console is the peak value, average value, etc. for the 10 minute period.

       

      "Wait a second," said the architect, "if the data is an average can you show me the standard deviation for that 10 minute period along with the average and peak?"

       

      "Uhhh... maybe?"

       

      Does anyone know if this is possible?  Is the standard deviation stored anywhere for us to display in a graph or report?

        • Re: Standard Deviation in collected data?
          chrispaap

          Joshua

          We can provide the average and peak but we do not provide the standard deviation.  Was standard deviation a must have for the architect or only for certain scenarios?

          1 of 1 people found this helpful
            • Re: Standard Deviation in collected data?
              jbiggley

              It was only in certain scenarios, although I do find that more and more folks are asking for detailed data analytics and statistical information to help them understand that data that Solarwinds products are giving them.  Standard deviation might be an interesting metric to stick into the development roadmap of a whole bunch of products.

            • Re: Standard Deviation in collected data?
              ebradford

              We should configure Thwack to send out ALERT: OLD THREAD RESURRECTION e-mail alerts, with a Sev 1 priority, LOL.

               

              Case in point, I am looking to show standard deviations for interface throughput, errors, Multicast and Broadcast traffic, etc. I searched Thwack for other people's use of Stdev, and found this unanswered thread. I'm resurrecting the thread to share what I find, and maybe then jbiggley can return to his VMWare Architect some 3-1/2 years later, and say, I've got that information you wanted...

               

              The idea I am working on is to develop data to understand the nature of normal traffic levels over interfaces. We (Network Admins) can then use this data to determine what levels of traffic equals abnormal traffic, and what equals abnormal. For example, we could use Standard Deviations of Multicast and Broadcast traffic levels for planning QOS policing policies, to throttle Multicast data over a certain level. Also, alerts could be configured for Multicast/Broadcast/Errors/BPS/PPS traffic exceeding normal levels, for example 3 standard deviations above Mean would infer something statistically significant is happening, and that could be worthy  of making a note of.

               

              There is a SQL function called Stdev. It works very much like average (AVG) and Min (MIN) and Max (MAX) functions. So, what I did was just grabbed a Average/Max per Month report from the old Report Writer, and used that as a basis. This is still in progress, but I thought I'd share this, see where it leads. The below has unnecessary lines which I bleeped out with double dashes. I'll post the end result when I am through.

               

              SELECT 
                 CONVERT(DateTime, LTRIM(MONTH(DateTime)) + '/01/' + LTRIM(YEAR(DateTime)), 101) AS SummaryMonth
                 , n.NodeID AS NodeID
                 , i.InterfaceID AS InterfaceID
                 , n.Caption AS NodeName
              --   , n.VendorIcon AS Vendor_Icon
                , i.Caption AS Interface_Caption
              --   , i.InterfaceIcon AS Interface_Icon
                , it.In_AvgMultiCastPkts
                , AVG(it.In_AvgMultiCastPkts) AS AVG_of_MC_Pkts
                , STDEV(it.In_AvgMultiCastPkts) AS STDev_of_MC_Pkts
                , it.In_MaxMultiCastPkts
                , MAX(it.In_MaxMultiCastPkts) AS MAX_of_MC_Pkts
              --                         , AVG(it.Out_Averagebps) AS AVERAGE_of_Average_Transmit_bps
              --                         , MAX(it.Out_Maxbps) AS MAX_of_Peak_Transmit_bps
              
              FROM  
              (Nodes n INNER JOIN Interfaces i ON (n.NodeID = i.NodeID)) INNER JOIN InterfaceTraffic it ON (i.InterfaceID=it.InterfaceID)
              
              
              WHERE 
                 ( DateTime BETWEEN 41000 AND GETDATE() )
                 AND n.Caption LIKE 'Host%'
                 AND i.InterfaceID = 5476
              
              GROUP BY 
                CONVERT(DateTime, LTRIM(MONTH(DateTime)) + '/01/' + LTRIM(YEAR(DateTime)), 101) 
                , n.NodeID
                , i.InterfaceID
                , n.Caption
              --  , n.VendorIcon
                , i.Caption
              --  , i.InterfaceIcon
                , it.In_AvgMultiCastPkts
                , it.In_MaxMultiCastPkts
              
              ORDER BY SummaryMonth ASC 
              -- , 4 ASC
              -- , 6 ASC
              
              
                • Re: Standard Deviation in collected data?
                  ebradford

                  Okay, so there are some problems with the above SQL script. As I said it is a work in progress. But, yes, the STDEV function does work. I'm learning a bit about setting time ranges, and intervals with SQL, and about which statistics are available in Orion's databases. Multicast traffic is what is important to me, but I don't see raw multicast traffic in the DB. I see MIN MAX and AVG for MC traffic in the DB, but then the STDEV wouldn't be STDEV on base MC rates, it would be STDEV on Averages or Minumums or Maximums. This is not quite what I want. I'll post more as I come up with more. But maybe we'll be able to answer the STDEV question for other metrics.

                    • Re: Standard Deviation in collected data?
                      ebradford

                      So, here's where I am now. I have found out that I can't access the raw Multicast data with the NPM database as it is. I may be able to create a Custom Poller to Poll the data directly from the interfaces. As it is, we can access Min, Max and Averages of Multicast data, in and out of an interface. If was take a Standard Deviation of those numbers, we would have the StDev of the Averages (or Mins or Maxes), but not the StDev of the actual metric I want to access.

                       

                      No, I realize that this message was originally posted regarding VM data polls. From what jbiggley posted, "Data presented via the console is the peak value, average value, etc", the data he is looking at is in a similar situation -- Min Max and Average is available, but now the raw data. So, this is an issue with presentation -- I'm sure the data must be somewhere, but I can't find it. I'll follow-up on using a custom poller, as I have time.

                       

                      But.

                       

                      I did want to share with you, how to do a StDev anyway -- whether or not it is on the raw data, so maybe that will be of use to you (or someone) at some point. Select the tables you want, and then create a temporary table for values over time by nesting Select operations.

                       

                      SELECT 
                        [NodeID]
                          ,[Day]
                          ,[Hour]
                          ,[Sub1].[InMcastPps]
                          ,[Avg_MCPktIn] = AVG([Sub1].[InMcastPps])
                          ,[StDev_MCPktIn] = STDEV([Sub1].[InMcastPps])
                          ,[CI95_MCPktIn] = 1.96 * (STDEV([Sub1].[InMcastPps])/SQRT(COUNT(*)))
                          ,[Avg_MCPktOut] = AVG([Sub1].[OutMcastPps])
                          ,[StDev_MCPktOut] = STDEV([Sub1].[OutMcastPps])
                          ,[CI95_MCPktOut] = 1.96 * (STDEV([Sub1].[OutMcastPps])/SQRT(COUNT(*)))
                          ,[N] = COUNT (*)
                      FROM ( 
                        SELECT  TOP 2000
                          [n].[NodeID]
                            ,[it].[DateTime]
                            ,[Day] = convert(int,CONVERT(char(8),CONVERT(date,[it].[DateTime]),112))
                            ,[Hour] = DATEPART(hh,[it].[DateTime])
                            ,[Minute] = DATEPART(mi,[it].[DateTime])
                            ,[it].[In_AvgMultiCastPkts]
                            ,[i].[InMcastPps]
                            ,[i].[OutMcastPps]
                        FROM  [Nodes] [n] WITH (NOLOCK)
                        JOIN  [Interfaces] [i] WITH (NOLOCK) ON ([n].[NodeID] = [i].[NodeID]) 
                        JOIN  [InterfaceTraffic] [it] WITH (NOLOCK) ON ([i].[InterfaceID]=[it].[InterfaceID])
                        WHERE 
                         ( [it].[DateTime] BETWEEN (DATEADD(DAY,-7,GETDATE())) AND GETDATE() )
                         ) [Sub1]
                      GROUP BY [NodeID],[Day],[Hour],[Sub1].[InMcastPps]
                      ORDER BY [NodeID],[Day] ASC,[Hour] ASC
                      

                       

                      So, the above shows a Select operation on Nodes, Interfaces, and Interface Traffic, and all that is known as [Sub1]. This is lines 14 -28.

                      That creates a table of dates and times for some Multicast packet stats found in the Orion DB. then lines 1-13 and 29-30 take that data and workout Average, StDev, and 95% Confidence Interval for Multicast Packets/sec In and Out. This would normally seem like pretty good data -- but there's something wonky about the pps data in the database. I feel like it should change more often than it does. [Or, maybe it is just being collected at the wrong intervals -- LOL I just thought of that while typing -- isn't it strange when you are explaining something to someone else that suddenly your brain makes a connection that it wasn't all day? Funny how the brain works.] Anyway. These stats could tell an analytics Beavis the numbers they want to know. And, if the PPS are correct, then you should be able to use that information to determine which thresholds to set your Multicast Policing/Shaping levels in such a way that most MC traffic is unhindered, and only extreme peaks are throttled.

                       

                      I haven't figured out how to alert using this, since when I looked at alerts, I didn't see a way to use custom SQL query as an alert trigger. Any ideas?

                    • Re: Standard Deviation in collected data?
                      jbiggley

                      This is awesome!  Alas, our VMware architects found what they were looking for elsewhere.