6 Replies Latest reply on Apr 21, 2015 11:07 AM by dhanson

    Downtime report for Interface

    avrilspirit

      Hi Everybody,

       

      I would like to create a specific report which displays :

       

      How many minutes my bandwidth is greather than 70% per mounth to know if we need to upgrade the Wan link.

       

      I create an alert so information is available in events table but is it possible for you to create this kind of report ? (like a capacity planning report only for interface utilization).

       

      I saw a lot of report about downtime only for "Down device or Interface" but not for a specific threshold.

       

      Thanks for your help

        • Re: Downtime report for Interface
          rob.hock

          In NPM 11.5 we do have new capacity planning resources and report. Check out sample resource here: http://oriondemo.solarwinds.com/Orion/SummaryView.aspx?ViewKey=Network%20Top%2010

          • Re: Downtime report for Interface
            dhanson

            Assuming you are using 11.0 or older, you can build a custom SQL report in the Report Writer on the server. Just select "Advanced SQL" when you are creating a new report and throw the below query in. To test this, you should be able to access your Database Manager application on the server, and this should return an example table from there as well.

             

            So here's the query. This is kind of a messy way to do it, as I'm kind of short on time for optimizing, but it'll work. The end result will have 4 columns and each row will be one interface per node. You might see a node appear twice: that means 2 interfaces are affected.

             

            Replace bolded text with the NodeID's of the appropriate devices that you want to filter on.

            WHERE i.NodeID IN (#,#,#,#)

             

            If you have only one device you want to filter on, that line should read:

            WHERE i.NodeID = '#'

             

            You should be able to find the NodeID's for your devices in [dbo].[Nodes].

             

            WITH a AS

            (

            SELECT TOP 100000 i.NodeID, i.InterfaceID, i.In_Averagebps, i.Out_Averagebps, b.InBandwidth, b.OutBandwidth FROM [dbo].[InterfaceTraffic_Detail] AS i

            JOIN [dbo].[Interfaces] AS b ON b.InterfaceID = i.InterfaceID

            WHERE i.NodeID IN (2,68,23,17,24)

            AND i.DateTime BETWEEN DATEADD(Day, -30, getdate()) and getdate()

            AND i.In_Averagebps > 0

            AND i.Out_Averagebps > 0

            AND b.InBandwidth > 0

            and b.OutBandwidth > 0

            GROUP BY i.NodeID, i.InterfaceID, i.In_Averagebps, i.Out_Averagebps, b.InBandwidth, b.OutBandwidth

            HAVING 100 * i.In_Averagebps / b.InBandwidth >= 70 OR 100 * i.Out_Averagebps / b.OutBandwidth >= 70

            )

            SELECT n.Caption AS Hostname, i.Caption AS 'IF_Desc', count(a.InterfaceID) AS Quantity FROM a

            INNER JOIN [dbo].[Nodes] AS n ON n.NodeID = a.NodeID

            INNER JOIN [dbo].[Interfaces] AS i ON i.InterfaceID = a.InterfaceID

            GROUP BY n.Caption, i.Caption

            ORDER BY Quantity DESC

            ;

             

            Translation of what this does:

            To create temp-table A:

            -For Nodes 2,68,23,17, and 24, if these nodes have an entry within the last 30 days, as long as the values in 4 specific columns are not 0, create a temp-table. Return results in that table that have an average IN or OUT average BPS of 70% of it's respective bandwidth.

            To select results from Temp-Table A:

            -Using temp table A, convert the NodeID and InterfaceID to the appropriate captions for these objects. Also, get a count of the product of temp table A based on the quantity of instances of each interfaceID. e.g. If the results of the first table show InterfaceID 100 times, return the node's name, interface's description and the Quantity 100.

             

            What that Quantity really means is that: during your poll interval, the corresponding device has reported a bandwidth utilization in excess of 70%, and this count is how many times it has reported that. Divide by your poll interval, and you have minutes in the last 30 days that it has reportedly exceeded 70% utilization. If your poll rate is every minute, it's 1:1.

             

            If you SQL guru's can check my logic, I'd appreciate it, but this looks like it is working for me.

             

            HTH

            • Re: Downtime report for Interface
              avrilspirit

              Sorry for the delay but I don't know why but I haven't received notifications...

               

              Thanks for your response

               

              rob.hock  I'm currently on 11.5 but this report doesn't display the time

               

              dhanson Thanks a lot for your SQL query it works without problem, the only issue is : for me it's not important if it's a simple peak during only one period of collect... for me it's important to know : how many time my bandwidth is greather than 70% per mounth during more than X minutes consecutively.

              Is it possible with your query ?

               

              Thanks for your help.

                • Re: Downtime report for Interface
                  rob.hock

                  Apologies for the difficulty- can you perhaps post screenshot of what you are seeing?

                  • Re: Downtime report for Interface
                    Craig Norborg

                    Hmm...   There are a couple of difficulties I see in getting this knowledge out of Orion.  The first being the polling interval of interface statistics on your interfaces.  By default this is set to 9 minutes - not positive because we have ours at 10 minutes.  But, this means every 9 minutes Orion should go poll how much data has gone across your interfaces during that time period and use that value to calculate the average bps over those 9 minutes and apply that as the value of the # of bps that went across the interface.   If you want to know how many individual minutes were above 70%, you would have to increase this polling interval so it polled each interface every minute.  Either that or figure out what interval works for you and go with that...   Otherwise let's say over those 9 minutes Orion comes back and says it was at 50% utilization.   That could mean all kinds of things, like it was at 100% utilization for 4.5 minutes and 0% utilization for the other 4.5 minutes, or that it was at 50% utilization for the whole 9 minutes or any other number of infinite combinations.

                     

                    The second thing would be how Solarwinds rolls up data.     The statistics polled during the period I mentioned above is what is known as "detailed statistics".   These detailed statistics are not kept in their raw form forever.   They are "rolled up" into less detailed statistics for longer time periods to conserve space and lessen processing requirements to draw graphs and such.   I believe by default after 14 days it rolls them up into hourly statistics, then I think the next default is 30 days where it rolls them up into daily stats, which it keeps for a year...     So, even if you change the polling period for detailed stats, to get the reports you want you might have to change the rollup periods also.

                     

                    Note, that doing detailed stats per minute and moving retention of said stats back to a month or more could have a huge impact on your server in terms of storage space used and CPU needed to do simple things like display a node or something.   How much would depend on your environment.

                     

                    You might want to look at maybe using NTA (Netflow) to do this.   If you set the "flow-cache timeouts" appropriately (ie: under a minute) you should get much more detailed statistics with less impact on your server, and the FSDB (flow-storage DB) is much better geared to storing with and reporting on these types of statistics on a much more detailed manner I believe.  I'm not too familiar with doing custom reporting on NTA with the FSDB...    Anyone else have a comment on this?

                    • Re: Downtime report for Interface
                      dhanson

                      My query gives you the total number of polls in your [InterfaceTraffic_Detail] table that are in excess of 70% utilization. As Craig Norborg said, how this translates is based on your poll rate, and your retention settings. My statistics poll rate is 1 minute (small environment) and my data retention is 30 days of per minute polling, 180 days of hourly, and a year of daily. So for me, this query returns "How many minutes my bandwidth is greather than 70% per mounth to know if we need to upgrade the Wan link", per your original request. As I explained, if you multiply the quantity result by your poll rate, this is how many polls during your retention period that are in excess of 70%. So if your poll rate is 2 minutes, and your retention is 15 days, and the quantity result is 150, then (150*2 = 300 minutes)/ 15 days. Mine is (150*1)/30.


                      As far as getting consecutive intervals, I'm sure this query is possible, but it will be rather complex, and is currently beyond what I could easily perform. I think you'd be better served by a graph on NTA, which will give a nice graphical representation of your utilization statistics that you can share with your superiors.