Part of this will depend on how you want to define the "5% of their time" criterion. A few things to pin down:
- Is your director interested in that from a percentage-of-polls perspective, or as a percentage of time a user is actually working?
- Since you're measuring *sites* and not individual systems, there's a big difference between a site with 100% staffing during the workday exceeding 75% utilization for 5% of its time and a site where a single user breaches that mark through "after-hours" activity. So there may also be a question of *when* to evaluate the 5% criterion (e.g. more than 5% of working hours above 75% utilization) ... or maybe even exactly the opposite: which sites sustain 75%+ utilization for more than 5% of after-working hours?
- Is the 5% based on a daily, weekly, or monthly period, or something else? It may be that many sites breach 75% utilization for 5% of a working day, but probably not for 5% of the total hours in a week or month.
All of that is relevant, and I'll also throw out that reaching 75%+ utilization for only 5% of working hours is probably quite insignificant, maybe even meaningless, regardless of the time period. Consider the following:
- A workday is 8 hours; that's 480 minutes. Five percent of that is 24 minutes. It wouldn't take much to saturate a link at 75% utilization for 24 minutes on any given day (depending on the size of the link, and whether it's during the Olympics, March Madness, or some other high-impact social/Internet activity).
- A work month is ~22 days x 8 hours = ~176 hours, and 5% of that is ~8.8 hours. Over a period this long, the number of staff at a given site becomes relevant, because it's certainly possible that a 50-person site could produce high utilization on a low-bandwidth link for more than 8 hours a month, and that would be perfectly normal.
My intent is not to complicate this for you, nor to digress from the original question, but fundamentally this will involve collecting a table of statistical values (utilization at specified poll times over a specified period), counting those discrete poll values, and determining whether the count of values above 75% exceeds 5% of the total number of poll values. If the evaluation is being made against a work week or work month, it may be necessary to first calculate daily, weekly, or monthly averages from the poll values obtained from NPM. The specific value sets you need may be directly available in NPM, or you may first need to create a derived value set from the raw data that is available. Also relevant to all of this is how the database is configured to aggregate raw polling data.
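In rough terms, the counting step described above can be expressed as a conditional count. This is a sketch only, written against a hypothetical `Polls` table with a precomputed `UtilizationPct` column; the real table and column names will depend on your NPM schema:

```sql
-- Sketch of the counting logic only: for each interface, count the polls
-- above 75% utilization and keep interfaces where that count exceeds
-- 5% of all polls in the period. "Polls" and "UtilizationPct" are
-- stand-ins for the real NPM traffic data, not actual schema names.
SELECT InterfaceID,
       COUNT(*) AS TotalPolls,
       SUM(CASE WHEN UtilizationPct >= 75 THEN 1 ELSE 0 END) AS PollsOver75
FROM Polls
WHERE DateTime BETWEEN @PeriodStart AND @PeriodEnd
GROUP BY InterfaceID
HAVING SUM(CASE WHEN UtilizationPct >= 75 THEN 1 ELSE 0 END) > 0.05 * COUNT(*)
```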
Now, having said that, one place you might start is by first configuring alerts to see whether ANY of your interfaces are actually exceeding 75% utilization in any case. If none of the interfaces ever exceeds 75% utilization, the rest of the analysis is unnecessary. For an interface that does exceed 75% utilization, you can drill into that interface and see its 24-hour performance. Noting above that 5% of the day would be 24 minutes, a quick glance at the daily utilization chart would allow you to eliminate, or further consider, whether that interface is exceeding the defined parameters. This doesn't produce a report, per se, but it could give you sufficient evidence that the report would be inconclusive -- or, alternatively, since the intent of this report is no doubt to identify circuits that need to be expanded, it may get you that information in short order. If an interface exceeds 75% utilization for more than a half-hour on one day, it's likely doing the same on other days; then it's a matter of monitoring that interface to see whether the behavior recurs. (An important consideration here is whether there's more value in having historical data over some past period of time -- and the effort it takes to build that report/query -- or whether it's sufficient to identify such interfaces going forward, which may take several days of regular analysis over the next few weeks to determine whether there's an interface to be concerned about.)
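As a rough spot check along those lines, something like the following would list the interfaces that have come closest to 75% recently. This is a sketch under assumptions: the table and column names follow the usual NPM schema (`Nodes`, `Interfaces`, `InterfaceTraffic`, `In_Averagebps`, `InBandwidth`) and should be verified against your database, and `DATEADD`/`GETDATE` are SQL Server functions:

```sql
-- Sketch: highest input utilization seen per interface over the last
-- 30 days. If nothing here approaches 75%, the deeper analysis is moot.
-- Verify table/column names against your NPM schema before running.
SELECT TOP 50
       Nodes.Caption AS NodeName,
       Interfaces.InterfaceName,
       MAX(In_Averagebps / NULLIF(Interfaces.InBandwidth, 0)) * 100 AS MaxInPct
FROM Nodes
     INNER JOIN Interfaces ON Nodes.NodeID = Interfaces.NodeID
     INNER JOIN InterfaceTraffic ON Interfaces.InterfaceID = InterfaceTraffic.InterfaceID
WHERE DateTime >= DATEADD(day, -30, GETDATE())
GROUP BY Nodes.Caption, Interfaces.InterfaceName
ORDER BY MaxInPct DESC
```

The `NULLIF` guard avoids a divide-by-zero on interfaces with no recorded bandwidth; those rows come back as NULL rather than erroring out.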
Finally... regardless of all the options I offered, if we can further define the above specifications for the "5%" rule, I'm happy to help you develop the queries to get this information.
(Assuming somebody else hasn't already done this... and I always encourage those who have to share with those who want.)
Thanks for the reply. A lot of what you mentioned is how I have been getting this information when needed for particular sites. Generally speaking, our decision to upgrade a site's circuit is based on its exceeding the 75% threshold (along with response time and other factors). I think this report is just for this director to be able to project upcoming costs that may be incurred by different sites if they are consistently exceeding this threshold. We do a quarterly WAN analysis for each site (we have 80 sites across the US), but there have been times where a site 'slips through' between quarters and is all of a sudden in desperate need of an upgrade.
I think if we define working hours as between 7am and 7pm M-F Central and just apply the 5%-of-working-hours qualifier, we'd be good. I understand that the sites will all differ based on the number of people and circuit size, as we vary from a few people in an office on a single T1 line all the way up to a 700-person office with a 100Mbps OC3, and about everything in between. I know some of the WAN interfaces exceed 75% as it stands now, as I have several pending orders with VZB to get upgrades in place, but I think we are envisioning this report as more of a proactive approach to knowing what may need to be upgraded, where, and when.
Thanks again for any help you can provide.
Have you had a chance to look at this anymore or could provide any more guidance on how I might be able to accomplish this?
My apologies, Jon.
I'm inquiring with those who should know. :-)
The data you'll need for this analysis will be found in Orion.NPM.InterfaceTraffic.
This might just be my lack of SQL knowledge, but I am completely lost on how to create this. It keeps coming back to the 5% qualifier that's screwing me up. Any ideas or pointers?
What I was able to write up was:
SELECT TOP 10000 Nodes.Caption AS NodeName,
       Interfaces.InterfaceName,
       InterfaceTraffic.DateTime
FROM (Nodes INNER JOIN Interfaces ON (Nodes.NodeID = Interfaces.NodeID)) INNER JOIN InterfaceTraffic ON (Interfaces.InterfaceID = InterfaceTraffic.InterfaceID AND InterfaceTraffic.NodeID = Nodes.NodeID)
WHERE ( DateTime BETWEEN 41755 AND 41786 ) AND
      (Case InBandwidth
            When 0 Then 0
            Else (In_Averagebps/InBandwidth) * 100
       End >= 75) AND
      -- Note: AND here requires BOTH directions to exceed 75% on the same
      -- poll; change to OR if either direction breaching should count.
      (Case OutBandwidth
            When 0 Then 0
            Else (Out_Averagebps/OutBandwidth) * 100
       End >= 75) AND
      (DATEPART(weekday, DateTime) >= 2) AND   -- Monday (default DATEFIRST 7)
      (DATEPART(weekday, DateTime) <= 6) AND   -- through Friday
      (DATEPART(Hour, DateTime) >= 7) AND      -- 7am or later
      (DATEPART(Hour, DateTime) < 19) AND      -- before 7pm
      (Interfaces.InterfaceName LIKE '%rtr%')
ORDER BY 1 ASC
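A query like that lists the individual polls, but the 5% qualifier needs a count per interface, not a row per poll. One way to do it is to move the per-poll conditions into a conditional count, group by interface, and then compare the over-threshold count against 5% of the total. This is a sketch against the same tables, columns, and filters used above; verify the names and types against your NPM database (you may need CASTs to float if your bps/bandwidth columns are integers):

```sql
-- Sketch: per interface, count working-hours polls in the window, count
-- how many exceeded 75% utilization in either direction, and keep only
-- interfaces where over-75% polls make up more than 5% of the total.
SELECT NodeName, InterfaceName, TotalPolls, PollsOver75,
       100.0 * PollsOver75 / TotalPolls AS PctOver75
FROM (
    SELECT Nodes.Caption AS NodeName,
           Interfaces.InterfaceName,
           COUNT(*) AS TotalPolls,
           SUM(CASE WHEN (InBandwidth  > 0 AND (In_Averagebps  / InBandwidth)  * 100 >= 75)
                      OR (OutBandwidth > 0 AND (Out_Averagebps / OutBandwidth) * 100 >= 75)
                    THEN 1 ELSE 0 END) AS PollsOver75
    FROM Nodes
         INNER JOIN Interfaces ON Nodes.NodeID = Interfaces.NodeID
         INNER JOIN InterfaceTraffic ON Interfaces.InterfaceID = InterfaceTraffic.InterfaceID
    WHERE (DateTime BETWEEN 41755 AND 41786)
      AND DATEPART(weekday, DateTime) BETWEEN 2 AND 6   -- Mon-Fri (default DATEFIRST 7)
      AND DATEPART(Hour, DateTime) >= 7                 -- 7am or later
      AND DATEPART(Hour, DateTime) < 19                 -- before 7pm
      AND Interfaces.InterfaceName LIKE '%rtr%'
    GROUP BY Nodes.Caption, Interfaces.InterfaceName
) AS PollCounts
WHERE PollsOver75 > 0.05 * TotalPolls
ORDER BY PctOver75 DESC
```

The derived-table wrapper is there because SQL Server won't let a HAVING clause reference the `PollsOver75` alias directly; it also makes the final 5% comparison easy to read and tweak.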