Mismatch in interface utilization data

Hi Team,

I have created a report for interface utilization, and other teams have some doubts about the data it shows. Please see the snapshot below for your reference.

Q: The organization is asking why Average Transmit BPS + Average Receive BPS is not equal to Total Average BPS (Transmit + Receive).

Is there any justification for this mismatch in data? 

Parents
  • That is not how it works

    1. Total Average BPS (Transmit + Receive) = (Average Transmit BPS + Average Receive BPS) / 2

    That is the correct calculation.

    2. To give an example:

    - Suppose you poll this information for 25 minutes with a 5-minute polling interval, which gives 5 data points per direction.

    - Interface A, Transmit BPS values: 2, 2, 2, 2, 2. Average Transmit BPS = (2+2+2+2+2) / 5 = 10 / 5 = 2

    - Interface A, Receive BPS values: 3, 3, 3, 3, 3. Average Receive BPS = (3+3+3+3+3) / 5 = 15 / 5 = 3

    - Total Average BPS = (Average Transmit BPS + Average Receive BPS) / 2 = (2 + 3) / 2 = 2.5

    - Now use the raw values directly instead of deriving the result from Average Transmit BPS and Average Receive BPS. There are 10 data points in total: 5 for Transmit and 5 for Receive.

    - Total Average BPS = ((2+2+2+2+2) + (3+3+3+3+3)) / 10 = 25 / 10 = 2.5

    It is not the sum of Average Transmit BPS and Average Receive BPS (which would be 5 here); it is the average across all Transmit and Receive data points combined. The two calculations agree because both directions have the same number of data points.
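    For anyone who wants to check the arithmetic, here is a minimal Python sketch of the example. The variable names and the `average` helper are illustrative only; they are not taken from the report or from any SolarWinds API.

```python
def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Sample values from the example: 5 polls per direction.
transmit_bps = [2, 2, 2, 2, 2]
receive_bps = [3, 3, 3, 3, 3]

avg_tx = average(transmit_bps)
avg_rx = average(receive_bps)

# Method 1: average of the two per-direction averages.
total_avg_from_averages = (avg_tx + avg_rx) / 2

# Method 2: average over all 10 raw data points.
total_avg_from_raw = average(transmit_bps + receive_bps)

# What the other teams expected to see: the sum of the two averages.
sum_of_averages = avg_tx + avg_rx

print("Average Transmit BPS:", avg_tx)
print("Average Receive BPS:", avg_rx)
print("Total Average BPS (from averages):", total_avg_from_averages)
print("Total Average BPS (from raw points):", total_avg_from_raw)
print("Sum of averages:", sum_of_averages)
```

    Both methods give 2.5, while the sum of the averages is 5. The two methods only agree when Transmit and Receive have the same number of data points.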

    Hope this helps

  • Thank you for your explanation.

    I will now try to communicate this to my other team members. I have made changes to the report by using the total received BPS instead of the average. The data I obtained aligns with the team's requirements. Please refer to the snapshot below. I have a couple of questions related to this report:

    1. Which report (Average calculation or Total Calculation) provides us with the authentic data on actual consumption related to Interface utilization?

    2. What is the difference in meaning between Average Received BPS and Total Received BPS?

    The purpose of this report is to evaluate our interface consumption. Based on it, we can reduce or increase ILL or MPLS links to save cost.

  • It actually depends. Do you also have NTA in your environment?

  • Unless I'm misunderstanding, and if this is more about usage given your "save cost" comment, IMHO I'm not sure an average over a month really gets at the answer. Though I suppose it depends on whether this is a cost or a performance exercise.

    If performance, just be careful, since you could see heavy loads for sustained periods that get smoothed out in the averages, especially by off-hours periods of low usage. Having been there myself, I tend to look more at trends over the long haul, as well as the max and/or sustained rates over shorter time frames (a visual chart or table with more granular info). If this is a 24x7, multi-shift operation, then obviously the time frames in question need to correlate with the work hours. And if you do see heavy periods of traffic, do they correlate back to something that should or shouldn't be happening?
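    To illustrate how a long-window average can hide busy periods, here is a rough Python sketch. The hourly utilization numbers are invented for illustration (8 busy hours at 80% and 16 quiet hours at 5%); they are not taken from the report or from any real interface.

```python
# Hypothetical hourly utilization (% of link capacity) for one day.
# Hours 9 through 16 are busy, every other hour is quiet. Invented numbers.
hourly_utilization = [80 if 9 <= hour < 17 else 5 for hour in range(24)]

# Average over the whole day, off-hours included.
daily_average = sum(hourly_utilization) / len(hourly_utilization)

# Highest hourly value seen during the day.
peak = max(hourly_utilization)

# Average restricted to the busy window (hours 9 through 16).
business_hours = hourly_utilization[9:17]
business_average = sum(business_hours) / len(business_hours)

print(f"24-hour average:        {daily_average:.1f}%")
print(f"Peak hour:              {peak}%")
print(f"Business-hours average: {business_average:.1f}%")
```

    With these made-up numbers the 24-hour average is 30%, even though the link sits at 80% for the whole working day. Sizing a link on the 30% figure alone would understate the load users actually see.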

    Admittedly, I am new to SW and have much to learn in this space, so I cannot pull out some cool SWQL to help you. I also hope this doesn't come across as a rant. But I've been there more than once when Finance or the execs get excited over costs and need things justified one way or the other.

  • I'm going to jump in with one more comment on this, and yes, I agree with   _ that is a GREAT explanation! 

    Before the rest of your team asks, you will see differences between the bandwidth metrics in NPM data and in NTA. Just remind them that while NTA tracks and shows the large majority of data passing through an interface (and therefore the data you need in order to identify and troubleshoot bandwidth issues created by "bandwidth hogs"), Flow by its very nature does NOT track ALL traffic passing through an interface. There are types of traffic that are not included in Flow. This is not a function of our product, but rather of Flow itself. For example, Flow is IP based. Anything not IP based isn't tracked in Flow.

    From the perspective of using NTA and its purpose (tracking bandwidth utilization, resolving chronically high utilization issues), what Flow tracks is more than adequate for that use case, but it does not track every bit of traffic passing through the interface.

    This is something I discuss every time I teach NTA or Flow Management, as it comes up frequently: "Why don't my SNMP utilization and Flow metrics match?" How big the discrepancy is will depend on how much unsupported traffic is passing through a monitored interface, but in the years that this product has been out there, I have rarely, if ever, seen the unsupported traffic have any measurable impact on troubleshooting the "big boys" consuming bandwidth.

    Just heading off the often-asked question before it gets asked. ;)
