How can I have netflow top application report match the NPM interface utilization report?

As the title says, how can I make the NetFlow top applications report match the NPM interface utilization report?

I am asking because when NPM reports high bandwidth utilization, I want to be able to tell which application traffic was using that bandwidth at that time.

However, I am confused because the total utilization shown in the NetFlow top applications chart differs from the one in the NPM interface percent utilization chart.

For example:

I have a 40 Mbps link from my ISP (I changed the bandwidth in the node configuration from the Fast Ethernet default of 100 Mbps to 40 Mbps).

My router is a Cisco 1800 series. Flow export is enabled for both ingress and egress on only one interface of the router, and I have also configured ip flow-cache timeout active 1.

pastedImage_1.png

From this graph I can see that between 13:10 and 13:14 my link utilization is 43.47% (recv) and 45.65% (xmit).

However, the total utilization in the NetFlow top applications graph for the same time range is not the same. For example, in the graph below:

pastedImage_3.png

Total traffic inbound: 27.54% + 19.49% + 0.3% + 0.03% = 47.36% --> compared to 43.47% inbound in NPM

Total traffic outbound: 37.02% + 4.64% + 1.05% + 0.02% + 0.01% = 42.74% --> compared to 45.65% outbound in NPM
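
To put those gaps in absolute terms, here is a rough Python sketch that repeats the sums above and converts the difference into Mbps on the 40 Mbps link. The percentages are the ones quoted in this post; the names `LINK_MBPS`, `npm_pct`, `nta_app_pct` and `compare` are made up for illustration and do not come from NPM or NTA.

```python
LINK_MBPS = 40.0  # interface bandwidth configured on the node (from the post)

# Interface utilization reported by NPM for 13:10 - 13:14 (from the post)
npm_pct = {"inbound": 43.47, "outbound": 45.65}

# Per-application shares read off the NTA top applications chart (from the post)
nta_app_pct = {
    "inbound": [27.54, 19.49, 0.3, 0.03],
    "outbound": [37.02, 4.64, 1.05, 0.02, 0.01],
}


def compare(direction):
    """Return (NTA total %, NTA minus NPM in %, the same gap in Mbps)."""
    nta_total = round(sum(nta_app_pct[direction]), 2)
    gap_pct = round(nta_total - npm_pct[direction], 2)
    gap_mbps = round(gap_pct / 100 * LINK_MBPS, 3)
    return nta_total, gap_pct, gap_mbps


for direction in ("inbound", "outbound"):
    total, gap_pct, gap_mbps = compare(direction)
    print(f"{direction}: NTA {total}% vs NPM {npm_pct[direction]}% "
          f"(gap {gap_pct} points, about {gap_mbps} Mbps)")
```

So the two reports differ by roughly 1.2 to 1.6 Mbps in each direction, with NetFlow higher inbound and lower outbound.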

As far as I can tell, I have not limited application monitoring to specific applications, so the traffic recorded in NetFlow should be all of the traffic passing through that interface, right?

I can also confirm that there is no bandwidth-shaping appliance on the link, so the traffic is not being altered along the way.

Please kindly advise.

  • What are your NPM polling intervals for this node and its interfaces?

    Since NTA works at roughly 1-minute granularity, I would make sure the device is polled every minute as well. I mean interface polling, not node polling, although I would probably use 1-minute polling across the board if this device were very important.

    Keep in mind that 1-minute node polling is not very resource intensive on your SolarWinds poller, but interface polling directly affects polling completion, so only poll interfaces that frequently if you have the server resources for it.

    That's where I would start.

    I will also do some research on my system to see whether the numbers match between NTA and NPM for a node I am polling at 1-minute intervals.

    Jason

  • Hi jxchappell, thanks for the reply.

    Okay, so in summary, if I change the polling interval to 1 minute, the "burden" will be on my NPM server, right?

    Will there be any impact on the monitored router as well? I am more concerned about the router than the NPM server, since it is the main gateway in a production network.

    Average CPU usage of the router under current conditions is around 40-50%.

  • Can anyone confirm whether changing the polling interval to 1 minute (currently set to 9 minutes) will affect my router's performance?

  • The load on your switch/router should be minimal. You can test this with a before-and-after check by changing the polling to 1 minute on a single device first.

    Within NPM, the biggest load comes from polling the interfaces more often, not the node. These are two separate settings within the tool.

    Does that help?

  • You can check the CPU and memory utilization graphs before and after changing the polling interval; that is the purpose of having the Orion monitoring system.

    Furthermore, you can run List Resources and take extra polling load off this device, then compare again.

    Removing items such as HH / Routing tables / Topology polling will further reduce the polling load on the device.

    pastedImage_0.png

  • I'm surprised nobody has brought up that NTA by default is set to do 95th percentile top talker filtering. That is, it doesn't even store the data for the last small percentage of flow data it receives. There is a serious benefit to performance and database size from doing this; see this KB:

    Top talker optimization

    Also, some devices already do this kind of sampling calculation on their end before they even send the flow data out. I don't believe this Cisco is one of them, but they are out there. Between that and the small differences in polling time intervals, I would never expect NetFlow/IPFIX/sFlow to be a 100% match to what you see in SNMP. You might be able to get the gap a little smaller, but you will have to weigh that against the other performance and polling factors that have already been brought up.

    -Marc Netterfield

        Loop1 Systems: SolarWinds Training and Professional Services

  • Malik Haider, thanks for the advice.

  • messverum, I never thought about this before. Nice to know! Thanks for bringing it up
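
For anyone landing on this thread later, the two effects raised in the replies (a top talker cutoff on the flow side and a longer polling window on the SNMP side) can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: it assumes the cutoff keeps the largest flows until they cover 95% of the bytes, which may not be exactly how NTA implements it, and all function names and numbers below are made up.

```python
def retained_after_cutoff(flow_bytes, keep_pct=95.0):
    """Keep the largest flows until they cover keep_pct of all bytes.

    Assumed model of a top talker cutoff, not NTA's actual algorithm.
    """
    ordered = sorted(flow_bytes, reverse=True)
    target = sum(ordered) * keep_pct / 100.0
    kept, running = [], 0
    for size in ordered:
        if running >= target:
            break
        kept.append(size)
        running += size
    return kept


def average_per_poll(per_minute_pct, poll_minutes):
    """Collapse 1-minute utilization samples into one average per polling window."""
    windows = [per_minute_pct[i:i + poll_minutes]
               for i in range(0, len(per_minute_pct), poll_minutes)]
    return [sum(w) / len(w) for w in windows]


# Made-up byte counts for six flows seen in one interval
flows = [500, 300, 150, 30, 15, 5]
kept = retained_after_cutoff(flows)
print("share of bytes still visible:", sum(kept) / sum(flows))

# Made-up 1-minute utilization percentages with a short two-minute burst
samples = [10, 10, 10, 90, 90, 10, 10, 10, 10]
print("1-minute polling:", average_per_poll(samples, 1))
print("9-minute polling:", average_per_poll(samples, 9))
```

With this made-up data, the three smallest flows never reach the stored total, and the two-minute burst at 90% is averaged down to roughly 28% when the interface is polled every 9 minutes. Either effect alone is enough to keep the two charts from lining up exactly.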