Deciding What Flow Data to Keep

In a previous post I discussed why bandwidth monitoring and management are essential on networks that serve business-critical purposes while contending for resources with popular, bandwidth-intensive streaming services, YouTube being the reigning giant.

In this post I want to address a common need: efficiently managing the flow data that is being made available to your traffic monitoring system.

Let's assume that you already have flow-enabled network devices sending data to a flow collector, and that you use the flow data, and the traffic trends it reveals, to monitor network traffic and to establish QoS priorities for how packets make their way to endpoints on your network.
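To make that setup concrete, here is a minimal sketch of the collector side in Python: a UDP listener that decodes NetFlow v5 export records. The listening port (2055) and the v5 format are assumptions for illustration only; your devices may export on a different port or in a different format such as v9, IPFIX, or sFlow.

```python
# Minimal sketch of a flow collector: listens for NetFlow v5 export
# packets on UDP and decodes the per-flow records. Port 2055 is a
# conventional (not mandatory) export port; match it to whatever your
# exporters are actually configured to send to.
import socket
import struct

HEADER_FMT = "!HHIIIIBBH"                 # NetFlow v5 header: 24 bytes
RECORD_FMT = "!4s4s4sHHIIIIHHBBBBHHBBH"   # NetFlow v5 flow record: 48 bytes
HEADER_LEN = struct.calcsize(HEADER_FMT)
RECORD_LEN = struct.calcsize(RECORD_FMT)

def parse_v5(datagram):
    """Yield (src_ip, dst_ip, octets, src_port, dst_port) per flow record."""
    version, count = struct.unpack_from("!HH", datagram, 0)
    if version != 5:
        return
    for i in range(count):
        offset = HEADER_LEN + i * RECORD_LEN
        fields = struct.unpack_from(RECORD_FMT, datagram, offset)
        src, dst = socket.inet_ntoa(fields[0]), socket.inet_ntoa(fields[1])
        octets = fields[6]                # dOctets: total bytes in the flow
        sport, dport = fields[9], fields[10]
        yield src, dst, octets, sport, dport

if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 2055))          # listen where exporters send flows
    while True:
        data, exporter = sock.recvfrom(65535)
        for flow in parse_v5(data):
            print(exporter[0], *flow)
```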

You may already be facing a common workflow problem. The collector that receives flow data from your routing devices also processes that data for display in a dashboard or console, and it does this double duty against a single database: flow data is loaded into it and then served back up for statistical calculation.

During peak network use, a firehose of flow data comes into the collector in 1- to 5-minute bursts. Soon data packets are sitting in a queue as the collector's software processes share CPU cycles to handle their different work. Good collectors are designed to trade off access to system resources so that neither loading nor retrieval of data is shorted enough to impact the overall performance of the system, and the timeliness and accuracy of the information a user sees in the monitoring console are not compromised.
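The sketch below illustrates that double duty under assumed conditions; the database (SQLite in a file called flows.db), the 100,000-record ingest buffer, and the 1,000-record insert batches are arbitrary choices for the example. One thread drains a bounded queue into the database while console queries read from the same database behind the same lock.

```python
# Illustrative sketch of the "double duty" problem, not any vendor's
# implementation: one process both loads flow records into a database
# and answers console queries from that same database. A bounded queue
# sits between the receiver and the loader, so ingest bursts show up as
# queue depth rather than lost datagrams.
import queue
import sqlite3
import threading
import time

flow_queue = queue.Queue(maxsize=100_000)   # ingest buffer (tunable)
db = sqlite3.connect("flows.db", check_same_thread=False)
db_lock = threading.Lock()                  # loader and queries share one DB
db.execute("CREATE TABLE IF NOT EXISTS flows "
           "(ts REAL, src TEXT, dst TEXT, octets INTEGER)")

def enqueue(flow):
    """Called by the receiver with a (ts, src, dst, octets) tuple."""
    try:
        flow_queue.put_nowait(flow)
    except queue.Full:
        pass  # under sustained overload, something has to give

def loader():
    """Drain the queue in batches so inserts don't starve console queries."""
    while True:
        batch = [flow_queue.get()]
        while len(batch) < 1000:
            try:
                batch.append(flow_queue.get_nowait())
            except queue.Empty:
                break
        with db_lock:
            db.executemany("INSERT INTO flows VALUES (?, ?, ?, ?)", batch)
            db.commit()

def top_conversations(minutes=60, limit=10):
    """The console's view: byte totals per conversation over a recent window."""
    since = time.time() - minutes * 60
    with db_lock:
        return db.execute(
            "SELECT src, dst, SUM(octets) AS total FROM flows "
            "WHERE ts >= ? GROUP BY src, dst ORDER BY total DESC LIMIT ?",
            (since, limit)).fetchall()

threading.Thread(target=loader, daemon=True).start()
```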

Filtering Flow Data

Only so much collected flow data can be queued for loading into the database before the traffic statistics in the console begin to degrade in accuracy. If the queue holds 15 minutes' worth of flow data, for example, then certain views of the data (especially slices of the past hour) become inaccurate.

There are solutions to this problem. First, you can architect a flow collection and monitoring system that fully satisfies the CPU and input/output operations per second (IOPS) requirements of the peak workload. An Oracle-based system, for example, would cost roughly $60K per CPU to provision.

For most IT shops, that kind of cost for a traffic monitoring system is impossible to justify to management. These users need a collector and monitoring system that makes the intelligent trade-offs required to keep the console displaying reliably accurate information on the traffic that matters most. If, at peak network use, you can still see timely information on top conversations, you can settle for not seeing some other things.
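One illustrative form of that trade-off, assumed here for the sake of example rather than taken from any particular product, is to shed the stalest records from the ingest backlog first, so the console's most recent windows stay accurate even if older slices are incomplete. The 15-minute threshold echoes the queue example above.

```python
# Sketch of an assumed age-based filtering policy: when the backlog grows,
# discard queued records whose flow end time is already older than the
# console's shortest reporting window, since they can no longer improve
# the "most recent" views an operator is watching.
import time

MAX_RECORD_AGE_SECONDS = 15 * 60   # match the shortest console window

def keep_record(flow_end_ts, now=None):
    """Return True if the record is still fresh enough to be worth loading."""
    now = time.time() if now is None else now
    return (now - flow_end_ts) <= MAX_RECORD_AGE_SECONDS

def filter_backlog(records, now=None):
    """Drop stale records from a queued batch before loading it."""
    return [r for r in records if keep_record(r["end_ts"], now)]
```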

SolarWinds Network Traffic Analyzer, for example, is an effective NetFlow analyzer that uses a data aggregation strategy to ensure that current information on top conversations, endpoints, and applications is always available in the monitoring console.
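The general idea behind that kind of aggregation can be sketched in a few lines; this is a generic illustration, not SolarWinds' implementation. Running per-conversation byte counts are kept in memory as records are decoded, so the top talkers are always current even while raw records are still queued for the database.

```python
# Generic sketch of a top-conversations aggregation kept alongside the
# raw-record pipeline, so "top talkers" stays current during ingest bursts.
from collections import Counter

class TopConversations:
    """Running per-conversation byte counts, independent of the DB queue."""

    def __init__(self, top_n=10):
        self.top_n = top_n
        self.byte_counts = Counter()   # (src, dst) -> bytes seen so far

    def observe(self, src, dst, octets):
        """Update the counter as each flow record is decoded."""
        self.byte_counts[(src, dst)] += octets

    def snapshot(self):
        """Current top talkers, available even while the raw backlog is deep."""
        return self.byte_counts.most_common(self.top_n)

    def reset(self):
        """Start a new reporting interval (for example, every 1-5 minutes)."""
        self.byte_counts.clear()
```

In practice such counters would be reset or decayed at each reporting interval so that "current" always reflects the most recent window.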
