For many engineers, operators, and information security professionals, traffic flow information is a key element to performing both daily and long-term strategic tasks. This data usually takes the form of NetFlow version 5, 9, 10, and IPFIX as well as sFlow data. This tool kit is widely utilized, and enables an insight into network traffic, performance, and long-term trends. When done correctly, it also lends itself well to security forensics and triage tasks.
Having been widely utilized in carrier and large service provider networks for a very long time, this powerful data set has only begun to really come into its own for enterprises and smaller networks in the last few years as tools for collecting and, more importantly, processing and visualizing it have become more approachable and user-friendly. As the floodgates open to tool kits and devices that can export either sampled flow information or one to one flow records, more and more people are discovering and embracing the data. What many do not necessarily see, however, is the correlation of this flow data with other information sources, particularly SNMP-based traffic statistics, can make for a very powerful symbiotic relationship.
By correlating traffic spikes and valleys over time, it is simple to cross-reference flow telemetry and identify statistically diverse users, applications, segments, and time periods. Now, this is a trivial task for a well-designed flow visualization tool. It can be accomplished without even looking at SNMP flow statistics. However, where it provides a different and valuable perspective is in the valley time periods when traffic is low. Human nature is to ignore that which is not out of spec, or obviously divergent from the baseline. So, the key is in looking at lulls in interface traffic statistics. View these anomalies as one would a spike, and mine flow data for pre-event traffic changes. Check TCP flags to find out more intricate details of the flows (note: this is a bit of a task as it entails adding TCP flags as they are exported as a numerical value in NetFlow v5 and v9, but they can provide an additional view into other potential issues). Conversely, the flags may also be an indicator into soft failures of interfaces along a path, which could manifest as SNMP interface errors that are exported and can be tracked. Think about the instances where this may be useful: soft failures. Soft failures are notoriously hard to detect, and this is a step in the right direction to doing so. Once this kind of mentality and correlation is adopted, adding in even more data sources to the repertoire of relatable data is just a matter of consuming and alerting on it. This falls well within the notion and mentality of looking at the network and systems as a relatable ecosystem, as mentioned in this post. Everything is interconnected, and the more expansive the understanding of one part, the more easily it can be related to other, seemingly “unrelated” occurrences.
This handily accomplishes two important tasks: building a relation experience table in an engineer or operators mind, and, if done correctly, a well-oiled, very accurate, efficient, and documented workflow of problem analysis and resolution. When this needs to be sold to management, which will need to occur in many environments, proving out that most of these tracked analytics can be used in concert with each other for a more complete, more robust, more efficient network monitoring and operational experience may need some hard deliverables, which can prove challenging. However, the prospect of “Better efficiency, less downtime” is typically enough to get enough interest in at least a few conversations.