IP-based networks are more complex today than ever before. Not only do they now carry voice, video, and data, but complications like dwindling IPv4 resources, sprinkles of IPv6, and constant changes in topology can make them seem almost unmanageable. Tactical networks have all of these complexities, with the added pressures of needing to be deployed quickly, by a sometimes untrained and ever-changing staff, and with consequences of failure measured in more important things than lost packets…
The good news is that there are some best practices you can adopt when it comes to troubleshooting issues within these tactical networks. I’ve developed this list of practices after working with several large tactical networks, including some within the US Air Force, the Special Operations Command (SOCOM), the US Army’s Warfighter Information Network-Tactical (WIN-T), the Navy’s Littoral Combat Ship (LCS) systems, and many others. I certainly can’t take much, if any, of the credit: those folks really know their stuff and have shown me many things over the years.
The first tactic that I find to be invaluable when it comes to managing these networks is documentation. Keeping an updated diagram of the network (whether it’s in Visio, in a notebook, or drawn in the sand) is the first and foremost step in speeding up the troubleshooting process. There are some great applications like LANsurveyor that can make this a lot easier.
Next, you need to be able to troubleshoot from different perspectives on the network. If you’re in DC and a user in Dallas is complaining about the performance of a web application based in San Jose, it’s going to be hard for you to troubleshoot if your only perspective is that of your NMS in DC. This is where technology like IP SLA comes in. IP Service Level Agreements, or IP SLA, is a technology built into Cisco IOS that allows you to have the routers and switches already deployed in your network generate test traffic and measure things like latency, jitter, and packet loss from their own vantage points. There are some great tools out there that allow you to easily leverage IP SLA - one being the SolarWinds IP SLA Monitor - which is free.
Another best practice is to monitor and analyze your network traffic. Traffic monitoring can typically be done through an SNMP-based application like the Network Performance Monitor within the Engineer's Toolset, the Orion Network Performance Monitor, or even an open source application like MRTG. What you're looking for is an application that can easily monitor bandwidth usage on your pipes, both in real time and over time, and that makes it easy to see trends and analyze history.
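If you're curious what these SNMP-based tools are doing under the hood, here's a rough sketch in Python. It assumes you've already polled an interface's octet counter (for example ifInOctets) twice and know the link speed; the function names and sample numbers are mine, purely for illustration, and no actual SNMP polling happens here.

```python
def counter_delta(previous, current, bits=32):
    """Difference between two counter readings, tolerating a single wrap.

    SNMP interface counters are unsigned and roll over to zero, so a plain
    subtraction goes negative after a wrap. Modular arithmetic fixes that,
    as long as the counter wrapped at most once between polls.
    """
    return (current - previous) % (2 ** bits)


def utilization_percent(prev_octets, curr_octets, interval_seconds,
                        link_bps, bits=32):
    """Average link utilization (percent) between two polls."""
    if interval_seconds <= 0 or link_bps <= 0:
        raise ValueError("interval_seconds and link_bps must be positive")
    bits_transferred = counter_delta(prev_octets, curr_octets, bits) * 8
    return 100.0 * bits_transferred / (interval_seconds * link_bps)


if __name__ == "__main__":
    # Hypothetical readings taken 300 seconds apart on a 1 Mbps link.
    print(utilization_percent(0, 3_750_000, 300, 1_000_000))
```

On fast links a 32-bit counter can wrap more than once between polls, which is why pollers generally prefer the 64-bit counters (pass bits=64 above) or a shorter polling interval.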
For traffic analysis, there are lots of ways you could go, ranging from probes to protocol analyzers, but IMO you can't beat NetFlow for what it gives you and how easy it is to start leveraging. I've talked a lot about this in the past, both here on the blog and within our tech talks and webcasts, so I won't get into a lot of detail. Still, it's hard to argue with a feature that's available for free within the routers and switches you've already deployed and that can tell you who's using your bandwidth and for what.
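To make the "who's using your bandwidth and for what" idea a bit more concrete, here's a minimal sketch of the kind of roll-up a NetFlow analyzer does once flow records have been collected. The record layout (source, destination, destination port, octets) and the sample addresses are assumptions I made up for illustration; this isn't a NetFlow parser and doesn't reflect any particular product's data model.

```python
from collections import Counter


def top_by(flows, key_index, n=3):
    """Sum octets per key and return the n largest as (key, octets) pairs.

    Each flow is a tuple: (src_ip, dst_ip, dst_port, octets).
    key_index picks the field to group by: 0 for source, 2 for port.
    """
    totals = Counter()
    for flow in flows:
        totals[flow[key_index]] += flow[3]
    return totals.most_common(n)


if __name__ == "__main__":
    # Hypothetical flow records, as if already exported and decoded.
    flows = [
        ("10.1.1.5", "10.9.0.20", 443, 120_000),
        ("10.1.1.7", "10.9.0.20", 80, 40_000),
        ("10.1.1.5", "10.9.0.30", 443, 300_000),
        ("10.1.2.9", "10.9.0.20", 25, 15_000),
    ]
    print("Top talkers:", top_by(flows, key_index=0))
    print("Top ports:  ", top_by(flows, key_index=2))
```

Grouping by source answers the "who", and grouping by destination port is a rough stand-in for the "what", since real analyzers map ports and protocols to application names.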
I could talk about this forever, but I know that I don't like reading really long blog entries and I'm going to assume that you don't either. Ping me sometime if you want more data and I'll expand on this topic within a webcast soon.