The Role of a Network Admin in Application Performance Management

     When it comes to application performance management, the main focus is the application, because that is what end-users care about. They do not care about network performance, server performance, or any other metric. They only care about the application they are trying to use. Development teams and server admins are usually the main parties involved in monitoring and managing applications, but the reality is that all of this runs over a network. To take performance management to the next level, it only makes sense to bring all parties to the table.

First Steps: The Internal Network

     When examining the role of the network in an application’s performance, the first step is the internal network. If your application is internal, this may be the only step you need to focus on. Whether the traffic is east-west between your server and user environments or north-south to your firewalls, network monitoring needs to be involved. Let’s see if this sounds familiar...

Application is having issues -> Dev team receives a trouble ticket -> They consult with the server team to find the root cause -> Once all of their options are exhausted, the network team is consulted as the next step.

     After this long process, when the network team is finally consulted, they find a small issue on the uplink switch. After a quick fix, everything is back to normal. That is where the problem lies for a lot of environments: a tiered approach adds unneeded steps to troubleshooting in application performance management. I prefer to think of it as a hub-and-spoke environment, with the application as the hub and each supporting team as a spoke.

[Diagram: hub-and-spoke model, with the application as the hub and each supporting team as a spoke]

     With this model, all parties are included in the process of application performance management. Any of them could be the first party alerted to an issue and could address the problem directly. This sets a solid foundation for ensuring uptime and optimal performance for an application.
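One way to picture the hub-and-spoke idea is alert routing: each failed check goes straight to the team that owns that layer, instead of passing through a fixed escalation chain. The following is only a minimal sketch under stated assumptions. The team names, the `OWNERS` mapping, and the `CheckResult` type are hypothetical and do not come from any particular monitoring product.

```python
from dataclasses import dataclass

# Hypothetical mapping of monitored layers to the team that owns them.
OWNERS = {
    "application": "dev",
    "server": "server",
    "network": "network",
}


@dataclass
class CheckResult:
    layer: str     # one of the keys in OWNERS
    name: str      # human-readable name of the check
    healthy: bool  # True if the check passed


def route_alerts(results):
    """Group failed checks by owning team so each team is alerted directly."""
    alerts = {}
    for result in results:
        if not result.healthy:
            team = OWNERS.get(result.layer, "unassigned")
            alerts.setdefault(team, []).append(result.name)
    return alerts


if __name__ == "__main__":
    # The scenario from the article: the application and server look fine,
    # but the uplink switch has a problem.
    results = [
        CheckResult("application", "HTTP error rate", True),
        CheckResult("server", "CPU and memory", True),
        CheckResult("network", "uplink switch errors", False),
    ]
    print(route_alerts(results))
```

In the scenario described earlier, the network team would be the first party alerted, rather than the last one consulted.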

Monitoring External Network Performance

     Creating a strong structure for application performance management is the first step. Once this is mastered on the internal network, the network team has an additional step: monitoring external network performance. This is why it is crucial that the network team is included in monitoring applications. For example, there could be a routing issue between ISPs causing latency for external users accessing your application. Server analytics would show performance within acceptable tolerances, and the development team would not see any errors either, yet users could be having a poor experience with the application. If proper steps were taken to monitor the external network, this issue could be detected, resolved, and communicated to all affected users. One way to monitor the external network is with toolsets that run tests from multiple endpoints all over the world. The reported stats could be anything from ping latency to overall bandwidth from each of these locations.
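As a rough illustration of the kind of measurement such a toolset might collect from each location, the sketch below times TCP connections to an application endpoint and summarizes the results. It is not how any specific product works. TCP connect time is used as a stand-in for ping latency because ICMP usually needs elevated privileges, and the host name `app.example.com`, the port, and the function names are all assumptions made for the example.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, timeout=2.0):
    """Return the time in milliseconds to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return None


def summarize(samples):
    """Summarize latency samples; a None sample counts as a failed probe."""
    ok = [s for s in samples if s is not None]
    return {
        "sent": len(samples),
        "failed": len(samples) - len(ok),
        "min_ms": min(ok) if ok else None,
        "median_ms": statistics.median(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }


def probe(host, port, count=5):
    """Run several connection probes against one endpoint and summarize them."""
    return summarize([tcp_connect_latency_ms(host, port) for _ in range(count)])


if __name__ == "__main__":
    # Hypothetical endpoint; run this from each external vantage point.
    print(probe("app.example.com", 443))
```

Running the same probe from several external locations and comparing the summaries is what would expose a problem like the ISP routing issue above, which server-side metrics alone would not show.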

The fact is that involving the network team in application performance management is a no-brainer. Shorter troubleshooting and problem resolution times are two things that any technical team could get behind. Next time you are planning a management and monitoring structure, be sure to focus on the network as well as the application itself.

  • It seems like so often an issue would be reported, and when I investigated, I couldn't reproduce it. Then I'd learn that I could reproduce it, but only with IE or on Windows. In one particular case, I chased an issue for quite a while with that finding and ultimately learned that Internet Explorer was refusing a cookie due to an underscore in a URL. In this one case, IE was following whatever RFC applied. Microsoft actually followed a standard, and it resulted in a load balancer cookie being refused.

    Ironically, that load balancer (Cisco CSS) was finicky about the type of browser used to manage it. It required IE6 with Java 6u23, if I'm not mistaken. Ugh! I kept a virtual machine around for management until I realized the CLI was actually easier.

  • Sometimes I would get onto the hosts at both ends of a conversation and look at netstat output to demonstrate that a connection from IP address X on port Y was getting all the way across the network. In Windows, adding the -b switch was helpful to show which application generated the traffic. Everyone always asked for packet captures when their application did not work as expected, but couldn't describe what normal looked like. Oftentimes the packet capture was overkill or not as easily obtained as everyone hoped.

  • I've worked in several environments where the immediate response was "It's the network" and you spend a whole lot of time testing that theory before anything else can be done. I say "testing that theory" because that's my approach. So often we network guys get offended and set out to prove it's not the network. I always assume that I'm working as part of a team and that we are all just looking for the solution, no matter where it lies.

    With SolarWinds in the toolbox, it's best to get all the players - dev, applications, engineering, systems and yes, even the customers - looking at the tools for trends, issues, conditions, etc. With everyone having access to the data, there is a better opportunity to collaborate and look for solutions rather than point fingers. The more people buy into the monitoring, the more helpful they will be. After all, when you think about it, all of the parts and pieces interact and can affect each other. The more we improve individually, the more we improve as a group - and the more we improve as a group, the more we each improve individually.

    Can you tell "Team" means a lot to me?

  • Interesting article. This is a real challenge for the seasoned, set-in-his/her-ways, network admin. Myself, I ran into Support after I dropped out of college when I realized that my brain was not built to be a developer. The convergence of Development and Operations is widely documented and advertised. It is inevitable. Network Admins need to know their role in the new world order.

  • I've been trying a new approach: when there's an issue, I call it an "Infrastructure Problem". I've been trying to train our helpdesk and our new NOC to call it that rather than a "Network Problem". I've had some success, but unfortunately management, vendors, engineering and operations still say there's a network problem. My take is that rather than pointing the finger at the network, like everyone in IT does, you say there's a problem in the infrastructure. By doing this, you're acknowledging that there is a problem in your environment, you're being IT agnostic, and you're not pitting one team against another. When it turns out to be a network, server or database problem, you can then correctly call it what it is.
