This is part three of this series. In parts one and two, we covered some of the basics. In this post, we will dig into the benefits of application performance monitoring (APM), look at a few examples, and begin looking at what APM in an Agile environment means.
With everything we have discussed thus far, the benefits of APM may or may not be apparent. Hopefully they are obvious, but in case they are not, we will discuss them briefly.
Based on some of the comments on the previous posts, there seems to be a common theme: APM is not easy to accomplish. That tends to explain why many either never start or give up partway through an APM implementation.
I would personally agree that it is not an easy feat, but it is well worth sticking with. There will be pain and suffering along the way, but in the end, your application's performance will be substantially more satisfying for everyone. You may even uncover underlying infrastructure issues that had gone unnoticed and only become apparent once APM is in place. So, as for the greatest benefit, I would say it is simply having followed through on your APM implementation.
Let's now look at just a few examples of where APM would be beneficial in identifying a true performance degradation.
Users are reporting that your company’s website is very slow or timing out. I am sure this scenario rings a bell with most readers, and more than likely it is the only information you have been given. So where do we start looking? I bet most will begin with CPU, memory, and disk utilization. Why not? It seems logical, except that nothing there appears to be an issue. Because we have implemented an APM solution, however, in this scenario we were able to trace the issue to a bad SQL query against our database. The APM solution identified the query, showed us where the problem lay, and gave us some recommendations on how to resolve it.
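To make the slow-query scenario a little more concrete, here is a minimal sketch of the kind of timing an APM agent might do around database calls. It is an illustration only, not how any particular APM product works: the `timed_query` helper, the `slow_log` list, and the 0.5-second threshold are all assumptions made up for this example, and it uses Python's built-in `sqlite3` module purely so that it runs on its own.

```python
import sqlite3
import time

# Hypothetical threshold: anything slower than this is treated as "slow".
SLOW_QUERY_THRESHOLD_SECONDS = 0.5


def timed_query(conn, sql, params=(), threshold=SLOW_QUERY_THRESHOLD_SECONDS, slow_log=None):
    """Run a query, measure how long it takes, and record it if it is slow."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start

    # Record the statement and its duration so someone can review it later.
    if slow_log is not None and elapsed >= threshold:
        slow_log.append({"sql": sql, "seconds": elapsed})
    return rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders (total) VALUES (?)", [(i * 1.5,) for i in range(1000)])

    slow_queries = []
    timed_query(conn, "SELECT * FROM orders WHERE total > ?", (100,), slow_log=slow_queries)
    for entry in slow_queries:
        print(f"slow query ({entry['seconds']:.3f}s): {entry['sql']}")
```

The point is that CPU, memory, and disk graphs can all look healthy while a single statement like this is responsible for the slowness; you only see it if something is measuring at the query level.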
Now, let us assume that we were getting reports of slowness once again on our company’s website. But this time our application servers appear to be fine and our APM solution is not showing us anything regarding performance degradation. So, we respond with the typical answer, “Everything looks fine.”
A bit later, one of your network engineers reports that there is a high amount of traffic hitting the company’s load balancers. A DDoS attack is causing them to drop packets to anything behind them. And guess what? Your application’s web servers sit directly behind the affected load balancers, which would explain the reports that you received earlier. In this case, our APM was configured to monitor nothing other than our application servers, so we never saw anything out of the norm.
This is a good example of why you should monitor not only your application servers, but also every external component that in some way affects the performance users experience with your application. Had we been doing so, we could at the very least have correlated the reports of slowness with the high amount of traffic hitting the load balancers. In addition, our APM was not configured to monitor the connection metrics on our application servers. Had it been, we should have noticed that those metrics were not reporting as normal.
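As a rough illustration of what "not reporting as normal" could mean for connection metrics, the sketch below compares a current connection count against a recent baseline. Everything here is hypothetical: the `is_anomalous` function, the three-standard-deviation cutoff, and the sample numbers are assumptions for the sake of the example, not values from a real APM tool or a recommendation for production thresholds.

```python
from statistics import mean, stdev


def is_anomalous(baseline, current, max_sigma=3.0):
    """Return True if `current` deviates from the baseline samples by more
    than `max_sigma` standard deviations (a simple, assumed heuristic)."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")

    mu = mean(baseline)
    sigma = stdev(baseline)

    # A perfectly flat baseline has no spread, so any change counts as a deviation.
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > max_sigma


if __name__ == "__main__":
    # Made-up samples of active connections on one application server.
    normal_connections = [100, 102, 98, 101, 99]

    # During the attack, the load balancers drop packets, so far fewer
    # connections reach the server than usual.
    print(is_anomalous(normal_connections, 20))
    print(is_anomalous(normal_connections, 101))
```

In the scenario above, a sudden drop in connections reaching the application servers would have been a hint that the problem sat in front of them rather than in the application itself.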
Conclusion of After-the-Fact Monitoring
If you recall, I mentioned in the first article that I would refer to the traditional APM methodology as “After the fact implementation.” This is more than likely the most typical scenario, and it is also what makes implementation such a burden and a pain. In the next post of this series, we will begin looking at implementing APM in an Agile environment.