Network administrators have never been confined to working on networks alone. To cut costs, many small and medium-sized organizations prefer to have a single admin take care of both the network and the applications. But even with one person taking care of it all, enterprises still ask for 99.9% uptime!
Application monitoring can sometimes become a nightmare for admins. To ensure application performance, you need to keep an eye on both the server and the application, in addition to the network. On the server side, you need to track CPU, disk performance, system load, memory, packet loss, response time, etc. On the application side, you need to track database connections, running processes and threads, service status, number of queries per second and more. It does not end with collecting stats, either. You also need to be able to correlate the stats collected from different elements and quickly spot issues before users start complaining. Working from guidelines alone may not always be effective and can be time consuming. Let us look at the factors that can help ensure better application performance.
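As a rough illustration of correlating stats from different elements, the sketch below computes a Pearson correlation between a server metric and an application metric sampled at the same times. The sample values and variable names are made up for illustration, and a real monitoring tool would do this over much longer series.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / sqrt(var_x * var_y)


# Hypothetical samples taken at the same timestamps (illustrative only).
cpu_percent = [20, 35, 50, 65, 80, 95]
response_ms = [110, 120, 150, 210, 320, 600]
queries_per_sec = [40, 42, 39, 41, 40, 43]

r_cpu = pearson(cpu_percent, response_ms)      # server metric vs. response time
r_qps = pearson(queries_per_sec, response_ms)  # application metric vs. response time
print(f"CPU vs response time: {r_cpu:.2f}")
print(f"Queries/sec vs response time: {r_qps:.2f}")
```

A coefficient close to 1 suggests the two metrics rise together and deserve a closer look. It does not prove that one causes the other.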
Real-time visibility into critical elements:
Business-critical applications like CRM, CITRIX, ERP, etc., need continuous monitoring of both server and application elements. Critical applications are used by hundreds of users across the organization, and processes such as adding, modifying or deleting data, updates and backups will be running at all times. To ensure uptime, you need to make sure that the server is never overloaded. Any lack of resources on the server will create a bottleneck and slow the application down. Hence you need holistic, real-time visibility into your critical application servers.
How can you ensure zero downtime? Proactive monitoring is the answer. You will need a monitoring tool with good analytical and alerting functionality, so that you do not end up waiting for end users to point out the problem. Analyze your applications and their normal behavior, and set thresholds that alert you before a change in behavior turns into a problem. Make sure the alerts can reach you at all times, wherever you are. Some monitoring tools also set thresholds automatically based on observed behavior and alert on those thresholds. And when you create your alerts, remember not to overdo it, or you will find yourself with a flooded mailbox.
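One way to picture behavior-based thresholds and alert throttling is the minimal sketch below. It assumes a threshold of the historical mean plus three standard deviations and a 15-minute cooldown between alerts. The multiplier, the cooldown, the sample data and the `AlertGate` name are all illustrative assumptions, not the behavior of any particular monitoring tool.

```python
from statistics import mean, stdev


def baseline_threshold(history, k=3.0):
    """Threshold derived from past samples: mean + k standard deviations."""
    return mean(history) + k * stdev(history)


class AlertGate:
    """Raises an alert on a threshold breach, but at most once per cooldown."""

    def __init__(self, threshold, cooldown_seconds=900):
        self.threshold = threshold
        self.cooldown_seconds = cooldown_seconds
        self._last_alert_at = None

    def should_alert(self, value, now):
        if value <= self.threshold:
            return False
        recently_alerted = (
            self._last_alert_at is not None
            and now - self._last_alert_at < self.cooldown_seconds
        )
        if recently_alerted:
            return False  # still in cooldown: suppress to avoid a flooded mailbox
        self._last_alert_at = now
        return True


# Hypothetical response times (ms) observed during normal operation.
history_ms = [120, 130, 125, 118, 135, 128, 122, 131]
gate = AlertGate(baseline_threshold(history_ms), cooldown_seconds=900)

# (value in ms, timestamp in seconds) pairs arriving from a poller.
samples = [(126, 0), (400, 60), (420, 120), (410, 1000)]
decisions = [gate.should_alert(value, ts) for value, ts in samples]
print(decisions)  # [False, True, False, True]
```

The third sample breaches the threshold but falls inside the cooldown window, so it is suppressed. The fourth arrives after the cooldown has expired and alerts again.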
Learn from Outages:
Outages are bound to happen in spite of your best efforts. Understand why an outage occurred and why you did not see it coming. After an outage, most admins leave their monitoring tool as it is, believing that the tool simply could not have seen it coming. That is not always the case. Look at the thresholds and alerts you created in your monitoring tool and try to understand why the scenario was missed. Create alerts so that if such a scenario recurs, you are not caught off guard. You should also think about other problems that could occur and create alerts for them.
Read your Reports:
Reporting helps you not only identify problems but also understand the baseline behavior of your systems. Once you know the baseline behavior, you will know when something is nearing failure. Remember to go through your daily reports, and ensure you are getting reports on critical factors for both the application and the server hosting it.
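To make the idea of reading reports against a baseline concrete, here is a small sketch that summarizes a day of samples and flags the statistics that have drifted above a baseline report. The metric (database connections), the sample values and the 25% tolerance are assumptions chosen only for illustration.

```python
import math
from statistics import median


def percentile(samples, pct):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """Condense raw samples into the figures a daily report might show."""
    return {
        "min": min(samples),
        "median": median(samples),
        "p95": percentile(samples, 95),
        "max": max(samples),
    }


def drifted(today, baseline, tolerance=0.25):
    """Names of statistics that exceed the baseline by more than `tolerance`."""
    return [name for name in baseline if today[name] > baseline[name] * (1 + tolerance)]


# Hypothetical database connection counts (illustrative only).
baseline_report = summarize([40, 42, 45, 41, 43, 44, 46, 42, 41, 60])
today_report = summarize([48, 55, 61, 58, 63, 57, 66, 59, 62, 95])

flags = drifted(today_report, baseline_report)
print(flags)  # ['median', 'p95', 'max']
```

Here the minimum is still within tolerance, but the median and the upper end of the range have moved well above the baseline. That is the kind of trend a daily report can reveal before it turns into an outage.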
Application maintenance and uptime can be easy if you remember the factors we discussed. Not only do you need to monitor your important assets and critical factors, you also need to understand the difference between normal and problematic behavior. All of this becomes easier still if you have the right tool for server and application monitoring in your network.