Log Management: Your Obvious Choice for Capacity Planning and Optimization. Wait, What?

Recently, I wrote an article titled “Life Cycle Monitoring: Why an Ounce of Prevention Is Worth a Pound of a Cure,” borrowing a phrase coined by the great Benjamin Franklin. In the article, I highlighted the value, efficiency, and logic of putting more time into a proper capacity planning and optimization process for all types of IT environments. Ask most IT professionals how they use log management tools, and the first thing that comes to mind is troubleshooting. They’d talk about how they use their log management tool to debug applications in development or how they pinpoint the root cause of infrastructure or application performance issues in a production environment.

So why is this article titled “Log Management: Your Obvious Choice for Capacity Planning and Optimization”? Whether you’re a DevOps engineer/SRE, application developer, level 2/3 support, or the technical owner of critical business applications, your job is to provide available and highly performant systems. You could use your log management tool to help you quickly identify why an application or infrastructure element isn’t performing, and you’d be spot on. Still, you can accomplish so much more than troubleshooting with log management. Wouldn’t it be better to use it in advance of issues and let it provide insights into potential problems before they impact your application, users, customers, or business?

Using log analytics to provide insights into how your application will be performing a week, a month, or a quarter down the road enables you to avoid resource and performance issues. You may already see the value of pairing time series metrics, which tell you what is causing application performance issues, with infrastructure and application logs, which tell you why those issues are occurring. If so, you’ll understand the value of applying the same pairing to planning resource capacity and optimizing your IT environment. Planning for growth, identifying workload behavior changes, predicting future resource needs, maximizing your IT investment, and avoiding downtime are just a few reasons it makes sense to pair your metrics with log analytics.

Log Management 101

Let’s take a quick moment to cover the basics of most log management tools. If you’re a logging expert, feel free to take a short nap during this part of the article. Log data is created by a myriad of services, hosts, and applications. It comes from multiple sources, arrives in various formats, and, depending on the environment, can add up to massive amounts of data. Everything happening within the infrastructure, the application, and its services is buried in the log data. Job number one for log management tools is to aggregate these disparate logs into a single system, parse them, standardize the format, and index them. If your log management tool is worth its salt, you’ll have all your log data in one place in a standard format, and the data will be set up so you can execute powerful searches to quickly get to the pertinent data.

Because you may be storing anywhere from a small amount of log data to terabytes, or even petabytes, how effectively you store it is also a critical part of log management. Being able to scale for short-term spikes and long-term growth is important, and it also forces a decision about log retention. How much log data you retain, and for how long, will affect what your log management tool can do for capacity planning and optimization. This is why the tool must do a great job on the front end when it ingests and stores the logs.
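To make the “standardize the format” step concrete, here is a minimal Python sketch. It assumes two made-up log line formats and a made-up `normalize` helper, and it assumes every timestamp is already in UTC. It isn’t how any particular log management product parses data; it only illustrates the idea of mapping disparate lines onto one common record shape.

```python
import re
from datetime import datetime, timezone
from typing import Dict, Optional

# Hypothetical format A: "2024-05-01T12:00:00Z ERROR payment timeout"
ISO_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)
# Hypothetical format B: "[01/May/2024 12:00:05] warn: disk at 91%"
BRACKET_LINE = re.compile(
    r"^\[(?P<ts>\d{2}/[A-Za-z]{3}/\d{4} \d{2}:\d{2}:\d{2})\]\s+(?P<level>\w+):\s+(?P<msg>.*)$"
)


def normalize(line: str, source: str) -> Optional[Dict[str, str]]:
    """Map a raw log line onto one common record shape, or None if unrecognized."""
    line = line.strip()
    match = ISO_LINE.match(line)
    if match:
        ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S")
    else:
        match = BRACKET_LINE.match(line)
        if not match:
            return None  # unknown format: a real tool would keep the raw line
        ts = datetime.strptime(match["ts"], "%d/%b/%Y %H:%M:%S")
    return {
        # Assumption: both sources log in UTC.
        "timestamp": ts.replace(tzinfo=timezone.utc).isoformat(),
        "level": match["level"].upper(),
        "source": source,
        "message": match["msg"],
    }
```

Once every line shares the same fields, searching, counting, and trending across sources becomes a matter of filtering on `timestamp`, `level`, and `source` rather than writing one query per format.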

Enough, Tell Me How to Leverage Log Management for Capacity Planning and Optimization

Traditionally, time series metrics and logs have worked together to help IT professionals quickly pinpoint areas of poor application performance, assess an application’s health, and diagnose a variety of issues, from application errors to misconfigurations and resource challenges. Many companies used to rely on one or the other, but more and more of them now realize the power of partnering the what (metrics) with the why (logs) to maximize their effectiveness in troubleshooting applications and infrastructure.

Using log management and analytics for capacity planning and optimization takes the same haystack of log data, sifts through it, and provides multiple ways to point out patterns, trends, and insights into what will happen, not just why something has happened. It’s not much different from the use case for time series metrics; it just adds a source of insights. Intelligent alerting, powerful graphs, charts of metrics, logs of infrastructure or application components, and system-level insights can make you aware of a problem before it becomes one. Log analytics provides insights you can’t get from metrics alone, and these insights enable you to make better decisions in multiple ways. Log analysis can be one of the best ways to understand user behavior and its impact on current and future resource utilization and application performance. Underutilized resources are just as much of a problem as oversubscribed ones. A robust log analytics tool will also allow you to set up proactive alerts based on trends. A developer who spends more time improving the application than troubleshooting performance issues costs less, accelerates application release delivery, and keeps users and customers happy. Sound familiar? The benefits of using log analytics include lowering your costs, mitigating risk, maximizing your IT investment, and reducing business-crippling downtime and slowdowns.
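As a rough illustration of a trend-based alert, the Python sketch below fits a straight line to a daily series pulled from logs (say, request counts or disk usage reported in log events) and estimates how many days remain before a capacity threshold is crossed. The function names and the linear-growth assumption are mine, chosen for simplicity. Real workloads are often seasonal or bursty, so treat this as a starting point and not as the forecasting method any specific tool uses.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of values[i] ~ slope * i + intercept (i = day index)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_threshold(values: Sequence[float], threshold: float) -> Optional[float]:
    """Days from the last observation until the fitted line reaches threshold.

    Returns None when the trend is flat or decreasing, and 0.0 when the
    fitted line has already reached the threshold.
    """
    slope, intercept = fit_linear_trend(values)
    if slope <= 0:
        return None
    last_day = len(values) - 1
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - last_day)
```

For example, a series of `[10, 12, 14, 16, 18]` against a threshold of `30` yields a fitted growth of 2 units per day and an estimate of 6 days of headroom. A proactive alert could fire whenever that estimate drops below your lead time for adding capacity.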

Partnering time series metrics with log events is proving to be a troubleshooting partnership made in heaven; this same partnership can help you avoid future downtime, maximize your IT investment, and significantly reduce the time you spend fighting fires.


SolarWinds® Loggly (software as a service-based) and SolarWinds Log Analyzer (on-premises, Orion® Platform module) are powerful log management and analytics tools you can partner with your time series metrics for troubleshooting, capacity planning, and optimization. Both Loggly and Log Analyzer are available for a 30-day free trial.
