I’ve recently been tasked with assessing the implementation of a log correlation product for an organization. I won’t mention the product name, but its consumption-based licensing model is infamous throughout IT shops far and wide. I don’t usually do this kind of work, but the organization wants to use it as a SIEM, including the configuration of some automated security alerting. Initially this seemed to be a reasonable request, but the project has me feeling like an IT Sisyphus trying to unravel the world’s largest rubber band ball. It’s even causing me to reevaluate many of my previously held ideas about visibility.


One of the most surprising aspects of this endeavor is my discovery of an entire cottage industry of training and consulting dedicated to the deployment, maintenance and repair of this product across enterprises. But if it’s just log correlation, why is it such a big lift? Well, now it’s called “machine generated big data” and you’re supposed to be looking for “operational intelligence.” That means more complexity and a bigger price tag.


I remember the good old days, when all a Unix engineer needed was a Syslog server and some regular expression Judo. A few “for” loops in the crontab and you had most of the basic alerting in place to keep your team informed. But lately log correlation and alerting seem like a hedonic treadmill. No matter how much data you have, no matter how many enhancements to alerting you make, you never seem to make any progress.
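For anyone who never lived through those days, here is roughly what that regular expression Judo amounted to. This is only a minimal sketch in Python rather than shell, and everything specific in it is an assumption made for illustration: the pattern names, the sample log lines, and the thresholds are hypothetical, and real patterns depend entirely on what your own systems write to Syslog.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; tune these to your own log sources.
PATTERNS = {
    "failed_login": re.compile(r"Failed password for (?:invalid user )?\S+ from (\S+)"),
    "disk_full": re.compile(r"No space left on device"),
}


def scan(lines):
    """Count how many log lines match each named pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def alerts(counts, thresholds):
    """Return the sorted names of patterns whose count reached its threshold."""
    return sorted(name for name, limit in thresholds.items() if counts[name] >= limit)


# Made-up sample lines in a typical syslog layout.
SAMPLE = [
    "Jan 10 03:12:01 web1 sshd[411]: Failed password for root from 203.0.113.5 port 4242 ssh2",
    "Jan 10 03:12:04 web1 sshd[411]: Failed password for invalid user admin from 203.0.113.5 port 4243 ssh2",
    "Jan 10 03:13:00 web1 cron[99]: (root) CMD (run-parts /etc/cron.hourly)",
    "Jan 10 03:14:10 db1 kernel: write error: No space left on device",
]

if __name__ == "__main__":
    counts = scan(SAMPLE)
    for name in alerts(counts, {"failed_login": 2, "disk_full": 5}):
        print(f"ALERT: {name} seen {counts[name]} times")
```

Run from cron against the latest slice of a log file, something this small would cover a surprising share of basic alerting, which is the point of the nostalgia.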


Identifying feeds, filtering and normalizing the data, building and maintaining dashboards: in many organizations, running a log correlation and event monitoring system has morphed into a full-time job, usually for a team of people. But when the business is demanding increased efficiency from IT departments, how can they afford to dedicate staff to a task that doesn’t seem to add demonstrable value?


It’s time to take a step back and rethink what we’re trying to achieve with the collection of log and event data. At one time, along with everyone else, I thought visibility meant that I needed everything. But with consumption licenses and all the work that goes into normalizing and extracting intelligence, that’s probably unrealistic. Maybe it’s really about getting the right data, that which is truly relevant. Think of it like an emergency room. If I walk in complaining about chest pains, the nurse doesn’t ask me what my mother’s maiden name is or who received the Oscar for best actress last year. He or she is going to use a “fast and frugal” decision tree to ascertain my risk for a heart attack as quickly as possible.
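To make the emergency room analogy concrete for event data, a “fast and frugal” tree asks a short, fixed sequence of yes/no questions, and each question can end the decision on the spot. The sketch below is one possible shape, not a recommendation: the three cues, their order, the field names, and the four outcomes are all hypothetical choices made for illustration.

```python
def triage(event):
    """Classify an event with a fast-and-frugal tree: few cues, early exits.

    `event` is a plain dict; the keys used here are hypothetical.
    """
    # Cue 1: anything touching a privileged account exits immediately.
    if event.get("privileged_account", False):
        return "page"
    # Cue 2: traffic from an allowlisted source exits as noise.
    if event.get("source_allowlisted", False):
        return "ignore"
    # Cue 3 (last cue, two exits): is the target a critical asset?
    if event.get("critical_asset", False):
        return "ticket"
    return "log"


if __name__ == "__main__":
    print(triage({"privileged_account": True, "source_allowlisted": True}))
    print(triage({"source_allowlisted": True, "critical_asset": True}))
    print(triage({"critical_asset": True}))
    print(triage({}))
```

Notice how little data the tree needs: three fields per event. Anything that never feeds a cue like these is a candidate for not being collected at all.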


So if we want useful “operational intelligence” we need to reset our expectations, focusing on simple, cost-effective systems that will rapidly produce actionable information about events. We need to find a “middle way” for log correlation and event monitoring, understanding that the perfect shouldn’t get in the way of the good. We may miss some things, but by implementing an easy-to-manage system that works, we won’t miss everything.