Looking for SIEM Love in Some of the Wrong Places?

Good morning, Thwack!

I'm Jody Lemoine. I'm a network architect specializing in the small and mid-market space... and for December 2014, I'm also a Thwack Ambassador.

While researching the ideal sweet spot for SIEM log sources, I found myself wondering where and how far one should go for an effective analysis. I've seen logging depth discussed a great deal, but where are we with sources?

The beginning of a SIEM system's value is its ability to collect logs from multiple systems into a single view. Once this is combined with an analysis engine that can correlate these and provide a contextual view, the system can theoretically pinpoint security concerns that would otherwise go undetected. This, of course, assumes that the system is looking in all of the right places.
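To make that correlation idea concrete, here's a minimal sketch in Python. The event records, field names, and five-minute window are all invented for illustration; a real SIEM would be parsing syslog or CEF feeds, not hard-coded dictionaries. The idea is simply to join two log sources on a common key (source IP) and flag hosts that appear in both within a short window:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Invented sample events -- a real SIEM would parse these from live feeds.
firewall_denies = [
    {"src": "10.0.0.5", "time": datetime(2014, 12, 2, 1, 0, 5)},
    {"src": "10.0.0.9", "time": datetime(2014, 12, 2, 1, 2, 0)},
]
auth_failures = [
    {"src": "10.0.0.5", "time": datetime(2014, 12, 2, 1, 0, 30)},
]

def correlate(denies, failures, window=timedelta(minutes=5)):
    """Flag source IPs seen in both logs within `window` of each other."""
    failures_by_src = defaultdict(list)
    for event in failures:
        failures_by_src[event["src"]].append(event["time"])
    flagged = set()
    for deny in denies:
        for t in failures_by_src.get(deny["src"], []):
            if abs(t - deny["time"]) <= window:
                flagged.add(deny["src"])
    return flagged

print(correlate(firewall_denies, auth_failures))  # {'10.0.0.5'}
```

Neither log on its own says much; the join is where the contextual view comes from. And, as the paragraph above notes, the join can only cover the sources you actually feed it.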

A few years ago, the top sources for event data were firewalls, application servers and database servers. Client computers weren't high on the list, presumably (and understandably) because of the much larger amount of aggregate data that would need to be collected and analyzed. Surprisingly, IDS/IPS and NAS/SAN logs were even lower on the scale. [Source: Information Week Reports - IT Pro Ranking: SIEM - June 2012]

These priorities suggested a focus on detecting incidents that involve standard access via established methods: the user interface via the firewall, the APIs via the application server, and the query interface via the database server. Admittedly, these were the most probable sources for incidents, but the picture was hardly complete. Without the IDS/IPS and NAS/SAN logs, any intrusion outside of the common methods wouldn't even be a factor in the SIEM system's analysis.

We've now reached the close of 2014, two and a half years later. Have we evolved in our approach to SIEM data sources, or have the assumptions of 2012 stood the test of time? If they have, is it because these sources have been sufficient, or are there other factors preventing a deeper look?

  • I regularly (at least daily) look at alerts and/or reports for:

    IDS

    NetFlow

    Authentication

    Probably weekly I look at filtered reports for DNS and DHCP. Other people look at web and file server logs, mainly for operational reasons. The other stuff would be forensics related. We don't have a fully automated way to pull hashes or process audits from end stations today, but we're looking into it.

    I don't find firewall logs to be all that useful unless I'm troubleshooting a problem or looking at forensics; I find I can get better information out of NetFlow most of the time.

  • The question was related to primary data sources used. In 2012, it was mostly focused on firewall logging, web/API logging and database logging. I was interested in whether this was still the case or not... and in either case, why?

    You've got a much more comprehensive set of data sources, so I have to ask. How many of these are you using for your routine analysis versus keeping as historical data for forensics?

  • I'm not sure I fully understand your question, but from the perspective of doing security-oriented network forensics the data sources I use most frequently are:

    Web proxy logs

    NetFlow

    Authentication logs (AD, RADIUS, VPN, etc)

    DHCP

    DNS query logs

    IDS logs

    Web server and file server logs

    Firewall logs

    The things I frequently want but are hard to get:

    Executable file hashes and execution time

    Executable-to-socket correlation (i.e., the network flow TCP 1.1.1.1:30000 > 2.2.2.2:80 was started by foo.exe with SHA1 hash 0x1234etc at 2014-12-02T01:00:00)

  • Never, if you ask me, but I don't write the checks. Best to treat this as a continuing effort, given some time daily or weekly depending on the availability of your drones. Of course, that newsworthy event will always make these efforts ramp up.
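The executable-to-socket correlation wished for above can be approximated in log analysis if you have both process-audit events and flow records, even without native socket auditing. The sketch below is a toy: the records, hosts, hashes, and five-second window are invented, and matching "most recent process start on the same host" to a flow is a crude heuristic, not real attribution. (On Windows, Sysmon's network-connection events provide this linkage directly.)

```python
from datetime import datetime, timedelta

# Hypothetical pre-parsed records: in practice these would come from an
# endpoint audit agent (process starts) and a flow exporter (NetFlow).
process_events = [
    {"host": "ws01", "exe": "foo.exe", "sha1": "1234abcd",
     "start": datetime(2014, 12, 2, 0, 59, 58)},
]
flow_records = [
    {"host": "ws01", "src": "1.1.1.1:30000", "dst": "2.2.2.2:80",
     "start": datetime(2014, 12, 2, 1, 0, 0)},
]

def attribute_flows(flows, procs, window=timedelta(seconds=5)):
    """Pair each flow with the most recent process started on the same
    host just before the flow began (a crude stand-in for real socket
    auditing)."""
    results = []
    for flow in flows:
        candidates = [p for p in procs
                      if p["host"] == flow["host"]
                      and timedelta(0) <= flow["start"] - p["start"] <= window]
        best = max(candidates, key=lambda p: p["start"], default=None)
        results.append((flow, best))
    return results

for flow, proc in attribute_flows(flow_records, process_events):
    label = f'{proc["exe"]} (SHA1 {proc["sha1"]})' if proc else "unattributed"
    print(f'{flow["src"]} > {flow["dst"]} started by {label}')
```

Run against the sample data, this reports the flow as started by foo.exe with its hash, which is exactly the kind of answer the reply above says is frequently wanted but hard to get.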
