Top 6 SANS Essential Categories of Log Reports 2013 in LEM

Question

SANS released an updated list of their critical log categories recently. Some good recommendations especially if you're new to log management.

The 6 Categories of Critical Log Information

How easily can these be achieved using LEM?

Can the LEM team include them in the LEM ready made filters as a new filter group for example?

Off topic: SANS also published their top 20 critical security controls last year. I think it's a good marketing opportunity for SolarWinds to show how their products can be used to achieve these controls.

http://www.sans.org/critical-security-controls/

FormerMember · Accepted Answer

There wasn't really a good way to answer this without taking notes on all of them, so here's some (relatively) quick thinking... let me know if you want to drill into any of these or need more specifics somewhere.

Authentication and Authorization Reports
- All login failures and successes by user, system, business unit
  - On the reporting side, LEM has a User Logon by User and a User Logon Failure by User Report. The normal Authentication Report includes both types of activity, and while there is a breakdown by event type, the report data is organized by time.
  - On the filtering side, you could create User Logon and User Logon Failure filters, then create a "by user" breakdown, or a "by source machine" chart breakdown. You can click on the bars/lines/datapoints in the charts to narrow things down, and you can send the same filter to historical search if you want to see what's going on beyond the scope of real-time (though reports might be better suited, depending).
  - If you're using Windows, you might want to pay attention to the Logon Type - logging on interactively is a lot more meaningful than a network logon. You can filter reports to exclude the less interesting logon types after you run them (or save the result as a custom report so that filter's always present), and you can do the same in real-time filters.
- Login attempts (successes, failures) to disabled/service/non-existing/default/suspended accounts
  - If you can get this going, it's well suited to alerts more than reports. The key to this as a report or alert is knowing what those accounts are. If you have a group of them in AD it makes it easy to create an alert/filter for that activity, but even better on the service/admin account side would be to have a naming convention for these types of accounts.
  - We have an out of the box "critical account logon failures" correlation rule, but it's not too bad to create your own. You'll likely want to create a user-defined group of account types/names to match, then build an alert that looks for those accounts in the DestinationAccount. Again, you might want to pay attention to the Logon Type, too.
  - Might be good as a scheduled search since it's a narrower view than the Giant Logon/Logon Failure activity above, which is better suited to reports.
- All logins after office hours / “off” hours
  - Again, this might work well as an alert. LEM has Time of Day Sets that you can use with filters/rules to only alert or see data when it occurs outside of business hours.
  - On the reporting side, if you've got the big report, you can filter it down to just show a more interesting time range. A scheduled search would work pretty well, too: the data comes out in a CSV and you can slice it however you want.
- User authentication failures by count of unique attempted systems:
  - This would be pretty easy in a chart in a filter so you could monitor it in real-time. I think part of the objective with some of these is to baseline how often failures occur, so maybe after you get that baseline then you could set up some alerting. If you wanted to do that historically, using a search is probably your best bet for data flexibility.
- VPN authentication and other remote access logins (success, failure)
  - On the reporting side, take any of the authentication reports and refine them by either the DetectionIP/Tool Alias to only be from your VPN, or by the username convention (if you have one). You can save that as a custom report to run on its own, too.
  - You could do the same thing in a real-time filter as long as you know similar criteria. If it's a username pattern you're looking for, use DestinationAccount.
- Privileged account access (successes, failures)
  - Probably good for alerts, if you have good separation of duties (i.e. people use an admin account), but would work really well as a saved search or real-time filter, too.
  - If you're using an AD group for admins, or a naming convention, this is easier. Reports doesn't integrate with AD groups, but a username filter pattern is pretty easy. Otherwise, even on the reporting side, your # of admins is likely relatively limited.
- Multiple login failures followed by success by same account
  - Good use of correlation rules - set a threshold of failures then look for a success:
    - (UserLogonFailure (5 in 5 minutes)) AND
    - UserLogon AND
    - UserLogonFailure.DestinationAccount = UserLogon.DestinationAccount
    - <send email>
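For anyone who wants to sanity-check the "failures followed by success" logic against exported data before building the rule, here's a rough Python sketch. It is not LEM rule syntax and not how LEM evaluates rules internally; the event dictionaries and the "type" and "time" keys are assumptions made up for illustration (only the event names and DestinationAccount come from the rule above). It reads the rule as "a success for an account that has at least 5 failures in the preceding 5 minutes".

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

FAILURE_THRESHOLD = 5            # "5 in 5 minutes" from the rule above
WINDOW = timedelta(minutes=5)


def failures_then_success(events, threshold=FAILURE_THRESHOLD, window=WINDOW):
    """Return (account, time) for each logon success that follows at least
    `threshold` failures for the same account within `window`.

    `events` is a list of dicts with hypothetical keys:
    "type", "DestinationAccount" and "time" (a datetime).
    """
    recent_failures = defaultdict(deque)   # account -> failure timestamps
    hits = []
    for event in sorted(events, key=lambda e: e["time"]):
        account = event["DestinationAccount"]
        failures = recent_failures[account]
        # Forget failures that have aged out of the window.
        while failures and event["time"] - failures[0] > window:
            failures.popleft()
        if event["type"] == "UserLogonFailure":
            failures.append(event["time"])
        elif event["type"] == "UserLogon":
            if len(failures) >= threshold:
                hits.append((account, event["time"]))
            failures.clear()               # a success resets the count
    return hits
```

Feeding it a scheduled-search CSV export (after mapping the columns to those keys) is one way to check how noisy a given threshold would be before turning on the email action.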
Systems and Data Change Reports
- Additions/changes/deletions to users, groups:
  - A lot of out of the box content here - look at the Change Management filters, rules, and reports. The Resource Configuration report tries to include ALL change management, the others are more broken down. The rules are pretty distinct, too.
- Additions of accounts to administrator / privileged groups
  - This would be a subset of the above activity, and probably well suited to alerts. If you have fixed admin groups or names (like Domain Admins, Administrators, etc), create a rule using the NewGroupMember event and look for a group name that matches one of those groups.
  - As a report, you could filter the above report, or use a saved search, or use a real-time filter. (I'd still go for the alert myself, people shouldn't just be added to admin groups willy nilly!)
- Password changes and resets – by users and by admins to users:
  - There's an out of the box rule that looks for failed password changes, but you could easily tweak it to include successes if you wanted to do this as an alert.
  - Alternatively, should be covered in the above change management reports (usually User Modify Attribute)
- Additions/changes/deletions to network services:
  - Here it depends on how good your logging tools are and what you need.
  - If you're auditing the registry using some kind of file integrity monitoring tool, you can see additions to the services area of the registry, which could be insightful. Probably good for alerts, but will also create events in the File/Object Auditing area (lots of File and Object Audit reports as well).
  - If you're using process auditing to audit for launched processes as the SANS link discusses, on servers you could create a list of "known good" processes and alert or generate an incident or suspicious activity event when a process not on that list is started. Harder on workstations and other systems, though.
  - USB-Defender will also pick up when someone plugs in a USB network device. These alerts will show up in the USB-Defender filter and related reports.
  - Some DLP solutions will also log unexpected activity here. Windows Firewall logging is another option, but it often tends to just create more noise than anything.
- Changes to system files – binaries, configurations:
  - Again, depends on your tools. Windows does have some native "oh my, something in my sensitive configuration directory just changed!" built-in, but it's pretty tame. If you're File Auditing changes to system32 or something, you might want to be very specific since Windows File Auditing can get relatively noisy.
  - This would probably be a good one to start off with auditing and using historical search/reporting, then once you've got the auditing narrowed down, you could trigger an alert if you see it occur. You might have to use thresholds, I'm not sure how easy it's going to be to tune this down to only a single event that can easily trigger an alert without being super noisy.
  - On the reporting side, most of this activity would also be in the File Audit reports.
- Changes to other key files:
  - This is similar to the above, but you'll want to audit things that might be system or service specific.
- Changes in file access permissions
  - More variations on the theme - you can use Windows File Auditing to only look for metadata/permissions-type changes.
- Application installs and updates (success, failure) by system, application, user
  - Only as good as the data here as well. MSI installs ARE logged by default to the Windows Event Logs, along with some events around Windows Updates, but you might need to do some clever process auditing to catch other installs since not everything uses an MSI. We have had customers use process auditing and alerts/filters/reports that are targeted just at "setup.exe" "install.exe" and similar filenames, often with wildcards ("setup*.exe" or "*.msi").
  - Both of these are in the Machine Audit reports, the sub-reports are relatively self explanatory (Process Start = launch, Software Install/Update = installs).
  - There are some out of the box rules for software installs/updates, meant primarily for servers that should have a maintenance window or shouldn't be installing software out of band (or are in scope for compliance where you REALLY have to track that stuff).
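To make the wildcard idea from the application installs item concrete, here's a small Python sketch of matching process names against installer-style patterns. It only illustrates the matching logic, not LEM filter syntax; "setup*.exe" and "*.msi" are the patterns mentioned above, while "install*.exe" and the sample paths are my own assumptions.

```python
from fnmatch import fnmatch
from pathlib import PureWindowsPath

# Lowercase patterns; "install*.exe" is an assumed extension of "install.exe".
INSTALLER_PATTERNS = ["setup*.exe", "install*.exe", "*.msi"]


def is_installer(process_path, patterns=INSTALLER_PATTERNS):
    """Return True if the file name of a Windows process path looks like an installer."""
    # Compare on the lowercased file name only, ignoring the directory.
    name = PureWindowsPath(process_path).name.lower()
    return any(fnmatch(name, pattern) for pattern in patterns)


print(is_installer(r"C:\Temp\Setup_x64.exe"))             # True
print(is_installer(r"C:\Windows\System32\notepad.exe"))   # False
```

Name matching like this is easy to evade by renaming a file, so treat it as a way to catch ordinary out-of-band installs rather than a control against someone trying to hide.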
Network Activity Reports
- NOTE: in reading through these, I'm afraid you could spend a lot of time here. Focus on what's most important and easiest to track first (known bad activity), then look at trends, then start looking into the rest of the data, or you'll probably get overwhelmed.
- All outbound connections from internal and DMZ systems by system, connection count, user, bandwidth, count of unique destinations:
  - LEM is pretty limited in its ability to do bandwidth, and the other parts of this will depend on how this activity is logged. For firewall and router ingress/egress data, you can use the Network Traffic Audit reports to break this down on the reports side - there are also some By Source and By Destination reports that are helpful.
  - This might be good to watch in real-time as well, you can make real-time charts that slice the data by source/destination/system and choose how the data is displayed/aggregated.
  - This could be a lot of data. You're going to be looking for trends and unusual activity rather than trying to consume it all.
- All outbound connections from internal and DMZ systems during "off" hours:
  - In a real-time filter, you can use Time of Day sets to refine the above just to business hours (or outside business hours). With reports, once you've got all the data, you can slice it however you want.
  - This data might be tough to consume, volume-wise. You need to be sure you know what you're looking for.
- Top largest file transfers (inbound, outbound) OR Top largest sessions by bytes transferred:
  - Unfortunately, LEM and the log data we capture aren't really well suited to this type of activity; it's more the bread and butter of NetFlow. We do have some basic flow analysis in LEM, but it's never going to be as good as a real flow analysis tool like NTA.
- Web file uploads to external sites:
  - Here you're as good as your data - if you're using a proxy server, you should be okay. Use the Web/File Transfer reports to look for different types of activity from different sources. It could be a lot of data, so you might want to look for suspicious transfers - servers sending data, create a list of known good sites and look for stuff outside of that, look for different file extensions. There's not a lot of
- All file downloads by content type (exe, dll, scr, upx, etc) and protocol (HTTP, IM, e-mail, etc):
  - Probably primarily going to be coming from your proxy server also. Create a list of extensions that you want to track, drop them in a User-Defined Group, and use them in a real-time filter, possibly even an alert (especially for servers).
  - On the reporting side, use the same reports, but refine them by their URL to only match those types as well. Would also work relatively well in a saved search since you really hope this is going to be a short list.
- Internal systems using many different protocols/ports:
  - On the server side, this might work by creating a list of known communication ports for your devices, then using firewall/router egress logs (blocked or allowed, though hopefully blocked, right!?) to match against what might not be on your list. You could build an alert with a threshold so that if too many ports are opened outside your list repeatedly then you get alerted.
  - On the reporting side of LEM, you can use the Network Traffic Audit by Source/Destination reports then filter by Source/Destination on top of that. (Core Traffic is generally stuff coming from firewalls, unless they are application-aware, then it might be in Application Traffic as well)
- Top internal systems as sources of multiple types of NIDS, NIPS or WAF Alerts
  - This is interesting - I see what they are going for, but I'm wondering what a good way would be to go about it. Maybe create a filter for only alert activity from those sources (by DetectionIP/ToolAlias), then slice that data by machine. As an alert, I could see setting a threshold, e.g. the same machine tripped 3 distinct sources of events within a relatively short time period. If you did that, you'd want to specify which sources of events are valid for that threshold; otherwise one machine could easily trip it across just the event logs, for example.
- VPN network activity by user name, total session bytes, count of sessions, usage of internal resources
  - This might depend on what you're using for a VPN, how good the logging is, and how that's intertwined with your firewall/router data. If you're using a web VPN, you might be able to tell what apps are launched, and that might be a pretty good picture of activity. If you're using a standard PPTP or IPSec tunnel, though, a lot of the traffic just might not be tracked. You might want to use this as a lens to look at the other activity first: if your VPN has a specific IP range, take any of the network traffic or other activity reports, filter it to just activity from those source IPs, and see what you can see. If it's useful in this context, THEN set up some specific reports or alerts.
- P2P use by internal systems
  - You might be able to pick this up with process auditing, too (launching skype.exe, for example - we've had people watch for the launch of IM processes and kill them with a rule in LEM).
  - From a traffic perspective, if you've got this activity being logged, you would want to filter the activity reports by port/protocol, or if you've got something more application aware, you might be able to use that.
  - If it really is something that's not allowed, you might be able to alert on usage, and hopefully it won't trip a ton of alerts since it's blocked (right?!).
- Wireless network activity
  - Depends on your log sources here - but on its face this could be similar to the VPN network activity above, where you might refine the other activity by IP address and see what you can see.
- Log volume trend over days
  - LEM has a few maintenance reports that can show you general volume. SANS is right on this one: some customers have used this to spot when their volume is suddenly way out of whack. You might want to watch this in real-time, too. Check out the Database Alert Statistics and Database Maintenance Report.
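If you pull daily event counts out of those maintenance reports, a simple baseline comparison is enough to flag days where volume is "way out of whack". The Python sketch below is only an illustration: the 7-day baseline, the 2x factor and the (day, count) input shape are assumptions of mine, not anything LEM produces or recommends.

```python
from statistics import mean


def volume_spikes(daily_counts, baseline_days=7, factor=2.0):
    """Flag days whose event count is far above or below the recent average.

    `daily_counts` is a list of (day_label, event_count) tuples in date order.
    A day is flagged when its count is more than `factor` times the mean of
    the previous `baseline_days` days, or less than that mean divided by
    `factor` (a sudden drop can mean a log source went quiet).
    """
    flagged = []
    for i in range(baseline_days, len(daily_counts)):
        baseline = mean(count for _, count in daily_counts[i - baseline_days:i])
        day, count = daily_counts[i]
        if baseline > 0 and (count > baseline * factor or count < baseline / factor):
            flagged.append((day, count, baseline))
    return flagged
```

Checking for drops as well as spikes is deliberate: an agent or device that stops logging is often as interesting as one that suddenly gets chatty.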
Resource Access Reports
- Access to resources on critical systems after office hours / “off” hours:
  - In all practicality, this may not differ a lot from looking at the authentication activity, unless you want to track specific applications, then it'd be up to what those applications log.
  - It's always a good idea to separate critical systems and monitor their after-hours activity specifically - processes launched after business hours, outside of maintenance windows, by users that shouldn't be accessing those systems directly. (You'd still see their logon activity first - and with LEM could log them off, but if you wanted to do it activity-based you could wait until they do something suspicious first).
- Top internal users blocked by proxy from accessing prohibited sites, malware sources, etc
  - These will show up in the Web Traffic reports, filters, searches, etc. All of the blocked activity will say "Blocked" in EventInfo, and the events should have the IP address and URL in them. If your proxy categorizes, you could do it by category as well.
  - Malware may come in as specific VirusTraffic events, which would most likely be in the Network Attack reports. You might test this by using the EICAR test file over the web to see where it's triggered and what the events look like.
- File, network share or resource access (success, failure):
  - This overlaps with some of the above reports where we're talking about file auditing, but the difference is that you'll want to audit files that are specific to your environment - payroll, sensitive data, whatever. As always, being SPECIFIC with file auditing will make it the most useful, otherwise you'll be buried in events that you can't even tell are useful.
  - Once you do get that data in, the File Audit reports, and related rules/filters/searches, will help refine the data.
- Top database users:
  - All of these database tracking reports really depend on what/how you're auditing your databases. LEM does include MSSQL Auditor and support for Oracle Auditing, but you might also have database monitoring tools that are better suited for this (or a more true database activity monitoring solution that LEM supports, so the data is in LEM).
  - In LEM, these events will likely be normalized into UserLogon events, so you could use the UserLogon by User report and then look to refine by the DetectionIP or ToolAlias for your database server(s)/monitoring tools.
- Summary of query types
  - Again, depends on your auditing tool - likely these events will be in the ObjectAudit area of LEM and subsequent related reports. It might take some intelligence about what your database is used for to make these useful.
- All privileged database user access
  - This might be better suited to alerting, or at least a pretty well refined report. If you use the authentication/logon reports, filter them down to just the database and just privileged users.
- All users executing INSERT, DELETE database commands
- All users executing CREATE, GRANT, schema changes on a database
  - There are some rules within LEM that look for this activity when you're using SQL Auditor, those would be a useful guideline.
- Summary of database backups
  - You MIGHT get this from audited activity - might depend on the tool.
- Top internal email addresses sending attachments to outside
- All emailed attachment content types, sizes, names
- All internal systems sending mail excluding known mail servers
  - These email ones - I'm not sure where you'll get this activity logged. The exchange transaction logs aren't really great. I don't know if a tool like LOGbinder for Exchange can help distill that info, but it might be promising.
- Log access summary:
  - Definitely audit LEM internal activity - use the Authentication - TriGeo Authentication (oops, need to make that LEM) Report, look for InternalUserLogon alerts, and look for InternalAudit activity.
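For the "top internal users blocked by proxy" item above, a scheduled search exported to CSV can be ranked with a few lines of Python. This is a sketch under assumptions: the column name "SourceMachine" and the sample rows are hypothetical, so check them against the header of your own export. The only detail taken from the answer above is that blocked activity says "Blocked" in EventInfo.

```python
import csv
import io
from collections import Counter


def top_blocked_sources(csv_text, source_field="SourceMachine", limit=10):
    """Count rows whose EventInfo mentions "Blocked", grouped by source.

    `csv_text` is the exported search result as text; `source_field` is an
    assumed column name for the internal system or user.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        row[source_field]
        for row in reader
        if "blocked" in (row.get("EventInfo") or "").lower()
    )
    return counts.most_common(limit)


sample = (
    "EventInfo,SourceMachine,URL\n"
    "Blocked by policy,10.0.0.5,http://bad.example/a\n"
    "Allowed,10.0.0.5,http://ok.example/\n"
    "Blocked by policy,10.0.0.7,http://bad.example/b\n"
    "Blocked by policy,10.0.0.5,http://bad.example/c\n"
)
print(top_blocked_sources(sample))   # [('10.0.0.5', 2), ('10.0.0.7', 1)]
```

If your proxy categorizes sites, grouping on a category column instead of the source works the same way.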
Malware Activity Reports
- Malware detection trends with outcomes
- Detect-only events from anti-virus tools
  - Most of the virus activity will be in the Malicious Code reports, but these are usually good candidates for alerting, too, especially malware that was detected but left alone (we have an out of the box rule that shows how to look for just left-alone malicious code, not all viruses, though sometimes you might want to know that a particular user has a fondness for viruses).
- All anti-virus protection failures
  - You'll probably want to track the services being stopped and/or processes being exited (if you're process auditing), along with the updates (which will come in as SoftwareUpdate, in the Machine Audit reports).
  - On the real-time monitoring side, you could also set up a filter and some charts for all AV activity by the ToolAlias, which would let you then break it down by the type of event.
- Internal connections to known malware IP addresses
  - If you do have a public malware blacklist, you can import it as a user-defined group via a CSV file, then probably hook that up to an alert, as long as you trust the list for low/no false positives. (You could use LEM rules to infer a security alert or create an incident instead of setting up an email alert straight up, too, then comb through that much smaller list.) UDGs can also be used in real-time filters and searches.
- Least common malware types
  - Once you've got the other reports/filters/searches, hopefully you'll be able to comb through them, whether it's top 10 or bottom 10. Amusingly, once I started monitoring and asking people where they downloaded viruses, viruses started getting less common and when they happened people would start volunteering information - people actually started policing themselves out of fear that I would know anyway and ask them.
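The blacklist idea under "Internal connections to known malware IP addresses" boils down to a set lookup. Here's a hedged Python sketch of that logic for checking exported traffic events offline; it is not how LEM imports or evaluates user-defined groups, and the "DestinationIP" and "SourceMachine" keys, the one-address-per-row CSV layout and the sample addresses (documentation ranges) are all assumptions.

```python
import csv
import io
import ipaddress


def load_blacklist(csv_text):
    """Parse a CSV with one IP address in the first column of each row.

    Blank rows and rows that are not valid addresses (such as a header)
    are skipped.
    """
    blacklist = set()
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip():
            continue
        try:
            blacklist.add(ipaddress.ip_address(row[0].strip()))
        except ValueError:
            continue   # header line or malformed entry
    return blacklist


def connections_to_blacklist(events, blacklist):
    """Return the events whose destination address is on the blacklist."""
    return [
        event for event in events
        if ipaddress.ip_address(event["DestinationIP"]) in blacklist
    ]


blacklist = load_blacklist("ip\n203.0.113.9\n\n198.51.100.23\n")
events = [
    {"SourceMachine": "ws1", "DestinationIP": "203.0.113.9"},
    {"SourceMachine": "ws2", "DestinationIP": "192.0.2.1"},
]
print([e["SourceMachine"] for e in connections_to_blacklist(events, blacklist)])   # ['ws1']
```

Running a list through something like this against a day of historical data first is a cheap way to gauge false positives before you hook it up to an email alert.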
Failure and Critical Error Reports
- Critical errors by system, application, business unit
  - Generally this activity will come through as SystemStatus or ServiceWarning alerts, which are going to be in the Machine Audit reports (you can find them by their name pretty easily). There are also some out of the box filters that look for generic Windows errors. You might want to apply some internal logic to this, though, and make sure that you're monitoring the critical events that are important to you.
- System and application crashes, shutdowns, restarts
  - Application crashes might be harder to detect (sometimes there's residue in the windows event log), but system shutdowns and restarts are relatively clear.
  - Be sure also to track stuff like services stopping, which can indicate crashes without a crash report or other telltale sign.
  - Both of those would also be in the Machine Audit reports, and especially in the case of servers would make really good alerts and not just reports to filter through.
- Backup failures
  - Depends on the tool - LEM does integrate with some backup logs and will log failures.
- Capacity / limit exhaustion events for memory, disk, CPU and other system resources
  - Here you have a couple of choices - if you set up performance counters in Windows, you can have them generate event log events, which LEM will pick up and can alert on. Windows also has some built-in disk thresholds for automatic event log alerting that the disk is getting full (we have a corresponding out of the box rule on that one). Your other choice is to use a performance monitoring system like Server & Application Monitor and send alerts from there to LEM, or use that system to give you alerts THEN investigate related log data.
