1 Reply Latest reply on Nov 11, 2015 9:19 AM by cscoengineer

    "Remove the White Noise"

    smiffy85

      Hi all,

       

      I have recently started looking at our LEM installation for the Security Team. They have done some basic configuration and logging, including deploying the agent and seeing what LEM can do. They have reached a point where they are being overwhelmed by the data being recorded from the 10 nodes with the agent installed.

       

      We need to find a way to remove this white noise so that they can see the picture much more clearly, while retaining information for a specified period of time without compromising performance.

       

      What we want:

       

      Via GPO, we reduce which logs are generated on the servers

      Servers generate their logs

      Using LEM we want to take a subsection of these (minus white noise)

      Within LEM, we want to see only the events that matter to us

       

      As it stands, we are generating millions of events in seconds, and on only 10 nodes.

       

      Performance isn't great now, and once we add the 500 nodes we will be monitoring (if not 1,500), the LEM appliance will slowly grind to a halt.

       

      This may just be a training issue and not a limitation.

       

      If any of you experts out there are able to help, I would really appreciate it. They are already talking about looking at other SIEM options, despite my protests.

       

      LEM 6.1 is installed and will be upgraded once we can fix the above issues.

        • Re: "Remove the White Noise"
          cscoengineer

          To get an accurate count of events, look at the All Events filter, then send it to nDepth. This will give you the number of events per ten minutes. Multiply it by six to get events per hour. According to the LEM scalability guide, with the default of 2 vCPUs and 8 GB of memory, LEM can handle 1.3 million events per hour. If your number is higher than that, I would bump the vCPUs up to 4 and the memory to 16 GB. There are other metrics to consider, such as the size of the temp file and the queuing on the LEM, but those only come into play if your performance is really slow.
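
          As a rough illustration of that arithmetic, here is a minimal Python sketch. The 1.3 million per hour figure is the one quoted above for the default 2 vCPU / 8 GB appliance; the sample ten-minute count is made up, and the function names are only for illustration.

```python
# Capacity quoted above for the default 2 vCPU / 8 GB LEM appliance.
LEM_DEFAULT_CAPACITY_PER_HOUR = 1_300_000


def events_per_hour(events_per_ten_minutes: int) -> int:
    """Scale a ten-minute nDepth count to an hourly rate (six windows per hour)."""
    return events_per_ten_minutes * 6


def needs_more_resources(
    events_per_ten_minutes: int,
    capacity_per_hour: int = LEM_DEFAULT_CAPACITY_PER_HOUR,
) -> bool:
    """True when the hourly rate exceeds the quoted capacity."""
    return events_per_hour(events_per_ten_minutes) > capacity_per_hour


if __name__ == "__main__":
    sample_count = 250_000  # hypothetical count for one ten-minute window
    print(events_per_hour(sample_count), needs_more_resources(sample_count))
```

          If the check comes back true, that is the point at which bumping up the vCPUs and memory is worth considering.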

           

          Did you apply the recommended default audit policies and the domain controller policies? These are documented in the admin guide. Once the domain policy is applied, run "auditpol /get /category:*" on the monitored server to see the actual audit policy in place, and verify that it matches the policy you expect group policy to push. I have seen many situations where the Windows infrastructure was not designed properly and servers would randomly fail to receive the GPOs.
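
          If you want to automate that comparison, something like the Python sketch below could work. It assumes the usual English-locale auditpol layout (subcategory name, then two or more spaces, then the setting) and that you have saved the command output as text. The sample output and the expected policy here are made up for illustration and are not the recommended policy from the admin guide.

```python
import re

# Setting strings as they typically appear in English-locale auditpol output (assumption).
KNOWN_SETTINGS = {"No Auditing", "Success", "Failure", "Success and Failure"}


def parse_auditpol(text: str) -> dict[str, str]:
    """Map each subcategory to its setting; header and category lines are skipped."""
    policy = {}
    for line in text.splitlines():
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) == 2 and parts[1] in KNOWN_SETTINGS:
            policy[parts[0]] = parts[1]
    return policy


def policy_mismatches(expected: dict[str, str], actual: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {subcategory: (expected, actual)} for every entry that differs or is missing."""
    return {
        name: (want, actual.get(name, "<missing>"))
        for name, want in expected.items()
        if actual.get(name) != want
    }


# Illustrative sample only, not captured from a real server.
SAMPLE_OUTPUT = """\
System audit policy
Category/Subcategory                      Setting
Logon/Logoff
  Logon                                   Success and Failure
  Logoff                                  Success
Object Access
  Filtering Platform Connection           Success
"""

# Hypothetical expected policy; substitute the one your GPO is meant to push.
EXPECTED = {
    "Logon": "Success and Failure",
    "Logoff": "Success",
    "Filtering Platform Connection": "No Auditing",
}

if __name__ == "__main__":
    for name, (want, got) in policy_mismatches(EXPECTED, parse_auditpol(SAMPLE_OUTPUT)).items():
        print(f"{name}: expected {want!r}, found {got!r}")
```

          Any server that reports mismatches is a candidate for one of those GPO delivery problems.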

           

          If you have WFP (Windows Filtering Platform) auditing turned on, turn it off. It generates lots of noise and can even crash the LEM. You can tune it out in LEM, but the best way is to eliminate it at the source.

          http://knowledgebase.solarwinds.com/kb/questions/3263/LEM+Manager+crashes+after+receiving+a+high+number+of+alerts+from+Windows+7+or+Windows+Server+2008

          http://knowledgebase.solarwinds.com/kb/questions/6128/Tuning+out+Windows+Filtering+Platform+on+LEM+and+on+Windows+Agent

           

          Previously in a 10 node situation (3 DCs and 7 regular servers), I saw about 120k events per hour.

           

          Look at the types of events coming in, and attack them starting from the highest occurrence. For example, if the highest number of events involve WFP, eliminate that at the source. If the highest is from a firewall, check whether its syslog is set to debug and set it to the appropriate level (notification).
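
          To make the "highest occurrence first" idea concrete, here is a small Python sketch that ranks event types by count. It assumes you already have a list of event type names from some export; the names and counts below are invented for illustration, and how you get the list out of LEM is not covered here.

```python
from collections import Counter
from typing import Iterable


def rank_event_types(event_types: Iterable[str], top: int = 5) -> list[tuple[str, int]]:
    """Return the `top` most frequent event types, noisiest first."""
    return Counter(event_types).most_common(top)


if __name__ == "__main__":
    # Hypothetical event type names and volumes, for illustration only.
    sample = (
        ["WFP connection"] * 900
        + ["Firewall debug"] * 300
        + ["User logon"] * 40
        + ["Policy change"] * 2
    )
    for name, count in rank_event_types(sample, top=3):
        print(f"{count:>6}  {name}")
```

          Whatever lands at the top of that list is the first thing to tune out or turn down at the source.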

           

          That should get you started and hopefully improve performance.   Now getting Reports to work in a timely manner - that's a different story. 

           

          Thanks

          Amit Shah

          Field Engineer II

          Loop1 Systems