19 Replies Latest reply on Jun 1, 2012 5:14 PM by jswan

    Gee Whiz or Gee Whatever….

    Mrs. Y.

      This past week I saw a presentation delivered by a colleague on a very well-known log correlation product which had just been implemented in my division. I’ve been a longtime fan of the software’s capabilities and have a lot of respect for the person who deployed it. If you configure it properly, I’m even convinced you could warp the space-time continuum. However, right after hearing the talk, I started receiving some completely cryptic nuisance email alerts from said product for some hardware my team wasn’t even responsible for managing. My “gee whiz” admiration wore off about the time I had to delete the 50th email message. This led me to realize a certain truism of log correlation and event monitoring: the cost in implementation time, resources and ongoing maintenance should never outweigh the usefulness of the product deployment. My “gee whiz” experience had devolved into “gee whatever” as I begged someone to just turn off the barrage of additional, useless alerts. Why do you need a whole team of dedicated people to coddle and cajole a product into performing some rather mundane tasks? I don’t want to have to get a PhD in a product to make it work the way it’s supposed to.

        • Re: Gee Whiz or Gee Whatever….
          mdriskell

          This is the same sort of tricky pattern matching that I have to deal with in some of our SNMP trap monitoring.  For example, we use NetApp for our storage arrays and the SAN team is responsible for them.  By default all NetApp traps are sent to the SAN team.  However, we had to create custom rules to send alerts on specific volumes to specific teams.  It becomes a very big administrative nightmare to keep up with, and one minor error in the pattern match can cause traps to go to the wrong teams.

           

          It's much easier when the device owner is also the responsible party so I can simply send all alerts regarding a device to one destination but that is far from the case on much of the gear we monitor.
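To make the failure mode concrete, here is a minimal Python sketch of that kind of routing. The volume names, team addresses, and the trap text format are all hypothetical (real NetApp trap payloads look different); the point is only the ordered rules with a default owner, where one loose pattern sends traps to the wrong team.

```python
import re

# Hypothetical routing table: (pattern, destination).
# Order matters, because the first matching rule wins.
RULES = [
    (re.compile(r"\bvolume=vol_db\d+\b"), "dba-team@example.com"),
    (re.compile(r"\bvolume=vol_mail\d+\b"), "messaging-team@example.com"),
]
DEFAULT_DEST = "san-team@example.com"  # default owner of all traps


def route_trap(trap_text: str) -> str:
    """Return the destination for a trap, falling back to the default owner."""
    for pattern, dest in RULES:
        if pattern.search(trap_text):
            return dest
    return DEFAULT_DEST


print(route_trap("volume=vol_db01 nearly full"))          # matches the first rule
print(route_trap("volume=vol_db01_archive nearly full"))  # falls through to the default
```

The trailing `\b` is what keeps `vol_db01_archive` out of the DBA rule. Drop it and that volume silently goes to the wrong team, which is exactly the kind of minor error that is hard to spot in a long rule list.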

          • Re: Gee Whiz or Gee Whatever….
            jswan

            It seems like sometimes (a lot of times?) the marketechture in the logging world mixes up the different roles of alerting, reporting, monitoring, investigation/troubleshooting, and forensics. Too often, stuff from all the other categories gets dumped into alerting.

            • Re: Gee Whiz or Gee Whatever….
              byrona

              I read this post yesterday and went home last night and spent some time thinking about it, as well as some of your other posts regarding Syslog and Event correlation, and I ended up asking myself "what would I like to see in a log management system?" Here is what I came up with...

               

              DISCLAIMER: What you are about to read is not nearly a fully baked idea, more just me thinking out loud and I am completely willing to accept that these ideas may in fact be total rubbish!

               

              • Easy to install
                • I personally like virtual appliances as I would prefer to not have another piece of hardware to manage
                • If I need to pay a team of people to install it then I don't want it
              • Easy to maintain
                • It should include all of the necessary maintenance processes to keep from destroying itself if I don't go "manage" it every day
              • Pre-Defined correlation rules that I can understand
                • Since there are teams of people typically configuring these types of systems why can't they just create the correlation rules once and put them into some sort of centralized online database where I can use them
                • Give the rules names that make sense to a normal person such as "Active Directory Account Locked Out" instead of some cryptic log message
                • Group the pre-defined rules using something like tag groups
              • Support the major compliance requirements (HIPAA, SOX, etc.)
                • I want out-of-the-box correlation rules that will support the major compliance requirements so I can just go turn them on versus spending hours trying to understand what the compliance requirements are and then building rules to support them
              • Continuous improvement
                • The company that makes the product should continually improve and add new correlation rules to the online centralized database as part of my annual maintenance fee
                • Allow the community to add to this database of rules as well
              • Don't make the product crazy expensive
                • It's already difficult enough to get executives to see the value in this type of product
                • For some reason most log management products that claim to support the compliance requirements are crazy expensive
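As a rough illustration of the "named rules plus tag groups" idea in the list above, here is a small Python sketch. Everything in it is an assumption made for illustration: the `CorrelationRule` class, the log formats the patterns expect, and which tags go on which rule. It is not how any particular product stores its rules.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CorrelationRule:
    name: str      # human-readable name shown to the operator
    pattern: str   # regex applied to the raw log message (format is hypothetical)
    tags: set = field(default_factory=set)  # tag groups, e.g. compliance regimes


# Hypothetical shared catalog; the tag assignments are placeholders.
CATALOG = [
    CorrelationRule("Active Directory Account Locked Out", r"EventID=4740",
                    {"windows", "HIPAA", "SOX"}),
    CorrelationRule("Repeated Failed SSH Login", r"sshd.*Failed password",
                    {"linux"}),
]


def rules_for_tag(tag: str) -> list:
    """Select every rule in the catalog that carries the given tag."""
    return [rule for rule in CATALOG if tag in rule.tags]


def matching_rule_names(message: str, rules: list) -> list:
    """Return the friendly names of the rules that match a log message."""
    return [rule.name for rule in rules if re.search(rule.pattern, message)]
```

With something like this, "turn on the HIPAA rules" becomes `rules_for_tag("HIPAA")`, and what lands in the alert is the readable name rather than the raw log line.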

               

              These are just my ramblings of what I think would be nice to see in a product.  I am sure some if not all of this has been done or at least tried before; however, I can't say I have ever seen it done well.

                • Re: Gee Whiz or Gee Whatever….
                  Mrs. Y.

                  Can we turn your list into an RFC? Definitely time for the boiling frogs to jump out of the pot.

                  • Re: Gee Whiz or Gee Whatever….
                    mdriskell

                    I need to steal your Disclaimer and add it to everything I have ever said or will ever say in my lifetime.

                    • Re: Gee Whiz or Gee Whatever….
                      familyofcrowes

                      Wow, aren't you describing LEM?  Most of what you say here is why I am in the process of begging for money to buy LEM.  We have used LogLogic for years, but LEM is cheaper and is SOOOO much easier to use, set up, train and support....

                       

                      Can't wait to implement all your ideas!!

                       

                      Oh yea...  ditto on your disclaimer....  :-)

                        • Re: Gee Whiz or Gee Whatever….
                          byrona

                          Well, having only done a one-time 30-day demo of LEM back on an older version, I can't confidently say one way or the other.  Back then I had some fairly harsh critiques of the product that I posted on Thwack, which resulted in an hour-long phone conversation with SolarWinds where I provided feedback for their development of the new version, which just recently launched.  I am eager to get my hands on the new version and should point out how awesome it is that SolarWinds is so open and responsive to customer feedback.

                           

                          About 5 minutes ago our VP of Operations and Engineering had a conversation with me regarding a customer of ours with HIPAA requirements for whom we need to implement a logging solution, and since we use and love the SolarWinds products, LEM is first up to bat.  = )

                           

                          I will keep you posted!

                          • Re: Gee Whiz or Gee Whatever….
                            byrona

                            Just as a follow-up: I just finished evaluating LEM for the project that I am working on and have concluded that it will not work due to its significant lack of flexibility.  The system uses connectors to normalize data (which makes total sense), so even if you are sending syslogs into the system, it can only monitor them if it has a connector for the very specific log type.  In our case customers have lots of different log types that they want us to monitor and that they send via the syslog protocol, including custom applications, and LEM is unable to monitor these logs because there are no connectors for them.

                             

                            Since normalizing the data is important and thus the connectors are important, it seems like it would make sense to provide a development environment with a documented pattern matching schema so that you can write your own connectors for syslogs similar to how you can create your own pollers in Orion; this was feedback that I provided to them.
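To show what a user-written connector could look like in principle, here is a minimal Python sketch. It is not LEM's connector format, which this thread does not describe. The `orderapp` log type, its message layout, and the `normalize` helper are hypothetical; the idea is simply one documented pattern per log type that maps a raw syslog message onto common field names.

```python
import re
from typing import Optional

# Hypothetical connector definitions: one regex with named groups per log type.
# Adding support for a new custom application means adding one entry here.
CONNECTORS = {
    "orderapp": re.compile(
        r"orderapp: user=(?P<user>\S+) action=(?P<action>\S+) result=(?P<result>\S+)"
    ),
}


def normalize(message: str) -> Optional[dict]:
    """Return normalized fields for a known log type, or None if no connector matches."""
    for log_type, pattern in CONNECTORS.items():
        match = pattern.search(message)
        if match:
            event = match.groupdict()
            event["log_type"] = log_type
            return event
    return None  # no connector for this message, so it cannot be monitored
```

The `None` branch is the situation described above: the message arrives over syslog just fine, but without a matching connector nothing downstream can act on it.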

                              • Re: Gee Whiz or Gee Whatever….
                                jswan

                                We had the LEM product for a while before it was acquired by SolarWinds. The problem you discuss is one of the reasons we dropped it.

                                  • Re: Gee Whiz or Gee Whatever….
                                    byrona

                                    What did you end up using that would accomplish this successfully?

                                    • Re: Gee Whiz or Gee Whatever….
                                      jswan

                                      Right now we are using a combination of SolarWinds Kiwi Syslog and some custom Python scripting. We are looking at Splunk and a new open source framework called ELSA. However, the latter is Linux-only, which is an issue (unfortunately) for my management.

                                        • Re: Gee Whiz or Gee Whatever….
                                          byrona

                                          I am familiar with Splunk, but I have never talked to a person who uses it and actually likes it, so I have tried to avoid it.  I also found that the logging solution provided by ManageEngine probably does exactly what I want, but having previously evaluated their products I have found their support to be terrible and I would feel like I am turning to the dark side.  Aside from that, I am struggling to find something that meets my needs.

                                            • Re: Gee Whiz or Gee Whatever….
                                              jswan

                                              I have used Splunk on my laptop a lot with sample data, and it works well for my needs so far, but I'm a little worried about the steep learning curve.

                                               

                                              I used a different ManageEngine product for a while and it was so unstable and unsupportable that I wouldn't go back to them.

                                               

                                              Logging is one of those areas where there's a very low barrier to entry in the marketplace and everybody's needs are slightly different, so it seems like you always end up with something that's heavily customized. Splunk's extreme flexibility and extensibility is what makes it appealing to me, but it does seem to come with a learning curve.

                                               

                                              What kinds of requirements do you have in your environment?

                                                • Re: Gee Whiz or Gee Whatever….
                                                  familyofcrowes

                                                  We use LogLogic today, but the interface is fairly difficult. It serves the purpose, though. We have an LX2010 that is now going end of life.

                                                  • Re: Gee Whiz or Gee Whatever….
                                                    byrona

                                                    We are looking to support compliant environments (HIPAA, PCI, etc.).  We need something that at the very least will accept Syslog, though it would be nice to have the ability to capture other logs as well, such as IIS.  We need Syslog because it provides a standard way we can offer customers to send us their logs for any custom applications they may have.  I need to be able to write rules to match against those logs that trip alerts.  I also need reports that I can run for forensic purposes.
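One reason Syslog works as the common denominator is that every message carries a priority value, so even a custom application's logs can trip a basic alert before any application-specific rule exists. The Python sketch below decodes the `<PRI>` prefix of a BSD-style syslog line (facility = PRI // 8, severity = PRI % 8) and applies a severity threshold. The function names and the threshold of 3 are assumptions chosen for illustration.

```python
import re

# Severity names in numeric order, 0 (most severe) through 7.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]
PRI_RE = re.compile(r"^<(\d{1,3})>")


def parse_pri(line: str):
    """Return (facility, severity) from a leading <PRI>, or None if it is absent."""
    match = PRI_RE.match(line)
    if not match:
        return None
    return divmod(int(match.group(1)), 8)


def should_alert(line: str, max_severity: int = 3) -> bool:
    """Trip an alert for severity 'err' (3) or worse; the threshold is an assumption."""
    parsed = parse_pri(line)
    return parsed is not None and parsed[1] <= max_severity


sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
facility, severity = parse_pri(sample)
print(facility, SEVERITIES[severity], should_alert(sample))
```

Pattern-based rules for specific applications would sit on top of a check like this; the severity test is just the part that needs no knowledge of the message body.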

                                                     

                                                    We are a service provider and I am working on this as a log management service for a specific customer; however, we would like to design it in such a way that it's a repeatable service for more customers going forward.

                                                     

                                                    Kiwi isn't a terrible fit (I have it running in an eval lab now); however, I am not sure how it would scale with a lot of rules.  Kiwi also doesn't have any reporting to speak of for forensic purposes and the WebUI suffers from very poor response time. 

                                                     

                                                    I really wanted LEM to be the solution and am a bit bummed.

                                                      • Re: Gee Whiz or Gee Whatever….
                                                        jswan

                                                        Kiwi is very fast and rock solid, but its management interface leaves a lot to be desired. I only have a few alert rules in Kiwi and I run all my reports with home-brew Python scripting. I think it would be a pain to manage at large scale. Splunk will definitely do all the stuff you want. One problem is that it gets expensive fast for really large log volumes (it's priced per GB/day of indexed logs). You could do the alerting with syslog-ng if you are a Linux guy, but that would leave you with finding a way to do custom reporting.
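Since volume-based pricing depends on how much you log per day, it is worth measuring that before getting a quote. Here is a rough Python sketch of the kind of home-brew script that could do it. It assumes each archived log line starts with a `YYYY-MM-DD` date and treats a GB as 10**9 bytes; both are assumptions, so adjust them to your own log format and to however the vendor actually measures volume.

```python
from collections import defaultdict

GB = 10**9  # assumption: decimal gigabytes


def daily_volume_gb(lines):
    """Sum the raw size of log lines per day and return {date: size in GB}."""
    totals = defaultdict(int)
    for line in lines:
        day = line[:10]  # assumes every line starts with YYYY-MM-DD
        totals[day] += len(line.encode("utf-8"))
    return {day: size / GB for day, size in totals.items()}
```

Usage would be along the lines of `daily_volume_gb(open("archive.log", encoding="utf-8"))`, where `archive.log` is a placeholder file name; the peak day in the result is the number to compare against a licensing tier.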