18 Replies Latest reply on Oct 18, 2012 10:00 PM by byrona

    Stateful Log Alerts?

    byrona

      I am curious if it's possible to use LEM to create state based events out of logs which generally are not state based.

       

      As an example...

       

      I want a log that comes in to trigger an alert.  I want that alert to keep sending alert emails every 30 minutes until the alert is re-armed.  I want a different defined log to re-arm the alert.

       

      In my specific case these are Fortinet VIP out-of-pool events.  I need to trip an alert when a log comes through indicating an IP has dropped out of the pool and re-arm the alert when the log comes through indicating the IP has been re-added.

       

      Is this currently possible in LEM?

        • Re: Stateful Log Alerts?
          byrona

          I am currently engaged in a project that requires me to find a solution for this if at all possible so I would really love some feedback here.

          • Re: Stateful Log Alerts?
            nicole pauls

            We do have the concept of a state variable. You would probably need:

            1. A rule that detects the event and sets the variable the first time it sees it to "ON" (the "action" of the rule would be to modify the variable)
            2. A rule that detects the event you want to be notified on that checks to see if the variable is "ON" and sends email if it matches (the action of this rule is to email)
            3. A rule that detects the closure event and sets the variable to "OFF" (the action of the rule would be to modify the variable)

             

            You might be able to modify the first two to be one rule, depends on whether there are 2 events or 3 or how you want to visualize it. You can create state variables from Build > Groups (they can contain string or number variables). The action you want is "Modify State Variable" under Actions.

             

            I am pretty sure this will do what you need, but if I need to set up an example to make it clearer, let me know.
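To make the three-rule idea above concrete, here is a minimal Python sketch of the logic. It is only a model, not LEM rule syntax (LEM rules are built in the console), and the event names "trigger", "notify" and "closure" are hypothetical placeholders for whatever alerts you match on.

```python
class StateVariableRules:
    """Toy model of three rules sharing one state variable (not LEM syntax)."""

    def __init__(self):
        self.state = "OFF"   # the state variable
        self.emails = []     # stand-in for the email action

    def handle(self, event):
        if event == "trigger":
            # Rule 1: detect the event and set the variable to "ON".
            self.state = "ON"
        elif event == "notify" and self.state == "ON":
            # Rule 2: only send email while the variable is "ON".
            self.emails.append(event)
        elif event == "closure":
            # Rule 3: detect the closure event and set the variable to "OFF".
            self.state = "OFF"


rules = StateVariableRules()
for event in ["notify", "trigger", "notify", "closure", "notify"]:
    rules.handle(event)
print(rules.emails)  # ['notify']
```

Only the "notify" event that arrives between "trigger" and "closure" produces an email; the ones before and after are ignored because the variable is "OFF".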

              • Re: Stateful Log Alerts?
                byrona

                That is awesome!

                 

                So to take it a step further is it possible to have some sort of action take place at a specified interval as long as the state variable is in a specified state?

                 

                For example: Email me every 30 minutes as long as state variable = down?

                  • Re: Stateful Log Alerts?
                    nicole pauls

                    Hmm, rules have to be triggered by an event (as it stands), so you'd have to have some kind of event that comes in to fire it off of.

                     

                    Not exists rules do let you do something like:

                    If you see the event get generated

                    and you don't see this other event that should come after it within 30 minutes fire an email.
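As a rough illustration of the "not exists" idea above, this Python sketch flags a starting event that is not followed by an expected event within a window. It is an assumption-laden model, not LEM's implementation; the event names "start" and "followup" and the tuple format are made up for the example.

```python
def missing_followups(events, window=1800):
    """Return timestamps of "start" events with no "followup" within `window` seconds.

    `events` is a list of (timestamp_seconds, name) tuples sorted by time.
    """
    alerts = []
    for i, (t, name) in enumerate(events):
        if name != "start":
            continue
        followed = any(
            other == "followup" and t < u <= t + window
            for u, other in events[i + 1:]
        )
        if not followed:
            alerts.append(t)
    return alerts


events = [(0, "start"), (600, "followup"), (4000, "start")]
print(missing_followups(events))  # [4000]
```

Note that each unanswered "start" yields a single alert, which is why this pattern gives one notification after 30 minutes rather than a reminder every 30 minutes.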

                     

                    Thresholds do let you re-infer, but right now you can't re-infer on a threshold of one event, and it's not quite the same thing. If it were multiple events, threshold re-inferences would let you do:

                    If you see 2 of these in 30 seconds, fire the actions associated with the rule

                    keep firing that alert every 30 minutes if the condition persists

                     

                    It's not exactly the same thing because it's the event condition that is persisting (i.e. being above threshold), rather than the state variable.
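To show why a threshold follows the events rather than a stored state, here is a small Python sketch of a sliding-window threshold ("N events within W seconds"). It is a generic model under my own assumptions and does not reproduce LEM's re-inference behavior; the function name and parameters are hypothetical.

```python
from collections import deque


def threshold_hits(timestamps, count=2, window=30):
    """Return the timestamps at which at least `count` events fall within `window` seconds."""
    recent = deque()
    hits = []
    for t in timestamps:
        recent.append(t)
        # Drop events that have aged out of the window.
        while recent and t - recent[0] > window:
            recent.popleft()
        if len(recent) >= count:
            hits.append(t)
    return hits


print(threshold_hits([0, 10, 100, 200, 215, 220]))  # [10, 215, 220]
```

The condition is only true while events keep arriving close together. A single "down" log followed by silence never satisfies it, which is the gap described above.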

                     

                    One way would be to pick an alert that is fairly common and use it as the trigger, but you'd want to be careful that it isn't something SUPER high volume: every time it occurs it has to be parsed, and at scale that sometimes bogs things down.

                      • Re: Stateful Log Alerts?
                        byrona

                        Hrm...

                         

                        Here are the details of my specific use case...

                         

                        I receive a log that indicates a system has been dropped out of a VIP on a Fortinet.  That then triggers a state variable to be "down".  As long as the state variable is "down" I want an email every 30 minutes reminding me that it's down.  Eventually I will receive a log that indicates the system has been re-added to the VIP at which point the state variable will be set to "up" and I will stop receiving emails.

                         

                        This is what I would like, would it be possible with LEM... or anything else you are aware of?

                         

                        P.S.  We just recently purchased LEM so I don't have much practical experience with it yet so these may be simple things I don't yet know about the product.  I also realize that this use case is a bit crazy; however it's what I have been tasked with. 

                          • Re: Stateful Log Alerts?
                            byrona

                            I called support and was informed that what I am trying to accomplish isn't possible.  Specifically the re-alerting every 30 minutes.  This is because LEM is designed to notify only on changes or events that occur, not on a repetitive basis due to something being in a specific state.

                              • Re: Stateful Log Alerts?
                                nicole pauls

                                Well, the state variable lets you store the state, but a rule can only fire based on an alert happening. So, it's the "every 30 minutes" part that makes this difficult, without some kind of alert that happens every 30 minutes. We're looking at ways to make scheduled nDepth searches and alerts triggered from them, which would make this possible in another direction (based on a search, rather than real time rules, which I think will accomplish this goal - you'll see this in other systems that can do scheduled searching/alerting with fairly good query language). You can also store time in a state variable, but you can't do math in the rule (only greater than/less than), which also doesn't help.

                                 

                                I wonder if there's a creative way to solve this problem, though, with a mix of alerts that do regularly fire and the state variables mentioned above.

                                 

                                I'll chew on this and circulate it to the rules guys to see if they have any ideas.

                                 

                                PS: thanks for following up the results of your support call.

                                  • Re: Stateful Log Alerts?
                                    nicole pauls

                                    Hey Byron, we think we've got a way to accomplish this, but would like to test it. In order to get as real-world as possible, can you provide copies of the alerts and/or syslog entries that should set and clear the condition?

                                     

                                    Thanks!

                                      • Re: Stateful Log Alerts?
                                        byrona

                                        I would be happy to provide these; however, to protect the guilty I don't want to post them here in a public place.  Do you have a method that I could use to send them to you directly?  It doesn't look like you have Direct Messaging turned on for your Thwack profile.

                                          • Re: Stateful Log Alerts?
                                            nicole pauls

                                            The word from the thwack team is that we have to be friends in order to exchange DMs. The other way to do this is to create a "Private Discussion" between just us (Create > Discussion > Private Discussion). You can also email me at firstname dot lastname @solarwinds.com too.

                                              • Re: Stateful Log Alerts?
                                                byrona

                                                I have sent you an email with the logs, thanks for looking into this for me!

                                                • Re: Stateful Log Alerts?
                                                  byrona

                                                  Did you have a chance to use the logs I sent you to test the idea you and your team had come up with?

                                                    • Re: Stateful Log Alerts?
                                                      nicole pauls

                                                      Good news: we've tested our theory and confirmed it works. We're cleaning it up and getting copies of everything so you can see and implement them.

                                                        • Re: Stateful Log Alerts?
                                                          byrona

                                                          Sweet, I can't wait to see what you have for me!

                                                            • Re: Stateful Log Alerts?
                                                              nicole pauls

                                                              We built this in such a way that you can test it end-to-end without relying on your syslog events. That made it a lot easier for us to trigger, and it means you can get the flow down without figuring out how to send syslog messages on demand, then create a version with your events that should behave similarly.

                                                               

                                                              The Rules

                                                              1. "Step 1 - Trigger BEGIN Event": This is a dummy rule used to trigger the watch state, only used for testing. If you want to test this without using real syslog data, this rule generates an alert that causes the chain to begin.
                                                              2. "Step 2 - Rule 1 (Find out if device is down and BEGIN watching)": This is the rule that detects the down state and starts the timer. This is the one you'll want to use and put your own syslog data info in (whatever the name/type of alert and any other criteria is for the "we're down, start notifying me every 30 minutes" alert).
                                                              3. "Step 3 - Rule 2 (Keep watching and send mail on proper interval)": This rule is just used to fire every 30 minutes as long as you're still in the "watch" state.
                                                              4. "Step 4 - Rule 3 (Find out if device is UP and STOP watching)": This is the rule that detects the up state and stops the timer. This is the other one you'll want to use and put your own syslog data info in (whatever the name/type of alert and any other criteria is for the "we're back up, stop notifying me" alert). You might also want this one to notify you and let you know the condition has been cleared.
                                                              5. "Step 5 - Trigger STOP Event": This is another dummy rule used to trigger the stop watching state, only used for testing. If you want to test this without using real syslog data, this rule generates an alert that causes the chain to stop.
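The chain of rules above can be sketched as a small event-loop simulation in Python. This is a hedged model of the flow (Rule 1 starts watching, Rule 2 re-triggers itself on an interval, Rule 3 stops watching), not the exported LEM rules; the event names, the `horizon` cutoff and the `generation` counter are my own assumptions added to keep the simulation finite and well-behaved.

```python
import heapq


def simulate(external_events, interval=30, horizon=240):
    """Return the minutes at which reminder emails would be sent.

    `external_events` is a list of (minute, "DOWN" | "UP") tuples.
    """
    queue = [(minute, kind, 0) for minute, kind in external_events]
    heapq.heapify(queue)
    watching = False   # plays the role of the state variable
    generation = 0     # lets ticks from an earlier outage be ignored
    emails = []
    while queue:
        minute, kind, gen = heapq.heappop(queue)
        if kind == "DOWN" and not watching:
            # Rule 1: begin watching and schedule the first reminder.
            watching = True
            generation += 1
            if minute + interval <= horizon:
                heapq.heappush(queue, (minute + interval, "TICK", generation))
        elif kind == "TICK" and watching and gen == generation:
            # Rule 2: send mail, then re-trigger itself for the next interval.
            emails.append(minute)
            if minute + interval <= horizon:
                heapq.heappush(queue, (minute + interval, "TICK", generation))
        elif kind == "UP":
            # Rule 3: stop watching; any pending tick is then ignored.
            watching = False
    return emails


print(simulate([(0, "DOWN"), (100, "UP")]))  # [30, 60, 90]
```

Rule 2 feeding itself is the intentional loop; the state check is what lets the "UP" event break it.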

                                                               

                                                              Helpful Filters

                                                              As a part of testing, our QA team found these filters helpful. They are built around the rules above, but help you track EVERY STEP of the way. They are pretty self explanatory:

                                                              1. My Device is Down (matches the example's "down" alert)
                                                              2. Rule 1 Activity (shows activity from the rule marked 1 above)
                                                              3. Rule 2 Activity (... 2 above)
                                                              4. Send Email (shows email activity)
                                                              5. Rule 3 Activity (shows activity from the rule marked 3 above)
                                                              6. My Device is Up (matches the example's "up" alert)

                                                               

                                                              Testing with the Example

                                                              1. Create a state variable. The example uses one called "Event" with a single Text variable called "NO". It will be set to "NO" when the rule should not fire and to "YES" when the rule should fire. You can of course create your own names that might make more sense than setting a variable called NO to YES (which took me a few minutes to figure out), but you'll need to keep them straight through the example. (We would provide the group for you to import, but there's an issue preventing group import at the moment.)
                                                              2. Import rules (Gear on the right side > Import - you can ctrl+select to import them all at once).
                                                              3. Edit the imported rules
                                                                1. Step 1: Just needs to be enabled and saved.
                                                                2. Step 2 - Rule 1: Select the state variable you created, even if it's called Event you'll need to re-select it. Into the "NO" field (if you're using the example) drag a Text Constant and type the text "YES". Enable and save.
                                                                3. Step 3 - Rule 2: Select the state variable you created, even if it's called Event, and the field you created, even if it's called "NO", and drag it over the one in the correlations box to replace the "identical" one. (This has to be done because the one that's there is just a placeholder; that's why the rule has an "Error".) Configure the Email Alert to send to whatever user you want it to send to, or if you are content with the filters, remove the email message action. Enable and save. Click yes on the warning. It warns about the possibility of an infinite loop, which is kind of our intent here; the state variable will cancel it out.
                                                                4. Step 4 - Rule 3: Select the state variable you created, even if it's called Event you'll need to re-select it. Into the "NO" field (if you're using the example) drag a Text Constant and type the text "NO" (anything but YES, really). Enable and save.
                                                                5. Step 5: Just needs to be enabled and saved.
                                                              4. Import the filters if you want to track, or create your own (import for filters is on the Filter Group/left side gear > Import).
                                                              5. To trigger the "BEGIN" event, edit a filter as the "admin" user and click save.
                                                              6. Enjoy (it will take 2 minutes to enter the loop, then fire every minute in the example)
                                                              7. To trigger the "END" event, run an nDepth search as the "admin" user (doesn't matter what or how long, just go to Explore>nDepth and make sure a search runs).

                                                               

                                                              Modifying for your Environment

                                                              You need a copy of the 3 rules (step 2, 3, 4, marked rule 1, 2, 3). Modify rule 1 to match the initial event you want to detect, modify rule 2 to be the interval you want to notify on using the response window and correlation time, and modify rule 3 to be the cancellation event you want to detect.

                                                               

                                                               

                                                              And, may the force be with you.

                                                    • Re: Stateful Log Alerts?
                                                      byrona

                                                      I should also note that I opened a Feature Request for the functionality that I need HERE and I opened a thread to better understand the use case for State Variables HERE.  Both of these threads were based on my understanding from this thread.