3 Replies Latest reply on Nov 21, 2013 11:35 AM by zackm

    Unreachable Status with Advanced Alert Rules

    Mike Lomax

      I am having trouble making a certain set of conditions work and wanted to see if someone might have a suggestion.

       

      So the scenario is that I am using a custom Advanced Alert Rule, not the delivered alert rule, to monitor up/down status of a node.  The trigger conditions are working fine but I am having an issue with the reset conditions.  Rather than just checking for the Node Status to be Up, I am checking the following:

       

      Reset Alert when all of the following apply:

      Node Status is not equal to Unknown

      Node Status is not equal to Down

      Node Status is not equal to Warning

       

      I was doing this because I wanted the alert to reset if the Node Status was Up, Unmanaged, or Unreachable.  This was working fine.  However, I have now decided that I don't want triggered alerts to reset if the Node Status is Unreachable.

       

      To provide an example, let's consider a dependency with Node01 as the parent and Node02 as the child.  To start, Node02 is Down but Node01 is Up.  Then Node01 goes Down and Node02 goes to Unreachable.  I don't want Node02's triggered alert to reset at that point, which it does under the above conditions.  Instead I want the alert on Node02 to remain in force.

       

      In other words, when Node01 comes back up, I still want Node02's alert to be triggered, and then the situation with Node02 can be re-evaluated on the next poll.  But I don't want Node02's already-triggered alert to reset just because Node01, its dependency parent, joined the list of down nodes.

       

      One would think that I could just add Unreachable to the list above making the new list read:

       

      Reset Alert when all of the following apply:

      Node Status is not equal to Unknown

      Node Status is not equal to Down

      Node Status is not equal to Warning

      Node Status is not equal to Unreachable

       

      However, when I configure the conditions this way, the alert for Node01 does not reset when it comes back up.  And when Node02, which is still down at that point, later comes back up as well, its alert does not reset either.
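
      Written out as plain boolean logic, the two lists would be expected to behave as follows.  This is only a sketch in Python, not how the alert engine evaluates rules; the function names and the use of plain strings for statuses are assumptions made for illustration.

```python
# Toy model of "Reset Alert when all of the following apply" built from
# "is not equal to" conditions. Not SolarWinds code; names are illustrative.

STATUSES = ["Up", "Down", "Warning", "Unknown", "Unmanaged", "Unreachable"]

ORIGINAL_EXCLUDED = {"Unknown", "Down", "Warning"}
EXTENDED_EXCLUDED = ORIGINAL_EXCLUDED | {"Unreachable"}


def should_reset(status, excluded):
    """True when the status differs from every excluded value."""
    return all(status != value for value in excluded)


def resetting_statuses(excluded):
    """List the known statuses that would satisfy the reset conditions."""
    return [s for s in STATUSES if should_reset(s, excluded)]


original_resets = resetting_statuses(ORIGINAL_EXCLUDED)
extended_resets = resetting_statuses(EXTENDED_EXCLUDED)
print(original_resets)  # ['Up', 'Unmanaged', 'Unreachable']
print(extended_resets)  # ['Up', 'Unmanaged']
```

      On paper, Up still satisfies the extended list, so the conditions by themselves do not account for Node01's alert failing to reset.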

       

      Can anyone explain to me why that is?

        • Re: Unreachable Status with Advanced Alert Rules
          zackm

          Off the top of my head:

           

          Reset Alert when ALL of the following apply:

               Node Status is equal to UP

           

           

          I'm not sure I follow the logic here. If you look for an UP status, when would you not get a valid alert fire/reset condition?

            • Re: Unreachable Status with Advanced Alert Rules
              Mike Lomax

              Thanks for the reply @zackm. You do raise a good point.  The reason for my logic was that I did not want to automatically exclude any new statuses that SolarWinds might add in the future.  For example, when Unreachable was first introduced, my thinking was that I did want a previously triggered alert to reset if the related node went to Unreachable.  Now that the new state has been around a while, I have changed my mind and want the opposite, which is the reason I am trying to make this change.

               

              But when you step back and look at my logic it is really a cup-half-full or cup-half-empty situation.  I am not automatically including new status values in my Reset evaluation but there also might be a reason that I absolutely want to include them as well.  Either way of looking at it might still result in the need for an adjustment to the conditions which I was trying to limit.
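
              The trade-off between the two styles can be seen in a small sketch.  It is plain Python rather than anything taken from the alert engine, and the status name "Degraded" is a made-up placeholder for a status that a future release might add.

```python
# Toy comparison of the two reset styles; not SolarWinds code.
# "Degraded" is a hypothetical future status used only for illustration.

NOT_EQUAL_LIST = {"Unknown", "Down", "Warning"}  # reset unless status is listed
EQUAL_LIST = {"Up"}                              # reset only if status is listed


def resets_with_not_equal(status):
    """Models 'Node Status is not equal to X' for every X in the list."""
    return status not in NOT_EQUAL_LIST


def resets_with_equal(status):
    """Models 'Node Status is equal to Up'."""
    return status in EQUAL_LIST


new_status = "Degraded"
print(resets_with_not_equal(new_status))  # True: the new status resets the alert
print(resets_with_equal(new_status))      # False: the new status never resets it
```

              Either default could be the wrong one for a given new status, so both styles may need an adjustment after an upgrade.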

               

              While the logic is debatable and changing the logic as you suggest would likely resolve my problem, this really sidesteps why this proper condition design does not work.  The statements should work even if you do not agree with the logic.  I am looking to understand why they do not.

                • Re: Unreachable Status with Advanced Alert Rules
                  zackm

                  Fair enough. I would think that a problem like this would be due to the dependency of the child to the parent. It almost seems as though the parent has 'Unreachable' somewhere in relation to the child, which might be required for dependencies to work, but would break your query as is.

                   

                  I would suggest opening a ticket and seeing what SW comes back with. This is interesting.