4 Replies Latest reply on Apr 26, 2018 9:12 AM by ajiwanand

    Alert Escalation and Suppression questions

    ajiwanand

      I want to clarify how alert escalation levels work if no one acknowledges the alert.

       

      Say I have an issue with a node and escalation levels are as follows:

      Level 1 - Restart node

      Wait 10 mins

      Level 2 - Email escalation contact

       

      Say my contact has not acknowledged the alert. Will it recycle the escalation and start rebooting the node again? I want it to try the reboot only once rather than reboot continuously. If it keeps sending emails, that might not be an issue, but it would be great if we could control this.

       

      Secondly, say I have an application that's flapping: the application goes down, the alert actions run, and then it comes back up 10 mins later. Then it goes down again and repeats this cycle.

       

      Is there a way to suppress alerts and emails in this situation so that it doesn't send numerous emails or restart numerous times? Basically, a way to suppress the actions once the alert has fired x times.

        • Re: Alert Escalation and Suppression questions
          nglynn

          Each action is fully configurable from that perspective. Under 'Execution Settings' for each action, you can specify whether or not you want it repeated after X minutes. You can minimize flapping by adjusting how much down time is required before the alert triggers and how much up time is required before it clears. This is a good practice for ruling out false positives.
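To illustrate why a required down time cuts out flapping, here is a minimal sketch of the idea in Python. It is not the product's alert engine; `trigger_minutes`, `samples`, and `hold_minutes` are hypothetical names used only for this example, and it assumes one status sample per minute.

```python
def trigger_minutes(samples, hold_minutes):
    """Return the minutes at which an alert would trigger.

    samples: list of (minute, is_down) tuples in time order.
    The alert triggers only once the application has been down
    continuously for hold_minutes, and it resets when the application
    comes back up.
    """
    triggers = []
    down_since = None   # minute the current outage started
    fired = False       # whether this outage has already triggered
    for minute, is_down in samples:
        if not is_down:
            # Application is up: forget the outage and allow a new trigger.
            down_since = None
            fired = False
            continue
        if down_since is None:
            down_since = minute
        if not fired and minute - down_since >= hold_minutes:
            triggers.append(minute)
            fired = True
    return triggers


# Two short blips (minutes 0-2 and 4-5) followed by a real outage (7-20).
down = set(range(0, 3)) | set(range(4, 6)) | set(range(7, 21))
samples = [(m, m in down) for m in range(21)]

no_hold = trigger_minutes(samples, 0)    # every blip triggers
with_hold = trigger_minutes(samples, 5)  # only the sustained outage triggers
```

With no hold time, each blip fires the alert (and its restart action); with a 5-minute hold, only the sustained outage does.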

            • Re: Alert Escalation and Suppression questions
              ajiwanand

              There are only two options, and both relate to the alert being acknowledged. What happens if the alert is not acknowledged? My current setting is the default (Do not execute if acknowledged), but what happens if the alert is never acknowledged and the trigger action keeps happening? Does it recycle the escalation levels, or stay on the last one?

               

              I am trying to figure out whether the actions will repeat if the alert is not acknowledged. Since I am doing expensive operations like restarting applications/services/nodes, I don't want them repeating because the alert wasn't acknowledged.

               

              Also, I am trying to minimize this because there are instances where we have actual issues with applications that flap, and I'd like to not have the application restart every time it goes down, 100 times over, if it hasn't been caught yet. (Obviously not an ideal situation, and it shouldn't happen, but I'd like to understand how to stop it if it does.)

                • Re: Alert Escalation and Suppression questions
                  nglynn

                  My understanding is that each action functions independently of the others, based on its execution settings. So if you said you wanted the first action to be your service restart, and you wanted it to re-trigger after 10 minutes until the alert is acknowledged (or cleared, of course), it would re-attempt that. In your case, what I have done in the past is set up the service restart and generally not touch the execution settings, meaning that the first action will only trigger once. Then on my 2nd escalation, which is a notification, I could see more potential/value in configuring it to repeat the action until acknowledged (if that is some form of escalation notification).
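The behaviour described above can be sketched as a small simulation. This models only the understanding stated in this reply (each action runs on its own schedule, and repeating stops once the alert is acknowledged); it is not the product's actual logic, and `Action`, `start_after`, `repeat_every`, and `timeline` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Action:
    name: str
    start_after: int                    # escalation wait (minutes) before the first run
    repeat_every: Optional[int] = None  # None means the action runs only once


def timeline(actions, acknowledged_at, horizon):
    """List (minute, action name) runs up to horizon minutes.

    Each action is scheduled independently. A run is skipped once the
    alert has been acknowledged (acknowledged_at=None means never).
    """
    events = []
    for action in actions:
        t = action.start_after
        while t < horizon and (acknowledged_at is None or t < acknowledged_at):
            events.append((t, action.name))
            if action.repeat_every is None:
                break  # run-once action: never re-trigger
            t += action.repeat_every
    return sorted(events)


actions = [
    Action("restart node", start_after=0),                      # level 1, runs once
    Action("email contact", start_after=10, repeat_every=30),   # level 2, repeats
]

never_acked = timeline(actions, acknowledged_at=None, horizon=60)
acked_at_25 = timeline(actions, acknowledged_at=25, horizon=60)
```

Under these assumptions, an unacknowledged alert restarts the node once at minute 0 and emails at minutes 10 and 40; the restart is never repeated because its repeat setting was left untouched.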
