
alert escalation: best practices?

We need several levels of alert escalation: for Tier 1 nodes and app monitors, emails every 15 minutes until fixed; for Tier 2, every hour; for Tier 3, once a day, ideally in the morning (haven't figured out the "morning" part yet). Then maybe we'll have Tier 4 with one-time alerts only, and Tier 5+ with no emails.

Because advanced alerts don't seem to allow for conditional escalation (e.g. based on severity or node / app tier), that means creating a bunch of separate alerts:

  1. Tier 1 node down: alerts every 15 minutes
  2. Tier 2 node down: alerts every hour
  3. etc.

Then I'd need several alerts (one for each type of escalation) for monitored volumes, interfaces, application monitors... That's a lot of alerts to maintain... Is there a way around that?

Besides that, what is the best practice for configuring these alerts for different types of escalation? Create a one-time alert for all types and tiers of servers, and then configure separate email-only alerts for escalation, with no NPM logging? Or a separate alert for each tier, like above?

Thanks!
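
For reference, the schedule described above can be written down as a small policy function. This is only a sketch in Python, not anything Orion provides: the names `REPEAT_EVERY`, `DAILY_TIERS`, `ONE_TIME_TIERS`, `MORNING_HOUR`, and `should_notify` are hypothetical, 08:00 is an assumed meaning of "morning", and the sketch assumes the first email for any tier goes out immediately.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical policy tables built from the tiers described in the post.
REPEAT_EVERY = {1: timedelta(minutes=15), 2: timedelta(hours=1)}
DAILY_TIERS = {3}      # once a day, in the morning
ONE_TIME_TIERS = {4}   # a single email per outage
MORNING_HOUR = 8       # assumption: "morning" means 08:00 or later


def should_notify(tier: int, last_sent: Optional[datetime], now: datetime) -> bool:
    """Return True if an email is due for an object that is still down.

    last_sent is None when no email has been sent for this outage yet.
    """
    if tier in REPEAT_EVERY:
        return last_sent is None or now - last_sent >= REPEAT_EVERY[tier]
    if tier in DAILY_TIERS:
        if last_sent is None:
            return True
        # Repeat at most once per calendar day, and not before the morning hour.
        return now.date() > last_sent.date() and now.hour >= MORNING_HOUR
    if tier in ONE_TIME_TIERS:
        return last_sent is None
    return False  # Tier 5+: no emails
```

Keeping the tiers in one table like this is what a single "conditional" alert would amount to; without it, each row becomes its own alert definition.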

  • To be honest, we have avoided this jumbled mess of alerts by pushing that logic into our ticketing system. If you create a custom property for tier and pass it in the email, you might be able to have your ticketing system set the severity or escalation logic based on it. Alert Central can repeat alerts after a set amount of time has elapsed, but I have not seen whether it can do this based on an alert or object "tier".

    Anyway, our thought process was to keep Orion alerts template-based and generic enough to avoid having more than 300 of them. Our ticketing system already has escalation logic for severity and repeat notification, so it was a slam dunk.

    Thanks,

    Christian
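
The ticketing-side half of this approach could look roughly like the sketch below, in Python. Everything here is an assumption for illustration: the `[Tier=N]` subject tag is a made-up convention for however the alert email carries the tier custom property, and `TIER_TAG`, `SEVERITY_BY_TIER`, `tier_from_subject`, and `severity_for` are hypothetical names, not part of Orion or any particular ticketing product.

```python
import re
from typing import Optional

# Hypothetical subject convention, e.g. "[Tier=2] Node core-sw-01 is Down".
TIER_TAG = re.compile(r"\[Tier=(\d+)\]", re.IGNORECASE)

# Hypothetical mapping from tier to ticket severity; tiers not listed get no ticket.
SEVERITY_BY_TIER = {1: "critical", 2: "high", 3: "medium", 4: "low"}


def tier_from_subject(subject: str) -> Optional[int]:
    """Extract the tier number from the subject tag, or None if the tag is absent."""
    match = TIER_TAG.search(subject)
    return int(match.group(1)) if match else None


def severity_for(subject: str) -> Optional[str]:
    """Pick a ticket severity from the tier tag; None means do not escalate."""
    tier = tier_from_subject(subject)
    if tier is None:
        return None
    return SEVERITY_BY_TIER.get(tier)
```

With something like this in place, one generic alert per problem type is enough on the monitoring side, and the repeat and escalation schedule lives with the severity in the ticketing system.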

  • That makes a lot of sense - thanks Christian. I don't particularly like the idea of setting up a separate alert for each type of escalation, object tier, or type of problem, and outsourcing escalation to a ticketing system makes a lot of sense on several levels. That didn't occur to me before and I'll keep that in mind for the future. That said, we don't yet have a reliable ticketing system that we could pass alert escalation to, so for now, I am limited to what SW can do.

  • You can set up multiple email actions in the same alert definition.  Each can have a delay on execution and an escalation time.  By configuring those, you should be able to accomplish your tiered design.

    [Attached image: alert_escal.jpg]

  • Unless I missed something, you can't use conditions in trigger actions. E.g. you can't set it up so that "for Tier 1 nodes, use this trigger action, otherwise use that one". You have to use separate alerts for that.