Hey there Thwacksters,
Brainstorm with me if you will (thanks in advance!)
I've been thinking about alert flood protection. In the event of a major outage (say 1000 nodes go down), is there a way to keep the alerting engine from sending out thousands of messages? Currently there's a single alert definition that fires when a node goes down and notifies users. During a large outage this wreaks havoc on inboxes, especially for users who manage many devices.
I'm aware of the advanced conditions, and we could build an alert that triggers when a large number of nodes go down. What I'm not sure about is the opposite case: how to skip creating the individual node-down alerts when, let's say, 1000 nodes have just gone down and teams are already aware of the issue and working it.
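To make the idea concrete, here's a rough Python sketch of the behavior I have in mind. This is not how the alerting engine works and none of it is a real product API; the function name, the message text, and the threshold of 100 are all made up for illustration.

```python
from typing import List

# Hypothetical cutoff: at or above this many down nodes, treat it as a flood.
FLOOD_THRESHOLD = 100


def plan_notifications(down_nodes: List[str],
                       flood_threshold: int = FLOOD_THRESHOLD) -> List[str]:
    """Decide which messages to send for the nodes currently down.

    Below the threshold, every node gets its own message (today's behavior).
    At or above it, the per-node messages are skipped and a single
    summary message is sent instead.
    """
    if len(down_nodes) >= flood_threshold:
        # Flood case: one summary instead of one message per node.
        return [f"Major outage: {len(down_nodes)} nodes down"]
    # Normal case: one message per node.
    return [f"Node down: {name}" for name in down_nodes]


if __name__ == "__main__":
    # Small incident: two individual messages.
    print(plan_notifications(["core-sw-01", "edge-rtr-07"]))
    # Large outage: a single summary message.
    print(plan_notifications([f"node-{i}" for i in range(1000)]))
```

So the question is really whether the two branches above can be expressed as alert definitions: one that fires on the large count, and one per-node alert that stays quiet while the large-count condition is true.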
Thoughts?
Thanks!