
Alert Engine - Maximum Rules Processed.


We currently have a little under 200 active alert rules that the alert engine has to process in our Orion deployment. Is there a maximum number of alert rules the alert engine can handle, and if so, does anyone know what that number is? I didn't see anything in the support KBs that answers this, and I have opened a ticket, but I thought I'd ask whether anyone here knows the answer or has hit a threshold that caused issues.


So the alert engine is pretty robust. I've seen hundreds of alert definitions configured, and contacts have told me about instances nearing 1,000 definitions, so you are pretty far from any limits. Where those limits actually sit is tricky to pin down and varies widely between environments. Think of each alert definition as a single query running on whatever schedule you give it (60 seconds by default).
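To put that in perspective, here's a quick back-of-envelope sketch. The one-query-per-definition-per-interval model is my simplification of the behavior described above, not documented Orion internals:

```python
def alert_queries_per_second(num_definitions: int, interval_seconds: int = 60) -> float:
    """Rough average rate of alert-evaluation queries hitting SQL Server,
    assuming each definition runs one query per evaluation interval."""
    return num_definitions / interval_seconds

# ~200 definitions on the default 60-second schedule:
print(f"{alert_queries_per_second(200):.1f} queries/sec")   # ~3.3 queries/sec

# Even near 1,000 definitions it's still a modest steady-state rate:
print(f"{alert_queries_per_second(1000):.1f} queries/sec")  # ~16.7 queries/sec
```

A few queries per second is background noise for a healthy SQL Server; what matters more is how expensive each individual query is.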

SQL Server handles lots of queries all day, so if your system is healthy this is just another drop in the bucket. If your tables are very large and your alert logic is very complex, then in aggregate these queries can be hard on the server, but usually it's not a major problem from a system performance perspective.

Operationally, there's a strong case for keeping your alerts as simple as you can manage. When the alert pile gets over-complicated, you run into issues like "why does this node trigger three alerts when it goes down, while that node never alerts at all?" Inventorying all your alert definitions is also a really common request, and there's nothing very easy built in to do that, so when someone new takes over the tools they have to learn all the logic from scratch to get up to speed.
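For the inventory problem, one approach is to pull the definitions out through the SolarWinds Information Service (SWIS) with the `orionsdk` Python package. A minimal sketch follows; the entity name `Orion.AlertConfigurations` and its fields are my assumption, so verify them in SWQL Studio against your own instance before relying on this:

```python
# SWQL query to list alert definitions -- entity and field names are
# assumptions; confirm them against your Orion version in SWQL Studio.
INVENTORY_SWQL = """
SELECT Name, Enabled, Frequency
FROM Orion.AlertConfigurations
ORDER BY Name
"""

def summarize(rows: list) -> dict:
    """Count enabled vs. disabled definitions from a SWIS result set
    (a list of dicts, one per alert definition)."""
    enabled = sum(1 for r in rows if r.get("Enabled"))
    return {"total": len(rows),
            "enabled": enabled,
            "disabled": len(rows) - enabled}

# To actually run it (needs the orionsdk package and access to your server):
# from orionsdk import SwisClient
# swis = SwisClient("your-orion-server", "user", "password")
# print(summarize(swis.query(INVENTORY_SWQL)["results"]))
```

Dumping that result to a spreadsheet once a quarter goes a long way toward keeping the alert pile understandable for whoever inherits it.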

If you have a tool like DPA, you can identify exactly how much stress your alerts put on the system, but I'll say I have never seen an alert query actually show up among my top performance-impacting queries.

- Marc Netterfield, Github

