NPM 2023.2.1 Alerts Delayed

rcbarr

We have been seeing an issue where alerts are delayed; it's as if the alerting engine itself is backed up.

Other observations

(1) Polling Completion Rate is 100% across 10 pollers.

(2) Database syncs take < 30 seconds, worst case

(3) Element count < 8800 across all pollers

(4) Changed Max Pool Size from 1000 to 3000 in the NetPerfMon.db

(5) Cleaned up the AlertObjects table where we had NULL values

(6) Disabled all out-of-the-box canned alerts (had a few stragglers)

(7) Did the normal cleanup that we have ALL done:

UPDATE [Limitations] SET WhereClause = REPLACE(REPLACE(REPLACE(CAST(WhereClause AS varchar(max)), '( (', ' ( ( '), '((', ' ( ( '),'))',' ) ) ')
DELETE FROM [Limitations] WHERE WhereClause = '1=1'
DELETE FROM [LimitationSnapShots]
DELETE FROM [ContainerMemberSnapshots]
DELETE FROM [PendingNotifications]
DELETE FROM [SubscriptionTags]
DELETE FROM [Subscriptions] WHERE EndpointAddress NOT LIKE 'http%'
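
For anyone repeating this cleanup, a read-only count first shows how much each DELETE would remove. This is only a sketch: it reuses the table names and filters from the statements above, and the TableName and RowsFound aliases are illustrative, not part of the schema.

```sql
-- Read-only pre-check: number of rows each DELETE above would remove.
-- Predicates are copied from the DELETE statements; nothing is modified.
SELECT 'Limitations (1=1)' AS TableName, COUNT(*) AS RowsFound
FROM [Limitations] WHERE WhereClause = '1=1'
UNION ALL
SELECT 'LimitationSnapShots', COUNT(*) FROM [LimitationSnapShots]
UNION ALL
SELECT 'ContainerMemberSnapshots', COUNT(*) FROM [ContainerMemberSnapshots]
UNION ALL
SELECT 'PendingNotifications', COUNT(*) FROM [PendingNotifications]
UNION ALL
SELECT 'SubscriptionTags', COUNT(*) FROM [SubscriptionTags]
UNION ALL
SELECT 'Subscriptions (non-http)', COUNT(*)
FROM [Subscriptions] WHERE EndpointAddress NOT LIKE 'http%';
```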

(8) Raised the AlertEngine-OverloadCounterThreshold setting

SELECT TOP 1000 *
FROM [dbo].[Settings]
WHERE SettingID IN ('AlertEngine-OverloadCounterTimeWindowSeconds',
                    'AlertEngine-OverloadCounterThreshold')

Changed AlertEngine-OverloadCounterThreshold from 120 to 150.
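
The SELECT above only reads the two settings. The change itself could look like the sketch below. It assumes the active value lives in a CurrentValue column of [dbo].[Settings]; that column name is an assumption not shown in this post, so check the SELECT output before running anything like it.

```sql
-- Assumption: the Settings table stores the active value in CurrentValue.
-- The extra predicate makes this a no-op if the value is no longer 120.
UPDATE [dbo].[Settings]
SET CurrentValue = 150
WHERE SettingID = 'AlertEngine-OverloadCounterThreshold'
  AND CurrentValue = 120;
```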

(9) Disabled and optimized all the AppInsight stuff

(10) Per development, modified the "External - Component Status (Critical)" alert:

* Under Trigger Actions / Log Alert, changed File Size from 0 (unlimited) to 1 MB.

(11) We also found that one of our (big) core applications was not muting its alerts during nightly maintenance. This was killing us, and it is now fixed.

Development has been able to recreate the issue and has confirmed it as a bug, which is supposed to be addressed in 2024.1 (our hope).

My question: is anyone else seeing this behavior, and if so, what did you do?

As always, thanks for any input and/or advice.

All comments

mesverrum

I'm curious: how many alerts are being generated on a typical day for you?

rcbarr

8572 over the last 24 hours

rcbarr

SELECT COUNT(*) AS 'Number of Alerts'
FROM [NetPerfMon].[dbo].[AlertHistory]
WHERE TimeStamp > DATEADD(hour, -24, GETDATE())
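
To see whether those alerts arrive in bursts rather than evenly, the same table can be grouped by hour. This is a sketch that relies only on the TimeStamp column used in the query above; HourBucket and AlertCount are illustrative aliases.

```sql
-- Alerts per hour over the last 24 hours.
-- DATEADD/DATEDIFF truncates each TimeStamp down to the start of its hour.
SELECT DATEADD(hour, DATEDIFF(hour, 0, TimeStamp), 0) AS HourBucket,
       COUNT(*) AS AlertCount
FROM [NetPerfMon].[dbo].[AlertHistory]
WHERE TimeStamp > DATEADD(hour, -24, GETDATE())
GROUP BY DATEADD(hour, DATEDIFF(hour, 0, TimeStamp), 0)
ORDER BY HourBucket;
```

A few hours with counts far above the rest would point at specific events, such as a maintenance window, rather than steady load.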