Hello, I found out that NPM 8.5.1 SP3 opens lots of connections to the Orion DB on a remote SQL 2005 server. When starting Orion there were about 170 connections open, and 150 of them were caused by the alerting engine. Is that normal behaviour?
BR
Oliver
Yes. That's normal.
Following these posts, I decided to check how many connections I was seeing on our Orion servers.
Orion 8.5.1 SP3 - 168
Orion 9.1 SP3 - 119
The SQL command I used was:
select getdate() as Date, db_name(dbid) as 'Database Name', dbid, Count(dbid) as 'Total Connections'
from sys.sysprocesses with (noLock)
where db_name(dbid) = 'NetPerfmon'
group by dbid
thx :-)
The User Login Rate alarm becomes active when the number of logins per second exceeds a threshold. This can be an indicator of poor application design. Creating a connection to SQL Server is relatively expensive, and coding practices where an application repeatedly connects and disconnects from SQL Server should be avoided.
While reconnecting will not necessarily slow down all users of the SQL Server, it will often result in poor performance for the application that is re-connecting all the time.
Spotlight Quest result.
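To make the point about repeated connects concrete, here is a minimal sketch contrasting a check loop that reconnects every time with one that reuses a single connection. It uses Python's built-in sqlite3 purely as a stand-in for SQL Server, and the names ConnectionCounter, run_checks_reconnecting and run_checks_reusing are made up for illustration; it says nothing about how Orion is actually written.

```python
import sqlite3


class ConnectionCounter:
    """Hypothetical helper: opens connections in one place so they can be counted."""

    def __init__(self, dsn=":memory:"):
        self.dsn = dsn
        self.opened = 0

    def connect(self):
        self.opened += 1
        # sqlite3 is only a stand-in here; a real SQL Server login costs far more.
        return sqlite3.connect(self.dsn)


def run_checks_reconnecting(factory, checks):
    """Connect and disconnect once per check (the pattern the alarm warns about)."""
    for _ in range(checks):
        conn = factory.connect()
        try:
            conn.execute("SELECT 1").fetchone()
        finally:
            conn.close()


def run_checks_reusing(factory, checks):
    """Open one connection and reuse it for every check."""
    conn = factory.connect()
    try:
        for _ in range(checks):
            conn.execute("SELECT 1").fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    a, b = ConnectionCounter(), ConnectionCounter()
    run_checks_reconnecting(a, 10)
    run_checks_reusing(b, 10)
    print("reconnecting:", a.opened, "logins; reusing:", b.opened, "login")
```

The number of logins grows with the number of checks in the first loop and stays at one in the second, which is the difference a login-rate alarm would pick up.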
This sounds like the advanced alerting engine connecting at the interval specified for each alert. I suspect that if you have 10 alerts configured to check every minute, you would get 10 connects and disconnects per minute. Perhaps not the best application design for Orion users with hundreds of alerts set up?
<edit> can SW confirm this?</edit>
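As a rough way to put numbers on that suspicion, the sketch below estimates logins per minute from a list of alert check intervals. It assumes one connect/disconnect per alert evaluation, which is only the guess made in this post and not something SolarWinds has confirmed; the function name connects_per_minute is made up for illustration.

```python
def connects_per_minute(intervals_seconds):
    """Estimate logins per minute from alert check intervals (in seconds).

    Assumption (unconfirmed): every alert evaluation opens and closes
    exactly one database connection.
    """
    total = 0.0
    for seconds in intervals_seconds:
        if seconds <= 0:
            raise ValueError("check interval must be positive")
        total += 60.0 / seconds
    return total


if __name__ == "__main__":
    # 10 alerts, each checked every 60 seconds
    print(connects_per_minute([60] * 10))
    # 100 alerts: 80 checked every minute, 20 every five minutes
    print(connects_per_minute([60] * 80 + [300] * 20))
```

Under that assumption, lengthening the interval on alerts that do not need one-minute resolution is the cheapest way to bring the estimate down.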
This thread is very interesting and brings to mind a question regarding good alert design.
Specifically, as the number of advanced alerts grows on your Orion system, what do you think are some good practices regarding advanced alert design with respect to the setting for Alert Evaluation frequency?
I'm hearing here that if you have 100 alerts with the frequency set at 1 minute, then you will likely have a problem.
So, what are the approaches that the user community uses?
That could be a reason. We have 15 basic alerts and 24 advanced alerts configured at the moment (I haven't checked the interval for all of them yet).
We have a combination of around 25 enabled alerts currently. Many depend on at least one custom property for classification and alerting the correct department. I intend to add many more - however I am a bit hung up on the database connection errors. I don't see any way to really tell what information is being lost when the database connection fails.
While executing a report in Report Writer, it timed out with a "SQL connection error".
I did a netstat -abn on my server and found there were 64 open connections to my DB server to port 1433.
44 connections are from AlertingEngine.exe
5 from OrionCustomPollingService.exe
4 from SWTrapService.exe
4 from SyslogService.exe
3 from dllhost.exe
2 from NetPerfMonService.EXE
2 are TCP in TIME_WAIT
I tweaked my Advanced Alert Manager settings to reduce the number of checks against the DB, even though I only have 23 enabled alerts.
It seems I have hit the wall as far as scalability goes. I may have to increase the number of connections allowed on my DB server, although this is not the ideal solution. (Note: this server is still running Orion v8.1.)
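For anyone who wants to repeat the per-process tally from netstat -abn without counting by hand, here is a small sketch that does it from saved output. It assumes each connection appears as a "TCP ..." line followed later by the owning executable in square brackets, and that connections with no owner line (such as TIME_WAIT) should be listed as "(unknown)". The exact layout differs between Windows versions, so treat the patterns as a starting point. The function name count_connections_by_process is made up for illustration.

```python
import re
from collections import Counter

# Assumed line shapes in `netstat -abn` output; adjust for your Windows version.
CONN_RE = re.compile(r"^\s*TCP\s+\S+\s+(\S+):(\d+)\s+(\S+)")
EXE_RE = re.compile(r"^\s*\[(.+?)\]\s*$")


def count_connections_by_process(netstat_text, port=1433):
    """Count TCP connections to the given remote port, grouped by executable."""
    counts = Counter()
    pending = False  # True while a matching connection still awaits its owner line
    for line in netstat_text.splitlines():
        conn = CONN_RE.match(line)
        if conn:
            if pending:
                # Previous matching connection had no owner line (e.g. TIME_WAIT).
                counts["(unknown)"] += 1
            pending = int(conn.group(2)) == port
            continue
        exe = EXE_RE.match(line)
        if exe and pending:
            counts[exe.group(1)] += 1
            pending = False
    if pending:
        counts["(unknown)"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: netstat -abn > netstat.txt, then pass the file name as the argument.
    with open(sys.argv[1]) as handle:
        for name, total in count_connections_by_process(handle.read()).most_common():
            print(total, name)
```

Running it before and after changing the Advanced Alert Manager settings gives a quick comparison of how many connections each service is holding.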