All,
We are currently running SAM 6.2.2 with one additional polling engine.
Several times a week, hundreds of applications go into an Unknown state, but only applications whose node is assigned to the additional poller. Sometimes they eventually clear on their own; other times we have to restart our entire SolarWinds environment to clear them. Needless to say, this is a less than optimal way to monitor our environment. SolarWinds support doesn't really help explain why this is happening when we call. They just fix that instance of the issue and close the case.
As I write this, I have 227 applications in an Unknown state. Here are some of the templates and their error messages:
APP: IIS Admin Service Monitor
ERROR: Unexpected error occurred. A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 0 - No connection could be made because the target machine actively refused it.)
APP: SQL Server (via SNMP) on
ERROR: Unexpected error occurred. A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 0 - No connection could be made because the target machine actively refused it.)
APP: Time Zone Change Server 2003 - Security Log 520 logged
ERROR: Unexpected error occurred. A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 0 - No connection could be made because the target machine actively refused it.)
The Time Zone and IIS Admin errors seem out of place, as these applications don't contact a SQL Server. The only explanation I can think of is that the additional poller cannot contact the Orion database, which seems unlikely.
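One way I could test the Orion database theory is to run a quick TCP check from the additional poller the next time this happens. The sketch below is only an illustration, not anything from SolarWinds: the function name check_tcp is made up, and port 1433 is just the SQL Server default, so it assumes our Orion database instance listens there. A result of "refused" would match the "target machine actively refused it" text in the errors above.

```python
import socket

def check_tcp(host, port=1433, timeout=3.0):
    """Attempt a plain TCP connection and classify the outcome.

    check_tcp is a hypothetical helper; 1433 is the SQL Server default
    port and may not be the port the Orion database actually uses.
    """
    try:
        # create_connection resolves the host and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection on that port
        return "refused"
    except socket.timeout:
        # No answer within the timeout (for example, a firewall dropping packets)
        return "timeout"
    except OSError as exc:
        # Name resolution failures, unreachable network, and so on
        return "error: {}".format(exc)
```

Calling check_tcp with the database server's name (whatever it is in your environment) from the additional poller during an outage, and again from the main poller, would show whether only the additional poller is being refused.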
I can get them to clear, but it involves opening each application and testing each component. With hundreds to test, I usually just wait for things to clear.
I am really at a loss as to why this keeps happening. Any help would be most appreciated.
Thanks.