We have five polling engines running in our environment. Pollers 4 and 5 appear to hang and drop connectivity every few hours. This started two weeks ago, after we performed a SolarWinds repair to fix a website exception error. We are not sure what is causing the issue.
We usually restart the polling engine server or disable and re-enable the NIC to restore polling engine services.
No issues with our primary polling engine.
We have been working with SolarWinds support, but no root cause has been found.
We are running the following: SolarWinds Orion Platform 2014.1.0, SAM 6.1.0, IPAM 4.1, NCM 7.2.2, NPM 10.7, WPM 2.0.1
Found the following error in "Core.BusinessLayer.log":
2014-08-20 14:59:20,278 [Scheduler] ERROR SolarWinds.Orion.Core.Common.DALs.EngineDAL - Error when updating Engine info.
System.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
at System.Data.SqlClient.TdsParserStateObject.ReadSni(DbAsyncResult asyncResult, TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, DbAsyncResult result)
at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(DbAsyncResult result, String methodName, Boolean sendToPipe)
at SolarWinds.Orion.Common.SqlHelper.ExecuteNonQuery(SqlCommand command, SqlConnection connection, SqlTransaction transaction)
at SolarWinds.Orion.Core.Common.DALs.EngineDAL.UpdateEngineInfo(Int32 engineID, Dictionary`2 values, Boolean updateElements, Boolean interfacesAvailable)
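Since the hangs recur every few hours, it may help to count how often this error shows up per hour and compare that with the times the pollers dropped. Below is a minimal sketch in Python, assuming every entry in the log starts with the same "date time,ms [thread] LEVEL logger - message" layout as the line quoted above. The function names are made up for illustration and are not part of any SolarWinds tooling.

```python
import re
import sys
from collections import Counter
from datetime import datetime

# Assumed entry layout, taken from the single log line quoted in this thread:
# 2014-08-20 14:59:20,278 [Scheduler] ERROR Some.Logger.Name - message text
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
    r"\[([^\]]+)\] (\w+)\s+(\S+) - (.*)$"
)


def parse_errors(lines):
    """Yield (timestamp, logger, message) for each ERROR entry.

    Stack trace lines ("at System.Data...") do not match and are skipped.
    """
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(3) == "ERROR":
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            yield ts, match.group(4), match.group(5)


def errors_per_hour(lines):
    """Return {hour: count} of ERROR entries, sorted by hour."""
    counts = Counter()
    for ts, _logger, _message in parse_errors(lines):
        counts[ts.replace(minute=0, second=0)] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_errors.py <path to Core.BusinessLayer.log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for hour, count in errors_per_hour(fh).items():
            print(hour.strftime("%Y-%m-%d %H:00"), count)
```

If the error clusters around the same times the pollers hang, that points toward the connection between the poller and the SQL Server rather than toward the polling services themselves.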
The causes of this are various. In my case it was caused by threads being killed by SQL Server because of deadlocks, by process out-of-memory issues (NPM processes are only 32-bit, so they crash if they grow larger than 2 GB), or by corrupt .sdf files.
=> Open a support case.
In my case UDT appears to have stabilized and is working well. Now the auto dependency builder is running out of memory and crashing, so that is the next thing to fix.
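To keep an eye on the 2 GB limit for 32-bit processes mentioned above, one option is to capture the output of the Windows command `tasklist /FO CSV /NH` and flag anything getting close. Below is a rough Python sketch. It assumes the five-column CSV layout that tasklist prints (image name, PID, session name, session number, memory usage in K). The process name in the example is hypothetical, and the 80% threshold is an arbitrary choice. Note that tasklist reports the working set, which is only a rough proxy for the virtual address space that the 2 GB limit actually applies to.

```python
import csv
import io
import re

# 2 GB: default user address space of a 32-bit Windows process.
LIMIT_BYTES = 2 * 1024**3


def parse_tasklist_csv(text):
    """Parse `tasklist /FO CSV /NH` output into (name, pid, bytes) tuples."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) < 5:
            continue  # skip blank or malformed lines
        name, pid, mem = record[0], record[1], record[4]
        # Mem Usage looks like "1,800,000 K"; drop separators and the unit.
        digits = re.sub(r"\D", "", mem)
        if not digits or not pid.isdigit():
            continue
        rows.append((name, int(pid), int(digits) * 1024))
    return rows


def near_limit(rows, fraction=0.8):
    """Return the rows whose memory use is at or above fraction * 2 GB."""
    threshold = LIMIT_BYTES * fraction
    return [row for row in rows if row[2] >= threshold]


if __name__ == "__main__":
    # Hypothetical sample; on a poller you would paste real tasklist output.
    sample = (
        '"Hypothetical.Service.exe","1234","Services","0","1,800,000 K"\n'
        '"other.exe","42","Services","0","50,000 K"\n'
    )
    for name, pid, size in near_limit(parse_tasklist_csv(sample)):
        print(f"{name} (PID {pid}): {size / 1024**2:.0f} MB")
```

Running this on a schedule and logging the result would show whether a service grows steadily before each crash.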