Has anyone come across this error on application templates before?
Connection timeout. A timeout occurred during execution which resulted in the job being canceled.
We have 8 polling engines and this is only happening on one of them. Some templates work fine on this polling engine as well; however, a few were green yesterday and have now gone to unknown and are displaying this message.
I do keep tabs on all unknown templates and we have very few that are actually unknown, so I don't think it's a stress issue. Plus, SAM polling is at less than 6%.
I have seen similar known issues but none that seem to display this exact error message.
The node is up and I have tested the credentials which all come back as working.
We had a similar issue, and by going through each post I figured out that it could be related to the WMI service. When I looked at the server, the services were stuck in a 'stopping' state, hence a reboot was required. I tried to restart the services via cmdlet, but I got an error and the WMI service simply would not restart.
So a *.bat script might be required to restart the service on a schedule and avoid such issues in future.
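As a rough illustration of the scheduled check suggested above, here is a minimal Python sketch rather than a tested fix. It assumes the WMI service name is `Winmgmt` and that `sc query` prints a `STATE` line in its usual layout. The function names `parse_state`, `plan_action`, and `check_wmi` are hypothetical. It only tries to start the service when it is fully stopped, because a service stuck in a pending state may not respond to a restart, as described above.

```python
import subprocess

SERVICE = "Winmgmt"  # assumed service name for Windows Management Instrumentation


def parse_state(sc_output: str) -> str:
    """Extract the state name (e.g. RUNNING, STOP_PENDING) from `sc query` output."""
    for line in sc_output.splitlines():
        if line.strip().startswith("STATE"):
            # Typical line: "STATE              : 3  STOP_PENDING"
            parts = line.split(":", 1)[1].split()
            if len(parts) >= 2:
                return parts[1]
    return "UNKNOWN"


def plan_action(state: str) -> str:
    """Decide what the scheduled job should do for a given service state."""
    if state == "RUNNING":
        return "none"    # healthy, leave it alone
    if state == "STOPPED":
        return "start"   # safe to try a plain start
    return "manual"      # pending or unknown: flag for a human (or a reboot window)


def check_wmi() -> str:
    """Query the service (Windows only) and act on the planned action."""
    result = subprocess.run(["sc", "query", SERVICE], capture_output=True, text=True)
    action = plan_action(parse_state(result.stdout))
    if action == "start":
        subprocess.run(["net", "start", SERVICE], check=False)
    return action
```

Calling `check_wmi()` from a scheduled task on the affected server would give an early warning when the service leaves the running state. Whether that prevents the timeout in SAM is not confirmed anywhere in this thread.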
I had this same issue and found a solution that worked for me. I am using Server 2016 both on the SolarWinds nodes and on the host that had the issue. This host was managed using an agent. I found the solution in another THWACK post: I basically removed and re-added the failing SAM application on the specific node it was failing on. Yes, I did lose the historical information for this monitor, but in my case this was not an issue for this machine.
Text from Thwack post with solution:
Go through all of the currently assigned applications and remove any that are failing as down or unknown. If you don't want to remove them figure out why they are failing and resolve it. Failures and unknown components take much longer to process because the scheduler has to wait for the timeout before it finally gives up.
Link to solution: Success Center
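To see why a handful of failing components can matter, here is a back-of-the-envelope sketch in Python. It assumes components are processed one after another and that each failing component blocks for the full timeout. The function name and all the numbers are hypothetical, not SAM defaults.

```python
def worst_case_poll_seconds(healthy: int, failing: int,
                            healthy_secs: float, timeout_secs: float) -> float:
    """Estimate one polling pass, assuming sequential processing (an assumption)."""
    # Healthy components return quickly; failing ones wait out the whole timeout.
    return healthy * healthy_secs + failing * timeout_secs


# Hypothetical figures: 40 healthy components at 2 s each, 60 s timeout.
all_healthy = worst_case_poll_seconds(40, 0, 2.0, 60.0)
with_failures = worst_case_poll_seconds(40, 10, 2.0, 60.0)
print(all_healthy, with_failures)
```

Under these made-up numbers, ten failing components add ten full timeouts to the pass, which is the effect the quoted post describes.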
I just applied the template "Windows Server 2016 Services and Counters" to a few 2016 servers and received the same result: "Connection timeout. A timeout occurred during execution which resulted in the job being canceled."
I'd be interested in what the fix was for this.
Has anyone got an answer to this? We've just replaced our domain controllers (2008 to 2012; yes I know, but I have no say in things) and are getting this. A fix without reboots would be nice, as we can only reboot servers from 07:00 to 09:00 on Tuesday mornings (unless something breaks, then we can do what we like).
Unfortunately I'm no longer with Comtact, but I'd get them to log you a ticket. James should be able to do it for you. Failing that, try deleting the template altogether and re-adding it.
I'm guessing you have swapped templates to the 2012 ones anyway? So the above might be pointless. You can also try installing the agent and seeing if polling the SAM template through the agent fixes it. It's an easy download and is non-intrusive, so it doesn't need a reboot to work.
Let us know how you get on.
Sorry to hear that. I hope you have something better. Thanks for responding. I am trying the 2012 templates and will run it past James after I run out of ideas. I'd rather not install the agent as these are Domain Controllers and management get a bit funny about installing software (even if it is vital and the techies have said it must be done - it's weird here). I might just do it anyway.
Redundancies on their way here so I might be moving on too (last one in).
Hey Peter, I'm thankfully on to bigger and better things. I had just been there for too long; it was my first IT job, so a change was needed.
I will have a think about any other options. I never actually got an outcome from my original post, it just started working which was strange.
OK. I guess congratulations are in order. I've been digging around the DCs and have found lots of stuff not configured. I've set up a custom perfmon data collector and added the items from one of the AD monitoring templates. Just playing around with the collection scheduling and will see what happens. Come back Novell, all is forgiven.
Has anyone found a non-reboot solution for this error? We are facing this in our infra and it's so weird. I tried a few troubleshooting steps, but nothing helped.
Can anyone please help me fix this?
We're seeing the same behavior on the AD template. Some work without any problems while others just randomly start displaying the mentioned error message. Following this thread while I wait on a response.
Yeah, having the same issue as well, but only on the Security application template on one DC. Other monitors using the same method (RPC) with the same credentials to the same resource (Event Logs) on the same DC are working fine.
If anyone has specifics on how to resolve this without a reboot, that would be swell!
Unfortunately I never got a straight answer.
After many deleted templates and reboots, ours worked and stayed stable. I would be interested, though, if anyone knows outright what causes it and how to resolve it.
Same issue here. Maybe 1/3 of our DCs are showing unknown. The counters test successfully individually, but testing all of them simultaneously causes the Performance Counters to time out, usually most of them but frequently all (Windows Service Monitors and Windows Event Log Monitors work fine). I'll open a ticket as suggested by jere557, and in the meantime I'll try disabling some of the monitors I don't need right now.