Why is Agent monitoring so unstable?


Has anyone gone fully to agent-based monitoring? We have. We moved all our Windows servers to agent monitoring and have been having issues ever since. Here are a few of them.

1. Delay in polling. All our agents use agent-initiated polling, not server-initiated, and the polling times are inconsistent. We should get a poll cycle every two minutes, but sometimes it is delayed up to 5 minutes before the next poll cycle. We could never get an answer as to why it behaves this way.

2. Agent dropout. The agent works until it stops, then never reconnects until you manually restart the agents on the monitored machines.

3. The Cortex service uses up all available disk space. The cache files it saves in a DB file on the monitored machines explode and eat up all the disk space within anywhere from a few hours to a few days.

4. On Windows Server 2012 R2, the agents are highly unstable and drop out constantly.
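
For anyone who wants to put numbers on issue 1, here is a rough sketch of how the polling delay could be measured. It assumes you can export the poll timestamps for a node into a list yourself; the function name `find_poll_gaps`, the 30-second tolerance, and the sample data are all made up for illustration and are not part of any SolarWinds tooling.

```python
from datetime import datetime, timedelta


def find_poll_gaps(timestamps,
                   expected=timedelta(minutes=2),
                   tolerance=timedelta(seconds=30)):
    """Return (previous, current, gap) for every poll that arrived late.

    A poll counts as late when the time since the previous poll exceeds
    the expected interval plus a small tolerance (both are assumptions).
    """
    ordered = sorted(timestamps)
    late = []
    for previous, current in zip(ordered, ordered[1:]):
        gap = current - previous
        if gap > expected + tolerance:
            late.append((previous, current, gap))
    return late


if __name__ == "__main__":
    # Hypothetical sample data: polls at 0, 2, 4, 9 and 11 minutes.
    start = datetime(2024, 1, 1, 12, 0)
    sample = [start + timedelta(minutes=m) for m in (0, 2, 4, 9, 11)]
    for previous, current, gap in find_poll_gaps(sample):
        print(f"{previous:%H:%M} -> {current:%H:%M}: "
              f"{gap.total_seconds():.0f}s between polls")
```

Running something like this over a day of timestamps would at least show how often the two-minute cycle slips and by how much, which is useful to attach to a support ticket.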

I was wondering if we should just kill the agents and go back to WMI polling. WMI is 1000 times more stable than the agents have ever been, and after several tickets to SolarWinds the issues were never resolved. The agents should be just as stable as WMI, if not more stable. Yet they're worse than SNMP and WMI. Why design a new polling method if it's not going to work reliably?