I have a setup with dual polling engines. Suddenly, for no apparent reason, one of them might fail to collect data, leaving gaps in the graphs (one dot every hour or so). The normal interval is 10 minutes. I have been through this with support, and they really don't have a clue. I have applied all upgrades and I use an external SQL server. The polling engines have been tried both on quad-core blade servers and, now, back on ordinary 3 GHz/2 GB stand-alone servers.
Restarting the polling engine results in normal behaviour. However, the process takes a long time to shut down, resulting in the error "Can't start service" if I use the restart option. But stopping (and waiting) and starting works.
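The stop, wait, then start sequence could be scripted roughly as follows. This is only a sketch, assuming the polling engine runs as a Windows service that the standard `sc` command can control. The service name is not given above, so it is passed in as a parameter, and the 300-second timeout and 5-second poll are arbitrary assumptions.

```python
import subprocess
import time


def service_state(name, run=subprocess.run):
    """Return the state word from `sc query`, e.g. 'RUNNING' or 'STOPPED'."""
    result = run(["sc", "query", name], capture_output=True, text=True)
    for line in result.stdout.splitlines():
        if "STATE" in line:
            # Typical line: "        STATE              : 4  RUNNING"
            parts = line.split(":", 1)[1].split()
            if len(parts) > 1:
                return parts[1]
    return "UNKNOWN"


def restart_service(name, timeout=300, poll=5, run=subprocess.run, sleep=time.sleep):
    """Stop the service, wait until it reports STOPPED, then start it.

    Returns False without starting if the service has not stopped
    within `timeout` seconds.
    """
    run(["sc", "stop", name], capture_output=True, text=True)
    waited = 0
    while service_state(name, run) != "STOPPED":
        if waited >= timeout:
            return False
        sleep(poll)
        waited += poll
    run(["sc", "start", name], capture_output=True, text=True)
    return True
```

Calling `restart_service("<your service name>")` from an elevated prompt would then avoid the "Can't start service" error, because the start is only issued once the service has fully stopped.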
This is actually quite a severe flaw, because I can lose a lot of statistics just because I didn't check on the polling engine.
Now - since this seems to be an unsolvable issue(?) - are there any ways of monitoring this and triggering an alert of some kind?
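One possible approach is to compare the timestamp of each engine's most recent sample against the 10-minute interval and flag any engine that has fallen too far behind. The sketch below shows only that comparison. How the latest timestamps are fetched (for example, from the external SQL server) is left out, because the table layout is not known here. The engine names and the threshold of three missed intervals are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def stale_engines(last_sample, now, interval=timedelta(minutes=10), missed=3):
    """Return the names of engines whose newest sample is too old.

    last_sample maps an engine name to the datetime of its latest sample.
    An engine counts as stale once more than `missed` polling intervals
    have passed without new data.
    """
    limit = interval * missed
    return sorted(name for name, ts in last_sample.items() if now - ts > limit)


if __name__ == "__main__":
    now = datetime.now()
    # Hypothetical data; in practice these would come from the database.
    samples = {
        "engine-a": now - timedelta(minutes=8),
        "engine-b": now - timedelta(minutes=55),
    }
    for name in stale_engines(samples, now):
        print(f"ALERT: {name} has not stored data for over 30 minutes")
```

Run from a scheduled task every few minutes, a check like this could send a mail or trigger the restart instead of just printing.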
I don't know how many support tickets I have opened on this issue. SW should have all data and everything they'd possibly want to be able to resolve this.