I have written feature requests asking for things like the ability to report a non-working poller, but after a few failures with some services and pollers, I really think the program needs some way to monitor itself. Sure, there are things you can watch with APM, such as down services. But in my case I have services that run but do not work.
I am not sure how this can be done. I would imagine that, with all the stats in the DB, it could be based on those: pollers not updating, jobs not being completed. But I think we really need a canned health monitor set for Orion itself.
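To make the idea a bit more concrete, here is a rough Python sketch of the kind of staleness check I mean. It is not anything Orion ships with: the function name, the poller names, the timestamps, and the 10-minute threshold are all made up for illustration, and the real last-update values would have to come from whatever the DB actually records.

```python
from datetime import datetime, timedelta


def find_stale_pollers(last_updates, now, max_age=timedelta(minutes=10)):
    """Return the sorted names of pollers that look stale.

    last_updates maps a poller name to the datetime of its last update,
    or to None if it has never reported (hypothetical input shape).
    A poller is stale if it never reported or its last update is older
    than max_age relative to now.
    """
    return sorted(
        name
        for name, last_seen in last_updates.items()
        if last_seen is None or now - last_seen > max_age
    )


if __name__ == "__main__":
    # Made-up sample data, only to show how the check would be called.
    now = datetime(2024, 1, 1, 12, 0)
    sample = {
        "poller-a": now - timedelta(minutes=2),   # updated recently
        "poller-b": now - timedelta(minutes=45),  # service runs, but no updates
        "poller-c": None,                         # never reported
    }
    print(find_stale_pollers(sample, now))
```

The same comparison would apply to jobs that never complete: swap the last-update time for the last-completed time and flag anything past the threshold.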
There are just too many holes in the system right now; I basically have to check it manually every day at this point.