This past week I have seen a lot of cases where my application monitors have been going haywire. By haywire I mean a period where everything goes down, then unknown, then up. Rinse and repeat for a few hours. Last night I rebooted the server, and that fixed the issue. One other thing that happens: when I re-poll nodes that are on the additional poller, I get a lot of "failed to poll" messages.
One thing I am wondering about is the location of the additional poller. Our main server is in the US, and we have an additional poller in the UK that covers all of the EU, the Middle East, Asia and Australia (small presence in the Pacific Rim).
I want to look into regionalizing our config, since I feel that having one UK poller cover all of those regions is contributing to that down/unknown/up cycle.
Sorry if my thoughts are jumbled; lack of sleep does that to a person.