A few months ago, we purchased and installed a second polling engine in Calgary. Our primary server is in Toronto. Immediately after installing, I found that the polling engine in Calgary kept stopping, and no matter what I tried, it would run for a few minutes and then begin a pattern of stopping and starting on its own. The first thing I did was create a trouble ticket with SolarWinds, and I have tried everything they asked, even re-installing, but so far I have had no success in curing this.
Today, I was informed that the development team has found nothing and now indicates that the problem must be in the environment. First, that is an unacceptable response, tantamount to saying "we don't know, so it must be your problem" - which is awfully close to my all-time favourite response from every user: the problem is unknown, so it must be a network problem!
Has anyone else seen this issue, and do you have any suggestions?
Things I've done:
Updated all files - running 9.5.1 now.
Increased ICMP and SNMP time-out values.
Ran diagnostics and forwarded to SW.
Turned on advanced debugging, re-ran diagnostics and forwarded to SW.
Synchronized time zones because apparently if the server clocks are off by five minutes, database synchronization errors cause undetermined problems.
Decreased rediscovery time on second poller.
Hosted a web meeting so that development could manipulate the polling engines directly.
Installed a patched SNMP library to check for socket errors, callbacks to the Standard Poller, request ID generation, and recognition of TLV blocks.
Reduced CPU usage to a single core.
After each step, I ran diagnostics and forwarded the results of the advanced polling log (turning that feature on and off each time). So now I'm asking for your help, as SolarWinds seems to have run out of ideas.
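For anyone who wants to rule out the clock issue mentioned in the list above, here is a minimal Python sketch of how the two servers' clocks could be compared. It is not a SolarWinds tool. The five-minute limit is simply the figure quoted to me, the function names are made up for illustration, and the two timestamps are hypothetical readings rather than values from my servers. The point it illustrates is that the readings have to be compared as absolute instants (UTC), since Toronto and Calgary are two time zones apart.

```python
from datetime import datetime, timedelta, timezone

# Assumed limit: the five-minute figure quoted in the post.
MAX_SKEW = timedelta(minutes=5)


def clock_skew(reading_a: datetime, reading_b: datetime) -> timedelta:
    """Return the absolute difference between two timezone-aware readings."""
    if reading_a.tzinfo is None or reading_b.tzinfo is None:
        raise ValueError("readings must be timezone-aware")
    # Normalise both readings to UTC so the zone offset is not counted as skew.
    return abs(reading_a.astimezone(timezone.utc) - reading_b.astimezone(timezone.utc))


def skew_exceeds_limit(reading_a: datetime, reading_b: datetime,
                       limit: timedelta = MAX_SKEW) -> bool:
    """True when the two clocks differ by more than the allowed limit."""
    return clock_skew(reading_a, reading_b) > limit


# Fixed standard-time offsets, used here to avoid needing a time zone database.
EASTERN = timezone(timedelta(hours=-5))   # Toronto
MOUNTAIN = timezone(timedelta(hours=-7))  # Calgary

# Hypothetical readings taken from each server at the same moment.
toronto = datetime(2010, 1, 15, 12, 0, 0, tzinfo=EASTERN)
calgary = datetime(2010, 1, 15, 10, 6, 30, tzinfo=MOUNTAIN)

print(clock_skew(toronto, calgary))          # 0:06:30
print(skew_exceeds_limit(toronto, calgary))  # True
```

In this made-up case the local times look two hours and six and a half minutes apart, but the real skew is only the six and a half minutes, which would still be over the limit.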