I have noticed that SolarWinds is at times unable to poll a device, and then produces a false positive as a result. I was OK with this when it was just TimeSkew that could not be polled, or TimeSkew reporting down on a device. Generally, within the next couple of polling cycles (every 120 seconds), TimeSkew recovers and reports back just fine.
Last night I received a notification from SolarWinds that SP-2 on the NetApp filer was down; this lasted for over 20 minutes. My first instinct was to check the NetApp. But wait a minute: if the NetApp had had a failure, it would have called home, and I would have received a panic alert from the filer and a phone call from NetApp. I looked at the NetApp logs, and absolutely nothing happened during the time frame in which SolarWinds was reporting the device down.
The SPs on the NetApp are listed under Unknown devices, and I am polling them with ICMP only. This monitoring crosses VLANs: almost all the other devices are on the same "management" VLAN, along with the Orion and Orion SQL server.
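To tell a real network drop from a polling problem, one option is to log an independent ping from the Orion server (or another host on the management VLAN) and compare the timestamps with the SolarWinds alert. Below is a minimal sketch of that idea, not anything SolarWinds provides: the function names, the 10-second interval, and the host address in the usage note are all placeholders I made up, and it relies on the operating system's own `ping` command.

```python
import platform
import subprocess
import time


def ping_once(host, timeout_s=2):
    """Send one ICMP echo using the OS ping tool; True if ping exits with 0.

    Assumption: Windows ping syntax on Windows, Linux (iputils) syntax
    elsewhere. Windows ping can exit 0 on some "unreachable" replies, so
    treat the result as a rough indicator only.
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def find_outages(samples, min_seconds=0):
    """Group consecutive failed samples into (first_fail, last_fail) windows.

    samples is a list of (timestamp_seconds, ok) pairs in time order.
    Windows shorter than min_seconds are ignored.
    """
    outages = []
    start = last = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last = ts
        elif start is not None:
            if last - start >= min_seconds:
                outages.append((start, last))
            start = last = None
    # Close a window that is still open at the end of the log.
    if start is not None and last - start >= min_seconds:
        outages.append((start, last))
    return outages


def monitor(host, interval_s=10, count=360):
    """Ping host count times, interval_s apart, and return the samples."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), ping_once(host)))
        time.sleep(interval_s)
    return samples
```

For example, `find_outages(monitor("192.0.2.10"), min_seconds=120)` (with the SP's real address in place of the placeholder) would list only the gaps that lasted at least one polling cycle. If this log shows no gap at the time SolarWinds reported SP-2 down, that would point toward the poller rather than the network path between VLANs.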
I have the Orion Platform loaded on a single VM (small site), running NPM, NetFlow, APM, Config Manager, IPAM, and Virtualization.
Is it a resource issue? I am slowly growing the monitoring environment in VMware. I can throw more resources at the Orion server, but it appears to be working just fine at this time.
Any suggestions? Ideas?