Many years ago the groups on our overview map would go yellow and then clear, as SolarWinds had problems polling the nodes.
I reported this to the network engineers, who dismissed it as a SolarWinds polling issue, and I got steadily more frustrated.
It started happening more frequently, and sometimes bits of our network would show RED as SolarWinds flagged odd devices as being 'down'.
I reported this to the network engineers, who again dismissed it as a SolarWinds polling issue, and I got even more frustrated.
It got worse and worse until someone really looked at the problem and realized that a network policy device thought to be capable of 40 Gbps throughput was only capable of 20% of that (about 8 Gbps) in the function/mode we were using it in, and the campus network was teetering on the edge of a meltdown...
SOOOO.... my suggestion here is to carefully map out the network path between NPM and the devices giving false positives, to see if there is something common on the path that might not be performing quite as well as you hoped and is not reporting its imminent demise.
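One rough way to do that comparison, if you can collect a traceroute from the poller to each device: list the hops per device and count which hops show up on the paths to the flapping nodes but not on the paths to the healthy ones. The Python below is only a sketch of that idea. The function name, node names, and hop addresses are all made up for illustration, and it assumes you paste in the hop lists yourself rather than pulling anything from NPM.

```python
from collections import Counter


def rank_suspect_hops(flapping_paths, healthy_paths=None):
    """Rank hops by how many flapping-node paths cross them.

    Both arguments map a node name to the ordered list of hop addresses
    between the poller and that node (e.g. copied from traceroute output).
    Returns (hop, flapping_count, healthy_count) tuples, most suspicious first.
    """
    healthy_paths = healthy_paths or {}
    # Count each hop once per path, even if a path lists it twice.
    flapping = Counter(hop for hops in flapping_paths.values() for hop in set(hops))
    healthy = Counter(hop for hops in healthy_paths.values() for hop in set(hops))
    ranked = [(hop, count, healthy[hop]) for hop, count in flapping.items()]
    # Most flapping paths first; among ties, fewest healthy paths first.
    ranked.sort(key=lambda row: (-row[1], row[2], row[0]))
    return ranked


# Hypothetical hop lists; replace with real traceroute results.
flapping_paths = {
    "sw-bldg-a": ["10.0.0.1", "10.0.5.1", "10.0.9.2"],
    "sw-bldg-b": ["10.0.0.1", "10.0.5.1", "10.0.7.4"],
}
healthy_paths = {
    "sw-dc-1": ["10.0.0.1", "10.0.2.1"],
}

for hop, bad, good in rank_suspect_hops(flapping_paths, healthy_paths):
    print(f"{hop}: on {bad} flapping path(s), {good} healthy path(s)")
```

A hop that sits on every flapping path and on no healthy path is the first thing to look at. A hop that is on every path, healthy or not, is less likely to be the culprit on its own.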
[also: what are your VM resources, and where is the database located, in case it really is the VM server?]
Database is on its own VM
Orion VMware resources:
20 GB RAM
200 GB storage/hard drive
Orion SQL VMware resources:
6 GB RAM
600 GB storage/hard drive
I am in charge of everything ... so I don't have to contend with Network naysayers or Server naysayers .. it is on me!!! Do you think crossing the VLANs might be the issue? It kind of jibes with the Time Skew issue I see. I have all networking equipment managed / polled via a management VLAN. The Data Center and "servers" are on a different VLAN.
I do appreciate your feedback! Every push helps!
It should not be -- I monitor equipment across thousands (4954) of VLANs. I would look at the router that connects your VLANs first to see if it is having problems.
My database server is physical -- I'll leave discussion of that to someone who might have more input on that.
So ... Data Center to Core Switch - 10 G backplane - error free - tight config (really proud of the Network - SolarWinds tells me so) ..... so back to the SQL server ... looking at the stats I sent you, would you add additional resources to the SQL server? I noticed right away when I moved the Orion Application Server (Physical to Virtual) that I had to throw more resources at it. Now it performs so much better than it did on a physical box. Whaduthink? I am definitely not a SQL or database person, but I do have their support when I ask! I don't see the situation surface often ... I HATE FALSE POSITIVES! I know I can make it better! Really proud of the SolarWinds Deployment - my site is ROCKING! ... 'cept for a slight SW glitch every now and then!! I don't see this issue building up like the runaway roller coaster you described above.
Thanks again for your input!