0 Replies Latest reply on Jun 26, 2018 4:15 PM by kwilsonhisd

    Ping problems after upgrading

    kwilsonhisd

      We're seeing some very odd issues with NPM here. First, a bit about our environment:

      Over 7600 nodes
      5 additional pollers plus the main poller
      Server 2016 main poller:
      VMware virtual server with 24 CPUs and 64 GB RAM
      NPM 12.3
      UDT 3.3.1
      NCM 7.8
      VNQM 4.5
      IPAM 4.7
      NTA 4.2.3

       

      Database server: Server 2016 with SQL 2016

       

      We upgraded all the applications on 6/15, and on 6/22 the server hung. After we reset it, many nodes began showing as down, even though they responded to pings from another machine. We then realized that ping itself is timing out or not working on the main poller. If we stop the SolarWinds services, ping starts working again. We tried to isolate an individual service as the cause but couldn't find the culprit. The Job Engine service seems the most likely suspect, but stopping it by itself doesn't get ping working again, and we need it running regardless.
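
      For anyone wanting to repeat the isolation test, here is a minimal sketch of how it could be scripted. It is not a SolarWinds tool. Because stopping a single service on its own did not help, it stops services cumulatively and reports the point at which ping recovers. The function names, the service list, and the stop/start callables are all assumptions made for illustration; the real service names would have to be filled in by whoever runs it.

```python
import platform
import subprocess
from typing import Callable, List, Optional, Sequence, Tuple


def ping_ok(host: str, count: int = 2, timeout_s: int = 30) -> bool:
    """Return True if the system ping command exits with status 0.

    Caveat: on Windows, ping can exit with 0 even when the replies are
    "Destination host unreachable", so treat a True result as a hint only.
    """
    # Windows ping uses -n for the echo count; Unix-like pings use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, str(count), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def stop_until_ping_recovers(
    services: Sequence[str],
    stop: Callable[[str], None],
    start: Callable[[str], None],
    check: Callable[[], bool],
) -> Tuple[Optional[str], List[str]]:
    """Stop services one by one, leaving earlier ones stopped.

    Returns (name of the service after whose stop check() first passed,
    list of services that were stopped). The first element is None if
    check() never passed. Every stopped service is started again, in
    reverse order, before returning.
    """
    stopped: List[str] = []
    recovered_after: Optional[str] = None
    try:
        for name in services:
            stop(name)
            stopped.append(name)
            if check():
                recovered_after = name
                break
    finally:
        # Restore the services even if stop() or check() raised.
        for name in reversed(stopped):
            start(name)
    return recovered_after, stopped
```

      On the poller, stop and start could be thin wrappers around the Windows `net stop` and `net start` commands, and check could be `lambda: ping_ok("some-known-good-host")`. If ping only recovers after several services are stopped, the returned list shows which combination was down at that point, which may be more useful here than testing each service in isolation.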