11 Replies Latest reply on May 4, 2010 2:48 PM by MarieB

    Very high ping times on NPM server

    btrotter

      In Orion NPM 9.5.1, I notice that many of our nodes are reporting latency times in the thousands of ms.

      If I ping any of those devices from my own computer, they respond within a few ms. Yet if I connect to the NPM server and ping from there, it reports back anything from 1 ms to 6000 ms. It is very random and sporadic. As far as we can tell, this only happens from the Orion server.

      It may just be a coincidence that it is the NPM server, or it could be something in the software causing it.

      Has anyone seen behavior like this before? I saw one post from someone who had a similar issue with Symantec AV, but I have Trend Micro installed, and I did try turning it off and testing again.

       

      Thank you.

        • Re: Very high ping times on NPM server
          ecklerwr1

          One thing to check is the status of the poller on your NPM box. From the web console, under Admin, look under Details -> Polling Engines and make sure your poller isn't overburdened. An overloaded poller is one thing that could cause a problem like this. Also look at the overall health of the server and the OS that NPM is running on.

            • Re: Very high ping times on NPM server
              btrotter

              This is what I have under Polling Engines (see below).

              The OS is Windows 2003 Server Enterprise x64 running SP2. The hardware is an HP ProLiant BL465c G1 with 8 GB of RAM and four 2.8 GHz cores. It should have plenty of power. Task Manager shows slight CPU spikes up to 10%, but most of the time the box is barely breathing.

               

              Network Performance Monitor Polling Engines

              Last Database Update Now 

              Web Engine Running Since 5/3/2010 1:53:53 PM   * testing purpose only 

               

              Polling Engine on CLTAAPD3 

              Engine Status   Polling Engine Active 

              Type of Polling Engine Primary 

              Polling Engine Version Engine Version 9.5.1 - SolarWinds Orion Network Performance Monitor v9.5.1 

              IP Address 10.42.10.24 

              Last Restart 4/30/2010 1:54:47 PM 

              Last Database Sync 1 second ago 

              Last Fail-Over Never 

              Elements 2242 

                • Re: Very high ping times on NPM server
                  ecklerwr1

                  The important part to look at looks like this on mine:

                  Polling Engine on :}
                  Engine Status     Polling Engine Active
                  Type of Polling Engine     Primary
                  Polling Engine Version     Engine Version 2010.1.0 - SolarWinds Orion Core Services 2010.1
                  IP Address     x.x.x.x
                  Last Restart     4/23/2010 11:42:46 AM
                  Last Database Sync     Now
                  Last Fail-Over     Never
                  Elements     246
                  Network Node Elements     28
                  Interface Elements     210
                  Volume Elements     8
                  Date Time     5/3/2010 11:09:27 AM
                  Paused     False
                  ICMP Status Polling Index     246 out of 246
                  SNMP Status Polling Index     246 out of 246
                  ICMP Status Polls per second     0
                  SNMP Status Polls per second     0
                  Max Status Polls Per Second     30
                  DNS Outstanding     0
                  ICMP Outstanding     0
                  SNMP Outstanding     11
                  ICMP Statistic Polling Index     473 out of 473
                  SNMP Statistic Polling Index     473 out of 473
                  ICMP Statistic Polls per second     0
                  SNMP Statistic Polls per second     5.5
                  Max Statistic Polls Per Second     30

                    • Re: Very high ping times on NPM server
                      btrotter

                      Oops, sorry, I meant to paste that part.

                       

                      Network Performance Monitor Polling Engines

                      Last Database Update Now 

                      Web Engine Running Since 5/3/2010 1:53:53 PM   * testing purpose only 

                       

                      Polling Engine on CLTAAPD3 

                      Engine Status   Polling Engine Active 

                      Type of Polling Engine Primary 

                      Polling Engine Version Engine Version 9.5.1 - SolarWinds Orion Network Performance Monitor v9.5.1 

                      IP Address 

                      Last Restart 4/30/2010 1:54:47 PM 

                      Last Database Sync 1 second ago 

                      Last Fail-Over Never 

                      Elements 2242 

                       

                      Network Node Elements 457 

                      Interface Elements 952 

                      Volume Elements 833 

                      Date Time 5/3/2010 2:02:48 PM 

                      Paused False 

                      ICMP Status Polling Index 2242 out of 2242 

                      SNMP Status Polling Index 2242 out of 2242 

                      ICMP Status Polls per second 0.5 

                      SNMP Status Polls per second 3 

                      Max Status Polls Per Second 30 

                      DNS Outstanding 0 

                      ICMP Outstanding 0 

                      SNMP Outstanding 24 

                      ICMP Statistic Polling Index 3540 out of 3540 

                      SNMP Statistic Polling Index 3540 out of 3540 

                      ICMP Statistic Polls per second 0 

                      SNMP Statistic Polls per second 9 

                      Max Statistic Polls Per Second 30 
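As a rough check of the numbers above, the polling rates can be compared against the configured maximums. This is only a sketch: the function name `poll_utilization` is made up for illustration, and summing the ICMP and SNMP rates against the single "Max ... Polls Per Second" figure is an assumption about how that limit is applied, not something confirmed in this thread.

```python
def poll_utilization(icmp_per_sec: float, snmp_per_sec: float, max_per_sec: float) -> float:
    """Return the fraction of the configured maximum polling rate in use.

    Assumption: ICMP and SNMP polls count against the same maximum.
    """
    if max_per_sec <= 0:
        raise ValueError("max_per_sec must be positive")
    return (icmp_per_sec + snmp_per_sec) / max_per_sec


# Values copied from the Polling Engines page pasted above.
status_load = poll_utilization(0.5, 3, 30)
statistic_load = poll_utilization(0, 9, 30)

print(f"status polling:    {status_load:.0%} of max")
print(f"statistic polling: {statistic_load:.0%} of max")
```

Under that assumption both rates sit well below the limit of 30, which would suggest the poller is not the bottleneck.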

                       

                        • Re: Very high ping times on NPM server
                          ecklerwr1

                          Your poller looks pretty good considering how many elements you have on it... the next thing:

                          Are you running your SQL server on the same box or a separate one? Since your issue seems to be intermittent, the next thing I would look at is running perfmon on the server NPM is running on. Let it run for a while and watch things like the disk queue length to see if the hardware is straining at the point where your ms response time jumps through the roof.

                            • Re: Very high ping times on NPM server
                              btrotter

                              SQL is running on a separate server. 

                              I guess when I say the problem is intermittent, I mean that the ping times aren't consistent.

                              If I run "ping servername -t", the first 2 pings will come back at 5ms, then the next 10 will be at 5000ms, then the next 1 will be 5ms, then next 30 will be 3500ms, etc etc.

                              I have updated the NIC drivers on the server, thinking that might have something to do with it, and checked the network port on the switch for errors. It just doesn't make sense unless it is a flaky network card.
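To put numbers on a pattern like the one described above, the output of "ping servername -t" can be saved to a file and summarized. The following is a minimal sketch, assuming the English-language Windows reply format ("time=5ms" or "time<1ms"). The function names, the 1000 ms threshold, and the sample text are all hypothetical and chosen only for illustration; the sample is not real output from this server.

```python
import re
from typing import Dict, List

# Matches "time=5ms" and "time<1ms" in English-locale Windows ping replies.
TIME_RE = re.compile(r"time[=<](\d+)ms")


def parse_ping_times(output: str) -> List[int]:
    """Extract round-trip times in ms from ping output text."""
    return [int(m.group(1)) for m in TIME_RE.finditer(output)]


def summarize(times: List[int], threshold_ms: int = 1000) -> Dict[str, int]:
    """Count replies and how many were at or above the threshold."""
    slow = [t for t in times if t >= threshold_ms]
    return {
        "replies": len(times),
        "slow": len(slow),
        "max_ms": max(times, default=0),
    }


# Illustrative sample only (made-up address and values).
SAMPLE = """\
Reply from 192.0.2.10: bytes=32 time=5ms TTL=128
Reply from 192.0.2.10: bytes=32 time=5ms TTL=128
Reply from 192.0.2.10: bytes=32 time=5000ms TTL=128
Reply from 192.0.2.10: bytes=32 time=3500ms TTL=128
"""

print(summarize(parse_ping_times(SAMPLE)))
```

Running the same summary on captures taken from the NPM server and from a client machine would make it easier to compare the two.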

                                • Re: Very high ping times on NPM server
                                  ecklerwr1

                                  So I assume this is why the response time graphs in NPM show high ms response times for all your nodes. Also, when you ping the NPM server from your client machine with the -t option, do the ping times rise sporadically as well? I would pretty much assume they would. It definitely sounds like the issue is with the server NPM is running on, and that you've narrowed it down to that. One way to separate the NPM processes from the server hardware underneath would be to stop all of the NPM services and then ping -t the NPM server from your client. If the ping times from your client to the NPM server still rise even when the services aren't running, that would rule out the NPM services as part of the problem (i.e. something like the poller being overloaded). I hope that makes sense.

                      • Re: Very high ping times on NPM server

                        When you run "ping -t" from the command line, is there an actual five-second pause when it reports 5000, or does the reply arrive at the same interval as the ones with smaller numbers?

                        • Re: Very high ping times on NPM server
                          btrotter

                          I wanted to give an update on this. We were able to fix the problem, and I figured I would post our fix here in case anyone ever has the same issue. It had to do with the IP stack in Windows.

                          We ran the following command to fix it:

                          netsh int ip reset c:\resetlog.txt

                           

                           

                           

                            • Re: Very high ping times on NPM server
                              ecklerwr1

                              Btrotter-

                              Excellent, so glad to hear you got the problem resolved and that it didn't require too much. Apparently the command resets a couple of registry keys and has the effect of reinstalling the protocol.

                              The reset command is available in the IP context of the NetShell utility. Follow these steps to use the reset command to reset TCP/IP manually:

                              1. To open a command prompt, click Start and then click Run. Copy and paste (or type) the following command in the Open box and then press ENTER:
                                cmd
                              2. At the command prompt, copy and paste (or type) the following command and then press ENTER:
                                netsh int ip reset c:\resetlog.txt
                                Note: If you do not want to specify a directory path for the log file, use the following command:
                                netsh int ip reset resetlog.txt
                              3. Reboot the computer.

                              When you run the reset command, it rewrites two registry keys that are used by TCP/IP. This has the same result as removing and reinstalling the protocol. The reset command rewrites the following two registry keys:

                              SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\ 
                              SYSTEM\CurrentControlSet\Services\DHCP\Parameters\
                                                  

                              To run the manual command successfully, you must specify a file name for the log, in which the actions that netsh takes will be recorded. When you run the manual command, TCP/IP is reset and the actions that were taken are recorded in the log file, known as resetlog.txt in this article.

                              The first example, c:\resetlog.txt, creates a path where the log will reside. The second example, resetlog.txt, creates the log file in the current directory. In either case, if the specified log file already exists, the new log will be appended to the end of the existing file.
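For anyone who wants to script the steps above, here is a minimal sketch that only builds the command line described in the article and refuses an empty log file name, since the article says a log file must be specified. The helper names `build_reset_command` and `run_reset` are hypothetical. Actually running the reset is left to a separate function that is not called here, because it changes the TCP/IP configuration, needs an administrative prompt on Windows, and must be followed by a reboot.

```python
import subprocess
from typing import List


def build_reset_command(log_file: str = r"c:\resetlog.txt") -> List[str]:
    """Build the 'netsh int ip reset <log>' argument list from the steps above."""
    if not log_file.strip():
        # The article states that a log file name is required.
        raise ValueError("a log file name is required")
    return ["netsh", "int", "ip", "reset", log_file]


def run_reset(log_file: str = r"c:\resetlog.txt") -> int:
    """Run the reset and return netsh's exit code.

    Windows only; run from an administrative prompt and reboot afterwards.
    """
    return subprocess.run(build_reset_command(log_file), check=False).returncode


# Print the command instead of running it.
print(" ".join(build_reset_command()))
print(" ".join(build_reset_command("resetlog.txt")))
```

Passing only a file name, as in the second call, corresponds to the article's second example, where the log is written to the current directory.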