8 Replies Latest reply on Jun 30, 2016 5:21 PM by rschroeder

    Diagnosing a Transient Network Issue via VPN

    cbeavs

      Hello! We are new to the NPM family (been using for a couple months). Very much enjoying the product, and loving the new UI. Now I am trying to use it for our most puzzling problem yet.

       

      We have a VPN tunnel to a remote site using a Cisco 1921 at the remote end. NPM shows no issues with CPU/Memory/Bandwidth -- it's all very low utilization. However, my users report that their connection drops every 30 to 45 minutes for 30 seconds to a minute. That's enough to lose a ping packet or end a VoIP call. What is my best approach for using NPM to narrow this down, or at least to be notified when it is happening? I was reading up on shortening the polling intervals to see if that would work, but I didn't know if there are any advanced settings that would be helpful.

       

      Thanks so much!

      -Chris

        • Re: Diagnosing a Transient Network Issue via VPN
          john.ta

          I don't think SolarWinds will be able to do much for you in this instance if it is reporting everything as working fine.  My recommendation would be to set up an IP SLA tracker on the remote router and see whether it is really having connectivity issues.
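
          If you want to confirm the drops independently of NPM, here is a minimal Python sketch of the idea: probe once a second from any box, keep (timestamp, success) pairs, and group consecutive failures into outage windows. The names Outage, find_outages and min_missed are made up for illustration; nothing here is an NPM or IOS feature, and how you collect the samples (ping, TCP connect, IP SLA history) is up to you.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Outage:
    start: float  # timestamp of the first failed probe in the gap
    end: float    # timestamp of the first successful probe after the gap
    missed: int   # number of consecutive failed probes

    @property
    def duration(self) -> float:
        return self.end - self.start


def find_outages(samples: Iterable[Tuple[float, bool]],
                 min_missed: int = 2) -> List[Outage]:
    """Group consecutive failed probes into outage windows.

    samples: (timestamp_seconds, probe_succeeded) pairs in time order.
    min_missed: gaps shorter than this many probes are treated as noise.
    """
    outages: List[Outage] = []
    start: Optional[float] = None
    missed = 0
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts  # first failure opens a candidate outage
            missed += 1
        elif start is not None:
            if missed >= min_missed:
                outages.append(Outage(start, ts, missed))
            start, missed = None, 0
    # An outage still in progress when the samples end is not reported.
    return outages


# Illustrative data only: one probe per second, three missed in a row.
samples = [(0, True), (1, True), (2, False), (3, False),
           (4, False), (5, True), (6, False), (7, True)]
for o in find_outages(samples):
    print(f"outage at t={o.start}s, {o.missed} probes missed, ~{o.duration}s long")
```

          With one probe per second, a 30-second drop shows up as roughly 30 missed probes, and the start times tell you whether the 30 to 45 minute pattern is real.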

          • Re: Diagnosing a Transient Network Issue via VPN
            rschroeder

            Is NetPath a good option for you?  Imagine having a VPN path that passes through multiple providers, and one of them has issues.  Knowing you've got packet loss isn't enough data to show where the problem is, but NetPath seems like it would be the tool to show where the issue lies.

              • Re: Diagnosing a Transient Network Issue via VPN
                cbeavs

                I did try setting up NetPath by using a probe at the remote network to go out to the Internet, but I can't get NetPath to show my complete path. I get a better look using tracert: after my firewall there is a timeout at the ISP router, but from there on I see all the other routers on the way to Google. None of them show up in NetPath.

                  • Re: Diagnosing a Transient Network Issue via VPN
                    rschroeder

                    There are a few caveats for 100% successful NetPath deployment, particularly if going through firewalls and discovering external providers' nodes.  If it's not working as needed, you may find it helpful to route the probe outside the VPN tunnel: from your NPM server, through your firewall, to the outside interface of the remote VPN box.  This way you'll discover all the intermediate hops between the VPN end points, which are otherwise invisible to NetPath when passing inside the VPN pipe.

                     

                    Also, NetPath requires a few additional firewall ports to be opened:

                     

                    I recently had an issue where users of a VPN were experiencing high latency and 24% packet loss.  NetPath discovered problems between two intermediate providers several hops away from the ISPs on both sides of the VPN.  Now Qwest/Century-Link/ATT/Cogent/CCI are working on the issue for me, instead of me spending cycles with frustrated users who aren't able to reliably transfer files through a VPN.  NetPath really provided the details needed to identify the exact nodes that are dropping packets, many hops away from the two VPN ends.

                     

                    If manipulating your firewall to accommodate NetPath's basic needs isn't an option, make sure you're using a discovery port that's open through your firewall to the outside of the remote VPN termination.

                     

                    If NetPath isn't able to help you, something like PingPlotter may be useful.  It can display information in graphic or text formats, and might be the workaround to find and prove problems with intermediate nodes, as I did with NetPath.

                      • Re: Diagnosing a Transient Network Issue via VPN
                        cbeavs

                        Thanks rschroeder! That is good info. So I need to allow ICMP type 11 (Time Exceeded) through my firewall for return packets to my NetPath probe?

                          • Re: Diagnosing a Transient Network Issue via VPN
                            rschroeder

                            It's a recommended solution, but may not be required if you use the right port and protocol. For example, in my instance with the VPN problem, I selected port 22 (SSH) and changed no other firewall rules.  But I DID choose to discover the path to the remote VPN appliance's external address instead of an internal one, and I DID force the routing to go out a different firewall--one that shares the same external subnet for my organization, which kept the comparison apples-to-apples.  No special rules for ICMP 11 or TCP 43 or 17778 had to be created just for this.  I've not done packet captures on the outside of that second firewall to verify that these packets/protocols are NOT in play--I can only say I didn't have to build anything new on the firewall to get NetPath running successfully with port 22.
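
                            If you want to sanity-check a candidate port before pointing NetPath at it, a rough sketch like the one below times a plain TCP connect from the probe machine. This is not how NetPath itself probes; it only tells you whether the port answers through your firewall and roughly how long the handshake takes. The function name tcp_connect_ms is hypothetical, and 203.0.113.10 in the usage note is a documentation address standing in for the remote VPN box's outside interface.

```python
import socket
import time
from typing import Optional


def tcp_connect_ms(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None on failure.

    Failure covers timeouts, refused connections and name-resolution
    errors, all of which Python raises as OSError subclasses.
    """
    started = time.perf_counter()
    try:
        # create_connection performs the full TCP three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - started) * 1000.0
    except OSError:
        return None
```

                            Calling tcp_connect_ms("203.0.113.10", 22) once a second and logging the None results gives a crude loss and latency record for that one port.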

                              • Re: Diagnosing a Transient Network Issue via VPN
                                cbeavs

                                I did end up having to open that port to see the full path. But it is working. Thank you so much!

                                 

                                I have seen some anomalies with the ISP, but not enough evidence yet to give them. I see 378 ms of latency from their last router to mine on the SSH port. However, I know there is a gateway router of theirs in the middle, and it doesn't even show up on a traceroute. So I can't tell if that 378 ms is just my router (CPU and memory look good) or if there is an issue with their gateway router in between that NetPath can't find.
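
                                To make the ambiguity concrete, here is a small Python sketch of how I am reading the numbers: take the round-trip time to each hop, skip hops that never answer, and look at how much latency each responding hop adds over the previous one. The function name hop_deltas is made up, and apart from the 378 ms figure the RTT values are invented for illustration.

```python
from typing import List, Optional, Tuple


def hop_deltas(rtts_ms: List[Optional[float]]) -> List[Tuple[int, int, float]]:
    """Return (from_hop, to_hop, added_ms) between consecutive responding hops.

    rtts_ms[i] is the round-trip time to hop i+1, or None if that hop
    did not answer. Silent hops are skipped, so whatever delay they add
    is folded into the delta of the next hop that does respond.
    """
    deltas: List[Tuple[int, int, float]] = []
    prev_hop, prev_rtt = 0, 0.0  # hop 0 is the probe itself
    for hop, rtt in enumerate(rtts_ms, start=1):
        if rtt is None:
            continue
        deltas.append((prev_hop, hop, rtt - prev_rtt))
        prev_hop, prev_rtt = hop, rtt
    return deltas


# Hop 3 (the ISP gateway in this made-up trace) never answers.
trace = [2.0, 15.0, None, 378.0]
for src, dst, added in hop_deltas(trace):
    print(f"hop {src} -> hop {dst}: +{added:.1f} ms")
```

                                The jump lands on the span from hop 2 to hop 4, which covers both the silent gateway and my own router, so this data alone can't say which of the two is responsible.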

                                  • Re: Diagnosing a Transient Network Issue via VPN
                                    rschroeder

                                    Congratulations!  Nicely done!

                                     

                                    I've already been able to use NetPath to prove to intermediary providers they have a latency and packet loss issue between two routers that were previously invisible to me, when I didn't have NetPath.  Both providers had been passing the buck, pointing their finger at the other one.  NetPath nailed down the specific details with beautiful graphics and irrefutable statistics.

                                     

                                    Without it I'd still be fighting my way down Frustration Lane, with the customer's traffic backed up or lost.

                                     

                                    NetPath is our friend.