Where’d the Website Go?

Why NPM and NetPath?  Learn what happens when people get the dreaded “Can’t reach this page” error and can no longer perform their job functions…and learn how NetPath aided a team of network and security experts in finding and solving this problem.

Greetings, all.  My name is Alex Sheppard and I’ve been around computers, phones, electronics and things that go “beep, whirr, poing!” in the night for going on about 45 years now.  I have been a member of the THWACK community for almost three years and am honored to have been nominated for MVP.  As MVPs, we are always looking for ways to help others and shed some good light on the wonderful products from SolarWinds.  Well, as it turns out, in the wake of the recent COVID-19 pandemic, our company made some changes, and with them came some challenges.  But SolarWinds’ NetPath came running and helped to bail us out in short order!

Our Monitoring Infrastructure

Some of you might know that when I started with my company almost 10 years ago, I quasi-inherited the SolarWinds Orion system that we are still using to this day.  We have Network Performance Monitor (NPM), NetFlow Traffic Analyzer (NTA), Network Configuration Manager (NCM), and Network Topology Mapper (NTM) installed at present.  In my opinion, NetPath, which is part of the NPM product, is like the eighteen-pound sledgehammer of the networking world.  I’ll explain why in a minute.

Originally, we installed the Orion platform products to help identify issues and improve our network’s performance, as well as monitor and alert on “all the things” (with deference to @adatole ;-).  As many others have recently, our company instituted a “work-from-home” policy, to help combat the spread of COVID-19.  In the days leading up to the implementation of this policy, our IT department had been working diligently on deploying a new “always-on” VPN solution.  And while we had done our very best at testing everything, one business-critical site disappeared one day.  Now, of course, the site was still there but we couldn’t see it.

The Problem

On Tuesday, March 17, I began seeing IMs flying fast and furious about a team of folks not being able to see an external site that was important to their work.  Since I was working from home and using the new VPN, I was able to assist in the troubleshooting.  As is typical, we did the first thing everyone should do – try to replicate the problem.  The first thing we saw was:

netpath.png

Pretty standard “can’t reach this page” error.  Could be DNS, could be the web server, could be the network, could be nearly anything – but at least we validated it wasn’t an isolated incident.
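For anyone who wants to narrow down "DNS, web server, or network" by hand, here is a minimal sketch in Python that checks each layer in order and reports the first one that fails.  This is not what we ran that day; the function name `diagnose` and its labels are made up for illustration, and it only uses the Python standard library.

```python
import http.client
import socket


def diagnose(host, port=443, timeout=5.0):
    """Return a short label naming the first layer that fails (hypothetical helper)."""
    # Step 1: DNS. Can the name be resolved at all?
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns: {exc}"

    # Step 2: network path. Can a TCP connection be opened to the port?
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError as exc:
        return f"network: {exc}"

    # Step 3: web server. Does it answer a plain GET request?
    # Assumption for this sketch: port 443 means HTTPS, anything else plain HTTP.
    if port == 443:
        conn = http.client.HTTPSConnection(host, port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException) as exc:
        return f"web server: {exc}"
    finally:
        conn.close()

    if status >= 400:
        return f"web server: HTTP {status}"
    return f"ok: HTTP {status}"
```

Calling `diagnose("www.example.com")` from inside and outside the VPN and comparing the two answers is a quick way to confirm whether the problem follows the network path or the site itself.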

Troubleshooting

After a few other checks, we validated that the site was not down, as folks could reach it from outside our new VPN solution.  I got in touch with an analyst in our security group and he and I began to truly troubleshoot.  I fired up NetPath and created a service that gave visibility to the path that packets were taking to get to the site.  It turned out that traffic was stopping at an internal hop, and NetPath provided a clean, easy-to-understand description of the problem.

netpath1.jpg

This is exactly why I call NetPath the “eighteen-pound sledgehammer” in our troubleshooting arsenal.  With its seemingly magical ways of peering into nearly every device on a packet’s route, NetPath gives us the capability of knowing with great certainty where performance problems lie.  If the MPLS provider has a problem, we know about it.  If another downstream provider has an issue, we know about it.  To me, this tool alone is worth the price of admission!

Network Route Is Good, Keep Troubleshooting

After the BGP problem was addressed, NetPath also discovered that the Top-Level Domain (TLD) of the external site was being blocked by our firewall, so the security analyst took care of that quick, fast, and in a hurry.  Once all the phases of troubleshooting were in place, our IT teams worked quickly to tick all the boxes and get that team back up and running.  (I forgot to grab a screenshot for this error, but trust me, it was just as easy to understand.)

Importance of Good Tools

None of our troubleshooting technically required any specialized tools.  We could have found the BGP and firewall blocks in a more traditional way: via lookups, testing, and using the CLI.  But if you are facing a big problem, and need results quickly, maybe that eighteen-pound sledgehammer is just the tool you need.
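To give a flavor of the traditional route, here is a small Python sketch that reads the text output of a `traceroute -n` run and reports the last hop that answered, which is roughly the "where does the traffic stop?" question NetPath answered for us graphically.  The function name `last_responding_hop` and the sample output are invented for illustration (the addresses are private and documentation ranges, not our network), and the parsing assumes the common Linux `traceroute -n` layout.

```python
import re

# A hop line starts with the hop number, followed by replies or "*" timeouts.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")

# Made-up sample output for illustration only.
SAMPLE = """\
traceroute to site.example (203.0.113.10), 30 hops max, 60 byte packets
 1  10.0.0.1  1.2 ms  1.1 ms  1.0 ms
 2  10.0.5.1  3.4 ms  3.3 ms  3.5 ms
 3  * * *
 4  * * *
"""


def last_responding_hop(traceroute_output):
    """Return (hop_number, address) of the last hop that replied, or None."""
    last = None
    for line in traceroute_output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header line or anything else that is not a hop
        hop = int(match.group(1))
        # Drop the "*" markers that stand for probes with no reply.
        answers = [tok for tok in match.group(2).split() if tok != "*"]
        if answers:
            last = (hop, answers[0])
    return last


if __name__ == "__main__":
    print(last_responding_hop(SAMPLE))  # prints (2, '10.0.5.1')
```

Keep in mind that a silent hop does not always mean traffic stops there, since some devices simply do not reply to probes, so treat the result as a place to start looking rather than a verdict.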
