Is UX Monitoring the future of Network Monitoring?

Is User Experience (UX) monitoring going to be the future of network monitoring? I think that the changing nature of networking is going to mean that our devices can tell us much more about what’s going on. This will change the way we think about network monitoring.


Historically we’ve focused on device & interface stats. Those tell us how our systems are performing, but don't tell us much about the end-user experience. SNMP is great for collecting device & interface counters, but it doesn't say much about the applications.
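As a sketch of what those device & interface stats give us: the classic move is to poll a counter like ifInOctets on an interval and turn the deltas into a utilization figure. The counter values and interface speed below are made-up sample data, and the wrap handling assumes a 32-bit counter (ifInOctets wraps at 2^32; 64-bit HC counters exist too).

```python
# Sketch: turning raw SNMP interface counters into a utilization figure.
# Sample values are invented; in practice they'd come from polling
# ifInOctets (a 32-bit counter that wraps at 2^32) at a fixed interval.

COUNTER32_MAX = 2**32

def counter_delta(previous: int, current: int) -> int:
    """Octets transferred between two samples, handling 32-bit wrap."""
    if current >= previous:
        return current - previous
    return current + COUNTER32_MAX - previous

def utilization_pct(previous: int, current: int,
                    interval_s: float, if_speed_bps: float) -> float:
    """Percent utilization of an interface over the polling interval."""
    bits = counter_delta(previous, current) * 8
    return 100.0 * bits / (interval_s * if_speed_bps)

# Example: 5-minute poll of a 100 Mb/s interface.
pct = utilization_pct(previous=1_000_000, current=751_000_000,
                      interval_s=300, if_speed_bps=100_000_000)
print(f"{pct:.1f}% utilized")  # -> 20.0% utilized
```

Note what's missing: this tells you the pipe is 20% full, but nothing about whether users of the apps on that pipe are happy.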


NetFlow made our lives better by giving us visibility into the traffic mix on the wire. But it couldn't say much about whether the application or the network was the pain point. We need to go deeper into analysing traffic. We've done that with network sniffers, and tools like SolarWinds Quality of Experience help make it accessible. But we could only look at a limited number of points in the network. Typical routers & switches don't look deep into the traffic flows, and can't tell us much.


This is starting to change. The new SD-WAN (Software-Defined WAN) vendors do deep inspection of application performance. They use this to decide how to steer traffic. This means they’ve got all sorts of statistics on the user experience, and they make this data available via API. So in theory we could also plug this data into our network monitoring systems to see how apps are performing across the network. The trick will be in getting those integrations to work, and making sense of it all.
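To make that integration concrete: the work is mostly in translating each vendor's API response into one common record your monitoring system understands. Everything below is invented for illustration (the field names, the two payload shapes, the unit conventions), which is exactly the standardisation problem described here, but the shape of the code is realistic.

```python
# Sketch: normalizing per-application UX stats from two hypothetical
# SD-WAN vendor APIs into one common record format. All field names
# ("appPerf", "metrics", "rtt", ...) are invented for illustration --
# real vendor APIs each have their own schema.

def normalize_vendor_a(payload: dict) -> list[dict]:
    """Hypothetical vendor A: per-app latency/loss under 'appPerf'."""
    return [
        {"app": item["name"],
         "latency_ms": item["latencyMs"],
         "loss_pct": item["lossPercent"]}
        for item in payload["appPerf"]
    ]

def normalize_vendor_b(payload: dict) -> list[dict]:
    """Hypothetical vendor B: metrics keyed by application name."""
    return [
        {"app": app,
         "latency_ms": m["rtt"] / 2,          # B reports round-trip time
         "loss_pct": m["drop_ratio"] * 100}   # ...and loss as a ratio
        for app, m in payload["metrics"].items()
    ]

# Two fabricated API responses describing the same application:
a = {"appPerf": [{"name": "crm", "latencyMs": 40, "lossPercent": 0.5}]}
b = {"metrics": {"crm": {"rtt": 80, "drop_ratio": 0.005}}}

records = normalize_vendor_a(a) + normalize_vendor_b(b)
```

After normalization both vendors yield the same record for "crm" — the adapter layer is small per vendor, but someone has to write and maintain one per API.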


There are many challenges in making this all work. Right now each SD-WAN vendor has its own API and data exchange format. We don't yet have standardised measures of performance either. Voice has MOS, although there are arguments about how valid it is. We don't yet have an equivalent for apps like HTTP or SQL.
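For a sense of what a standardised measure looks like, MOS for voice is commonly derived from the ITU-T G.107 E-model: network impairments (delay, loss, codec) are rolled into a single rating factor R, and a standard formula maps R to a 1–4.5 MOS score. This is the kind of agreed-upon mapping that doesn't yet exist for HTTP or SQL.

```python
# Standard ITU-T G.107 E-model conversion from the transmission rating
# factor R (0..100) to an estimated MOS. How R itself is computed from
# delay/loss/codec impairments is the (much longer) rest of G.107.

def r_to_mos(r: float) -> float:
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)

print(round(r_to_mos(93), 2))  # the E-model's default R of ~93 -> MOS ~4.41
```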


Standardising around SNMP took time, and it can still be painful today. But I'm hopeful that we'll figure it out. How would it change the way you look at network monitoring if we could measure the user experience from almost any network device? Will we even be able to make sense of all that data? I sure hope so.

  • I figure that you need both. Synthetic transactions to provide a known baseline, and then real user data so that you can see what's going on across a wide range of end-users. Sometimes you don't have enough real user transactions to get meaningful data though; for low-traffic sites, real-user stats can be skewed by a few outliers.

  • As IT is viewed by CIOs and CTOs as a service department within their businesses, UX becomes increasingly important to them as well. Satisfaction with the services we deliver to our customers (whether external or internal) is going to become part of the key success indicators we are evaluated against. We must be willing to adapt, and to adopt those technologies that allow us to see more than just a green/red view of our systems. Slow is down, and user experience data is an important part of identifying the impact of a hardware/software failure or degradation.

    I think the greater question and debate is about synthetic user data vs. real user data.

  • The UX aspect of monitoring is just part of the puzzle.

    Yes you need it as well as the tools to get the complete picture.

    Over time you can correlate certain patterns to relate to specific conditions.

    Sometimes the first indicator of trouble is the user experience; sometimes it is hardware, or something less obvious in the network.

    You continually need different views into the forest, otherwise your view will always be blocked by the same trees.

  • Ultimately, user experience is really the only thing that matters. Yes, we can monitor all the individual nodes and interfaces and such, but the true indicator is always going to be the user experience. This becomes even more true if your apps have redundancy throughout the layers, including the network. At that point I care a lot less if one device out of a pool fails. Is the user experiencing issues? No? Then I can fix it at my leisure. The user doesn't care about a node or an interface or a device; they only care whether they can do what they need to do. The detailed information is for those of us in the background who need to keep things running.

    Visit any big service provider like a Microsoft or a Google or a Facebook and ask them if they care when a device goes down. If you spent time touring one of their container operations you'd see that there are many, many failed nodes in the system. Sometimes they don't replace them for days, because with that level of redundancy it doesn't matter. Ultimately it's about the UX. Go into a small mom-and-pop shop and yes, the individual components or nodes become much more critical. At that level the UX is a lower concern, because any failure results in an outage. AppStack is nice and all, and it helps with the troubleshooting aspect, but a tool like WPM is much more indicative of performance. Our app teams are getting more and more involved with our WPM because it gives them quantifiable performance data for a web app globally.

  • Yes, you're quite right that we will always need the base data to be able to diagnose the problems. I've found that collecting the user experience data (either through some form of synthetic transactions, or monitoring real traffic) helps with telling that there is a problem somewhere. Sometimes those investigations then lead you to learn that there's some other base-level metric you need to be collecting.

    AppStack does look good. There's a lot of work to pull everything together, and to get useful information out of it. Hopefully they've got the right base pieces in place so they can quickly iterate on it.
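The outlier-skew point raised in the synthetic-vs-real discussion above is easy to illustrate with numbers. With only a handful of real-user samples on a low-traffic site, one slow transaction drags the mean far from the typical experience, while the median stays put (the sample values below are invented).

```python
# Illustration: on a low-traffic site, a single outlier dominates the
# mean, so it stops reflecting what a typical user experienced.
from statistics import mean, median

# Four real-user page-load times in ms; one user hit a timeout.
samples_ms = [100, 110, 105, 2000]

print(f"mean:   {mean(samples_ms):.0f} ms")    # skewed by the outlier
print(f"median: {median(samples_ms):.0f} ms")  # close to the typical load time
```

Which is why a steady drumbeat of synthetic transactions is useful as a baseline: it gives you enough samples to trust the statistics even when real users are scarce.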
