Extend Your Modern APM Strategy
Application Performance Management (APM) is quickly becoming an essential tool in any IT department. Applications are the heart of IT, and performance is the new SLA. But the traditional approach to APM is over a decade old and no longer sufficient; a new, more complete mindset is required for modern APM.
This session will explore the five elements of a modern approach to APM, including product demonstrations. The session will cover concepts from WPM to response-time analysis. After attending this session, you will have a better understanding of what an APM approach entails and the technologies that are available to support each of the five fundamental aspects of APM.
Howdy, THWACKcampers. We have an all-star performance lined up for you today to help you extend your modern APM strategy. I'm Kong Yang, your host, SolarWinds Head Geek and today, I'm joined by Jerry Schwartz, Director of Product Marketing for our Monitoring Cloud Business Unit. Welcome, Jerry.
Hey, thanks Kong.
Well Jerry, today we're talking APM, and if we were to ask 10 IT professionals out there, we may very well get 10 different responses, but the end goal remains the same. It's to keep the application healthy and running, correct?
Absolutely. I mean, there are a lot of different definitions of APM that are out there, but you're right. Overall, I think most people could agree that APM, or Application Performance Management, the goal is to keep those applications up, running smoothly. But APM is not necessarily new. There are a lot of vendors that have had products out there for years and years, but APM is an area that's changing really rapidly.
Yeah, it's changing because the application is changing. The way that you integrate and deliver that application and those services are changing.
You're right. Think about the infrastructure that applications are on nowadays. There are so many different options: think public cloud, private cloud, hybrid cloud. And then there are the architectures with which those applications are now built on those new infrastructures: think monolithic versus microservices. And of course, you also have user expectations that are a lot different. Nowadays, a slow application is as good as a down application.
Yeah, and all those challenges, in terms of choices, right? You can deliver a solution, an application, in a multitude of ways: containers, microservices, serverless, a lot of service-oriented architecture out there.
Yeah. With all these changes, even Gartner, within the last year, had to change their definition of APM to account for a lot of these new trends. Monitorama is a conference that focuses on monitoring; it took place just a couple of months ago, and application distributed tracing was the thing everyone was talking about. Our CTO, Joe Kim, wrote a really awesome article about the differing approaches to Application Performance Management, and we've made that available to you on the...
Helpful Resource link.
Helpful Resource link, that's right, and it was written in our THWACK Geek Speak blog, so I would recommend you go check it out.
So Jerry, how can we help our THWACKcampers succeed, become successful in their APM strategy?
That's a great question. So overall, what we want to do today is talk about five areas that really are an extension of the APM definition. So, we're going to talk to you about that now. And again, the first element is going to be your end-user. The end user, it's really important that you are actually able to measure the customer experience that they're having and be able to know how to improve it. There are two ways we're going to talk about that today, Kong. One is, of course, synthetic monitoring, and the second is real user monitoring.
Awesome, that's how you make revenue and money.
Absolutely. The second is code-centric, an area that we want to talk about. So that's really understanding how to tie your code in with the infrastructure, so that you know, specifically, the lines of code that you need to change and the device it ran on. The other thing that I'd mentioned before that we're going to talk a little bit about today is the benefits of capturing those application-distributed traces across that entire application topology. Again, it's a hot area that a lot of people are talking about, and we definitely want to cover that today.
Excellent, but even that code has to run on some...
Yeah, operations, of course. And so, we want to be also talking to you today about the operations-centric area of APM, and that's really understanding the detailed application performance metrics, and understanding how you can improve the application performance on every layer of the IT stack.
Of course, even with continuous innovation, continuous delivery, you still need a system of record, and what better than database performance management?
That brings us to the database, of course. So today, we're going to show you how to take a look at your wait time analytics and understand how to improve those slow-performing databases. And of course, last but not least, we can't forget about the network. The network is really an important aspect of this because let's face it: if the network is down, everything is down. And so, we want to talk to you a little bit about the networking, and take a look at some of those network health metrics so that you can tell, if there's an issue, whether it's the application or the network.
Thanks, Jerry. End-user monitoring extends your modern APM approach. Digital transformation is here. It's a buzzword that we hear all the time, and at the center of it are applications. But applications are changing. We mentioned it, and it's changing from systems of record to systems of continuous integration and delivery. A lot of change: high volume, variety, and velocity. Jerry, end-user monitoring can be broken into two pieces, right? Real user monitoring and synthetic transaction monitoring. Can you peel the onion back for our THWACKcampers?
Absolutely. So, let's jump into it and talk about real user monitoring first. Real user monitoring captures the average or the median page load time of real users as they're interacting with your website or your web app. So, it can really help address questions like, what's the average page load time for a user in India? What's the average page load time for a user on a mobile phone or a tablet? Those are the two key areas it can help with, but there are actually other broader categories where you can capture some additional data. Again, you have the geolocation, you have browser type, you have the different platforms I mentioned before: workstations, mobile phones, tablets. And then you also have what we call loading states, whether it's the network, the back end, or the front end.
And all of those affect the digital experience. So, just think about when you go to a website to make a purchase, or if you're looking for something from a website, any latency there usually takes you off of that website.
Absolutely, and that's one of the reasons why a lot of solutions have what's called an Apdex score. The Apdex is an industry standard for customer satisfaction. It stands for Application Performance Index. And so the awesome thing about having an Apdex score in your RUM, RUM being Real User Monitoring, solution is that you can instantly see the percent of users that are satisfied with their experience, tolerating their experience, or downright frustrated with that experience. And so RUM, and specifically the Apdex score, will give you that information.
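The Apdex score Jerry describes is an open industry formula: with a target threshold T, responses at or under T count as satisfied, those between T and 4T as tolerating, and anything slower as frustrated. Here is a minimal sketch in Python; the 500 ms default threshold and the function name are illustrative, not taken from any SolarWinds product.

```python
def apdex(load_times_ms, t_ms=500):
    """Apdex = (satisfied + tolerating / 2) / total samples.

    Satisfied:  load time <= T
    Tolerating: T < load time <= 4T
    Frustrated: load time > 4T (contributes zero)
    """
    satisfied = sum(1 for t in load_times_ms if t <= t_ms)
    tolerating = sum(1 for t in load_times_ms if t_ms < t <= 4 * t_ms)
    return (satisfied + tolerating / 2) / len(load_times_ms)

# Two satisfied, one tolerating, one frustrated sample:
print(apdex([100, 400, 900, 3000]))  # 0.625
```

A score near 1.0 means nearly all users are satisfied; the "66% satisfied" figure shown later in the Pingdom demo is the kind of input that feeds this calculation.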
Awesome. And another thing that can influence that piece is synthetic transactions monitoring. It's a tool that one can leverage.
Yeah, that's a great point, Kong. Synthetic transaction monitoring is really where the website monitoring is done by emulating the actual users through either web browser emulation or scripts, and so that helps you address questions like, is my website up? Or are my key transactions working?
Exactly, and you can also draw trends and find baselines of what the experience should be.
Exactly, and that script that I mentioned before, that emulation will recreate the typical paths that people go on your website or your web app. Let's say it's as simple as going to a homepage that might be one of your synthetic uptime checks. If it's more complex, you can also do things like actually searching for a product, finding it, putting it in the shopping cart, and then checking out. That would be one of your synthetic transaction monitoring checks. So these things, the synthetic transaction, super important. Why? Because you want to be able to find those problems before your actual end-users or customers do.
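As a rough illustration of what the simplest form of this does under the hood, here is a hedged sketch of a single uptime check using only the Python standard library. The function name and return shape are invented for this example; real tools like WPM and Pingdom go much further, with full browser emulation, multi-step scripts, and probes in many locations.

```python
import time
import urllib.request

def uptime_check(url, timeout=10):
    """Minimal synthetic uptime check: fetch the page once and record
    whether it answered with HTTP 200 and how long the attempt took."""
    start = time.monotonic()
    status = None
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except Exception:
        pass  # connection refused, DNS failure, or timeout: treat as down
    return {"up": status == 200, "latency_s": time.monotonic() - start}
```

A scheduler would run a check like this every minute and alert when probes disagree with the expected status, which is exactly the "is my website up?" question Jerry mentions.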
Thanks, Jerry. Let's see it in practice. This is the Orion dashboard, and we're going to go into our Web Transactions Summary. This is Web Performance Monitor, and what you see here is its summary page. What I'll do is pick one of these applications. So, let's look at one transaction. And this is a synthetic transaction as defined by you previously, right? So, it's a series of recorded steps. We're looking at Outlook, Office 365 from the EAST data center. This is a summary of what the transaction is. On the right: up, down, critical status, as you mentioned before. You can zoom out, zoom in, get granularity, but that's what transaction availability is. You have the min/max duration. So, we talked about baselines and trends, being able to see what the average response time of this particular synthetic transaction should be. On the Transaction Details, you can see the number of steps. There are three steps: signing in, looking in your inbox, and signing out, along with the durations and up/down status for each. You can also see screenshots of what it should look like as you step through each of the respective steps within the synthetic transaction.
That's a great feature.
There's also a notion of Step Duration. So, you can look at the different latencies of each of the steps overlaid. Basically, you can correlate across those steps. There are no dependencies here, but if there were any application dependencies, they would show up over here as well. There is also our mini AppStack view, which shows it across the layers, right? So, across Server & Application Monitor, Virtualization Manager, Storage Resource Monitor, and Web Performance Monitor. Now, if you're looking for more details on how to actually record a synthetic transaction, SolarWinds Lab episode 18, at minute marker 25, has the original Head Geek, Patrick Hubbard, and Lawrence Garvin walking you through it. And they walk you through it wonderfully in eight minutes, so there is that. That's the reason why I didn't go through it here.
Yeah, and I've seen that done before. It's a really easy way to record those transactions in WPM, so I'll be excited—I'm going to check it out myself.
Awesome, Jerry. Now let's see real user monitoring.
Sure, I'm going to show you guys our Pingdom application. Pingdom is a SaaS-based website monitoring tool that has over 700,000 users. Our users love it because it's super reliable, easy to use, and very affordable. Let's check out our monitoring here. As you can see, we have uptime, page speed, transaction, and real user monitoring. I'm going to show you how easy it is to set up a new uptime check. So if I wanted to add a new uptime check, I could just name it. I'm going to call it SolarWinds, and I'm going to check the SolarWinds website once every minute, and I could just go HTTP. Now, it would probably help if I did the right URL: solarwinds.com.
Yeah, and Pingdom can check any public website, correct?
Wonderful. So, with our Web Performance Monitor and Pingdom, one can get both: you can inject synthetic transactions and schedule them so that you can see the latency trends for that transaction, the experience there. And with Pingdom, you can also get the real user experience, what they're actually seeing.
Exactly. So I'm going to also show you a little bit about our reporting. This is a real user monitoring report. It's showing, at a high level, that the median load time is three seconds. It also shows exactly the user experience and how fast the page is loading. I can also check it out by country, which is kind of cool. So in the United States, pages are loading in 2.7 seconds on average, where in Saudi Arabia, they're loading in about four seconds. I can see the distribution, the loading states I mentioned before, and here's that really cool Apdex score. So I can see that with this website, about 66% of our users are having a satisfying experience, that is, the page load is between zero and four seconds, and so we might want to take a look at that. And the last thing I did want to show is, if I swing back to uptime, just for fun, I'm going to go and check out the website Pinterest. So here's the uptime for Pinterest, and if you look over the last 30 days, it appears that there was a single downtime incident. Pingdom automatically notified the user that there was downtime, and it ran a root-cause analysis, so now I can see detailed information. Here is the first probe monitor that saw that it was down. In order to eliminate false positives, we run a second check, but I can also see the root-cause analysis here, and I can actually get more detail on exactly what went wrong and what are some of the things I can do to detect that.
Excellent, so with Pingdom you have a troubleshooting feature that you can leverage to surface the single point of truth very quickly. Jerry, thank you for walking through synthetic transaction monitoring as well as real user monitoring. We will have, in the resource section, a one-pager showing the different values of Pingdom and Web Performance Monitor. Thanks, Jerry.
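The per-country figures in the RUM report above boil down to grouping load-time samples by a dimension and taking a median per group. A small sketch, where the sample data and function name are hypothetical:

```python
from collections import defaultdict
from statistics import median

def median_load_by_country(samples):
    """Group (country, page_load_seconds) RUM samples and
    report the median page load time per country."""
    by_country = defaultdict(list)
    for country, seconds in samples:
        by_country[country].append(seconds)
    return {country: median(times) for country, times in by_country.items()}

samples = [("US", 2.5), ("US", 2.7), ("US", 2.9), ("SA", 4.0)]
print(median_load_by_country(samples))  # {'US': 2.7, 'SA': 4.0}
```

The same grouping works for any RUM dimension the session mentions: browser type, platform, or loading state instead of country.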
Code-centric APM is an important aspect to extend your modern APM strategy. I'm joined by Dan Kuebrich, Director of Engineering Architecture for SolarWinds TraceView. Welcome, Dan.
Happy to be here, Kong.
So Dan, what is code-centric APM?
So, when we think about code-centric, we're kind of talking about what is traditional APM, or what we traditionally call APM. So, it's APM that's looking at, specifically, the application tiers, the code that's running inside them, how that's performing, what the application tiers are talking to, how that impacts end-user experience. So it's really kind of at the center of your monitoring portfolio within the application.
And our THWACKcampers have a question. In general, how does that relate to tracing?
So, I think we should start by defining tracing. Tracing is very related to APM, but it's kind of a methodology that might underlie it. When we talk about transaction tracing, we're talking about following requests, or jobs, or transactions through an application from the moment they enter the system until they are fully processed. And so, what we're doing is basically looking at where they're spending time in all the different tiers of the application, different services. If you're talking about a microservice architecture, what database they're talking to, and where the bottlenecks are.
And you're speaking about the code, correct?
Exactly, the code and its interactions with all the components, so it's kind of tying together the pieces.
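To make Dan's definition concrete, here is a toy trace recorder, assuming a single process for simplicity. Real distributed tracers such as TraceView propagate context across service boundaries; the class and field names here are invented for illustration.

```python
import contextlib
import time
import uuid

class Tracer:
    """Toy transaction tracer: records one span per unit of work,
    with its parent span, so a request can be followed tier to tier."""

    def __init__(self):
        self.spans = []   # finished spans, innermost first
        self._stack = []  # ids of currently open spans

    @contextlib.contextmanager
    def span(self, service, operation):
        span_id = uuid.uuid4().hex[:8]
        parent = self._stack[-1] if self._stack else None
        self._stack.append(span_id)
        start = time.monotonic()
        try:
            yield span_id
        finally:
            self._stack.pop()
            self.spans.append({
                "id": span_id,
                "parent": parent,          # None for the root span
                "service": service,
                "operation": operation,
                "duration_ms": (time.monotonic() - start) * 1000,
            })

# Following one request: front end -> auth service
tracer = Tracer()
with tracer.span("frontend", "GET /login"):
    with tracer.span("auth", "verify_token"):
        pass  # real work would happen here
```

Each span records where time was spent; comparing child durations against the root span is how a tracer reveals the bottleneck tier, which is exactly the analysis shown in the TraceView demo that follows.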
So why does this matter?
So, if you kind of refer back to something that you and Jerry were talking about at the top of this session, it's that applications are changing faster and faster. We're deploying new releases more often to ship new features. The infrastructure is changing, especially if we're in a cloud environment. It might be changing while the application is running, and user behavior is changing with regard to the application all the time. And so we need monitoring that's looking at what the application is doing, and how that's performing for them.
You mentioned that word, change, and there's an abundance of change, volume, variety, velocity. Some classify that as big data in there. How does this impact APM, specifically, code-centric APM? What are the challenges there?
Exactly. So, I think when we talk about these changes, some of them are moving to the cloud, some of them are introducing new languages into all these different tiers of the application, and so thinking about rolling out APM can be rather daunting. And I think, in the past, it has been, and so it's kind of earned that reputation for requiring a lot of configuration, being very complex, and even tied to the application's code. But the reality is that's not necessarily the case anymore. In fact, modern APM solutions are developed with an eye for ease of use and rapid deployment, and if they are SaaS-based, they tend to be very cost-effective and scalable as well, so you don't have to worry about managing extra instances in your data center just to get more visibility. So I think it's even to the point where you can just sign up for trials online and get up and running in minutes with a lot of the modern APM products, and that's what our customers are doing more and more.
Awesome, let's see it in practice, the availability, agility, and scalability of tracing.
Dan, what do we have in front of us?
Well, what we're looking at right now is not an APM product, but it's something we might want to monitor with an APM product, and I do because this is a hotel-booking site that hypothetically, I'm in charge of running. And so, got this nice portfolio of hotels with very affordable rooms but if things are slow, if things are down, that means we're losing money, right? And so, this application looks very simple from an end-user perspective. It turns out to be very complex. It's been built over time. Lots of different integrated systems. I'm sure none of us has run into applications like that in the past.
We call that legacy. We also call those Frankenstein apps.
Well, this is verging on one of those. I'm sure we've seen worse, but let's take a look at what it might look like in TraceView, an APM solution that's part of the SolarWinds Monitoring Cloud. And so, the first thing we're seeing here is a topology map of our application's key services. To recap what we might have done to get here: it's a simple software install. I'm installing an agent on each of the application tiers that powers this application. We were just looking at the front-end web tier, for instance, and that's what I've selected in this diagram. It talks to a booking service and an auth service. Each one of those might have more dependencies, talk to a different piece of infrastructure, and so on.
Excellent, so after discovery, you get a topology map of all the interdependencies within your app.
Exactly, and so we can start to see things like, when users are logging in, it's making requests to the auth service. There are a lot of requests to the booking service. That's good. Comparably fewer ones from the booking service to the transaction service, but that's how we make our money, right? We're selling these hotel rooms, and so we have this kind of integrated mesh of services, and for each one, we want to know its KPIs. Is it healthy? How long is it taking to fulfill transactions? Are we under a severe load? Are there increased error rates? How's it running in general? If we see something that's not performing well, we might highlight that in red here. So for instance, our analysis worker is not doing so well, throwing a lot of errors. That might be something we want to drill down on. So, a good APM solution is going to start from the high level and say, how does my system look? How is it performing? But then also let you drill down and get a lot of detail, all the way to those traces that we were talking about.
Very nice, so granularity of levels to be able to start high and then drill down into it.
And so, let's do that. Let's take a look at that front end we were just looking at, but from the server's point of view. So we were just looking at it from the consumer's point of view. Now, we're drilling in. I'm looking at the performance of this application over the past day. And so, the first thing we're seeing: we've got that average latency to fulfill requests. We're under 50 milliseconds, which is pretty good, but where's that time being spent? And so this chart is going to break that latency down into the different components of our system. Our front end here is a Ruby on Rails app that was put in front of a bunch of other services, and it turns out Ruby on Rails is actually fairly efficient. Calls to our auth service, fairly efficient, but where we're really spending a lot of time is all those calls to the booking service. So already, I know if the front end of the web app is slow right now, I should actually be looking at the booking service, and I can start talking to that team about what's going on. We're also breaking the application's performance down in a number of different ways. Different pages, if there are particular hot spots. Some pages are taking a lot longer than our average. We should know about that. If they're throwing more errors, we should know about that. And so on, including a breakdown by different hosts. Which hosts are running the application tier right now? Are some performing worse? Is an infrastructure issue actually impacting customers? Status codes, and so on, from the application. So, there are a couple of easy wins we can get, again, with no configuration needed to start diving into this performance data.
Awesome, so you can very quickly see the different layers, all the interdependencies, which one is the actual latency, so that you can get to the root cause much faster.
Exactly, and what we can actually do beyond this is start to drill down and get to that root cause that you just mentioned. So, for instance, I'm wondering why requests to this particular page are taking about a second, and I'll just click on that to start to zoom in on it. Now, we're looking at the performance of just requests to that endpoint. We can see it's really dominated by calls to the booking service. And if I want to know exactly what was going on, maybe I would say, hey developers, let's start looking at the logs, something like that, or I can actually go take a look at those traces we were talking about earlier. And so, a trace, again, is the path of a single request through all the tiers of this application. So here are ones that match that particular hotel-room purchase page. And so, I'm just going to grab one of these from about an hour ago, and we'll see what was going on. And we've drilled all the way down to the needle in our haystack here. This was a single request from this particular client IP. It happened just earlier this morning and ended up hitting four different tiers of the application. So it started in our front end, went down to the booking service, the auth service, and ultimately the transaction service. So the complexity is kind of revealing itself, and we let you dig in with a lot of detail. This is a great place where you can start to hand off to folks that are actually working on it and say, hey, it looks like the application is running this particular query 600 times. That seems kind of inefficient. Is that redundant? Also, there's a single call responsible for about 800 milliseconds, and that's the kind of thing that you want to be able to drill down to quickly.
Perfect, Dan, thank you for walking us through a real-world tracing example, especially as you use code-centric monitoring to extend your modern APM strategy. Operations-centric APM is another method you can use to extend your modern APM approach. I have Chris Paap, Product Manager of Virtualization Manager.
Thanks for having me.
And I have Steven Hunt, Product Manager for Server and Application Monitor.
Glad to be here.
Welcome, gentlemen. So the big question around application performance management is what is ops-centric APM?
So, it's a really, really good question. We hear it quite a bit. When everyone is talking about application performance management, there's some confusion, right? We've outlined some aspects of it with Jerry and Dan earlier. But what a lot of people don't realize is there are things you've probably been doing for quite some time around collecting metrics of the applications, the operating system, and the infrastructure stack that's underneath all of that, and needing to be able to visualize those performance issues as they come, and being able to understand where those problems might exist from just a general performance standpoint.
I would add to that how to correlate all that information so it's usable and provides context. And then, I think one of the things that Steven was talking about, that confusion: it's not just one thing. It's a combination of tools that makes up that ops-centric monitoring.
Yeah, I think when you think of operations, there are so many services that form the basis of operations. You have so many tech constructs, from servers to storage to virtualization, so many layers and operating systems in there that impact the application performance. And the end goal of any good APM strategy is, as our CTO Joe Kim says, to keep the application healthy and running.
One of the things that people do get confused about: we were discussing code-centric earlier, and there are a lot of homegrown apps out there. But there are also a lot of off-the-shelf apps where it's very, very important to be able to understand the performance and the issues that are associated with them. And so, you need a blend of solutions that can deliver that understanding for both those homegrown or custom-written applications and those things that you're buying from third-party vendors out there in the market.
Great point, because at the end of the day, performance matters because it drives revenue, and that's why we have jobs. With that, gents, let's see it in practice.
Right, so the AppStack Environment will show you from the application endpoint all the way back to the backend disk, correct? The best thing about that is nothing happens in a silo, so we're able to quickly see, at a glance, what's going on in our environment, and identify, with the normal, down, critical, and warning statuses, whether it's in a good state or not. And then, drill in from there on that node.
Yeah, it's all the interdependencies from the application layer through the server layer, back to backend storage. You got your web transactions in there, the virtualization layer.
Correct, and you know, the way these usually come in, from an IT standpoint, is "application is slow," "application is down," not much more data than that. So when you're doing that, the IT admin is in a bit of a discovery mode at that point. Their eye is going to get drawn to what is actually going down right there. Whatever application they were informed is having the performance issue, they go and scroll across the AppStack Environment, and then click on that to determine what's going on.
And from there, you could filter down, right? You can pinpoint what application is called out from that generic ticket, app is slow, and then, see all the interdependencies.
Right, and it may not even be that the app is slow itself, or the app is having a problem. The app is slow as a result of something else, so you want to see that correlation across those different pillars of IT responsibility. So once you click on, say, an application that you know is having a problem, but the application looks to be doing fine, or if it is having a problem, it's going to show you the relationship status of this application sitting on this server right here, right? So it's a one-to-one or one-to-two ratio, depending on what that is. It'll show you the virtualization layer; if it has a database associated with it and you do your monitoring with DPA, it'll show that; and if you have SRM installed, it'll show you the storage it's sitting on, so you can actually pinpoint and go down that stack to find out what that issue is.
Yes, and customers are using that to keep their applications healthy and running, right? Healthy and running, the key tenets of a modern APM strategy. What about the situation where you may not have as much info or expertise or knowledge around what's happening? Or the converse: what if you have a lot of expertise there, but you don't know what subsystem to look at?
So the next step there is go to advanced troubleshooting. You know you have a problem or let's reverse it. Let's step back from it a bit. You know you have a period of time that you had an issue in, and that's where you'd step out of the AppStack Environment and dig into your PerfStack Environment.
So what we're going to see here is a perfect example of correlating time-series events across different pillars of the platform, so you can correlate, from that time series, when you were having the problem and where the issue was. So you do a little bit of troubleshooting and discovery here, a little bit of sleuthing, so you know what has been going on.
Okay, so I can take different tech constructs and pull in different performance metrics across those that are time series data.
Correct, and specific data, so we're only going to show you data that's associated with that node. In this case, we have a server, lab-dem-orion.demo, and it's only going to show you metrics such as CPU, memory, status, alerts, and events that are related to that, that we can actually graph.
Awesome. Chris, so you've shown us AppStack. You've shown us PerfStack as tools that our customers are using to extend their modern APM strategy via operations-centric methods.
Correct, it's just one more tool to have in the toolbox to get to root cause faster and reduce mean time to resolution.
Thank you, Chris.
Database performance is vital to a modern APM strategy. Today I have Rob Mandeville, Senior PMM of Database Performance Analyzer, coming in to talk about why database performance is so important to APM. Welcome, Rob.
Thanks, thanks for having me. Yeah, so most modern applications, they definitely rely on database back ends, and databases can play a significant part in performance issues. They can actually contribute quite a big hit to application slowness.
Yeah, in this era of continuous integration and continuous delivery of applications, applications can be constructed in so many ways. The database is still required to record those transactions to keep a system of record.
Yeah, that's right, that's right. And I mean, in today's society, slow is the new down. We've become a society that expects very performant, very fast service.
Yeah, we are a society of instant gratification, and it's in our social channels, it's in the applications that we utilize, it's in the responses of websites, the digital transformation that everybody talks about.
Exactly, and without that picture of the database performance, you are missing a big part of the picture.
Now Rob, you have a wonderful analogy that compares and contrasts database health versus performance.
Yeah, so when I first started in this position, I really asked myself, what is the difference between health and performance? I think there's still this conception out there that the two can be equal, or you know, synonymous. But when I really started thinking about it, they are not. The big difference is health means all systems are go. There's an absence of illness, right? Performance, on the other hand, is the ability to actually achieve. So let's take it to a personal level. Okay, so you go to your doctor, and you ask them, am I healthy? And your doctor's probably going to, what, do some tests, get some data...
Draw some blood.
Draw some blood, look at cholesterol levels, exactly. Resting heart rate, things like that. BMI, fat test, yeah. But given all that data, can your doctor now tell you, can you run an eight-minute mile? No, right, the answer is no, and that's because we were talking about health, and now we're talking about performance. So really, for the doctor to answer that question, he'd have to go out to a track with you with a stopwatch and actually see how you perform.
Exactly, so health is up/down, are you okay in there? Performance, now, are you delivering towards the service level agreement, the quality of service that you need to in order to be successful?
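That up/down versus achievement distinction can be sketched in a few lines of Python. This is purely an illustration with hypothetical names and SLA values, not how any SolarWinds product is implemented: a health check asks "can I reach it at all?", while a performance check times the actual work against a target.

```python
import socket
import time

SLA_SECONDS = 0.5  # hypothetical service-level target for one operation

def is_healthy(host, port, timeout=2.0):
    """Health: is the service reachable at all? Pure up/down."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def is_performing(run_operation, sla=SLA_SECONDS):
    """Performance: does the work complete within the SLA?

    Returns (met_sla, elapsed_seconds) for the timed operation.
    """
    start = time.perf_counter()
    run_operation()
    elapsed = time.perf_counter() - start
    return elapsed <= sla, elapsed
```

A service can pass the first check and still fail the second, which is exactly the doctor-with-a-stopwatch point: healthy does not mean fast.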
And that kind of brings in a good example of where the products that we offer can differentiate a little bit. So, SAM is more of an infrastructure health monitor, making sure all systems are go, making sure everything's ready, capacity is good. Database Performance Analyzer says, am I achieving? Am I actually performing?
And with respect to server and application monitoring, you're referring to AppInsight for SQL Server, correct?
Rob, what do we have in front of us as we look at database performance in the context of extending your modern APM strategy?
Yeah, fantastic, so let's take a look at Database Performance Analyzer. So what you'll notice, first off here, is a historical trend of my database environment, right? This paints a picture of my performance story. Really, what it shows is, what's norm within my environment. Now it's not saying it's good or bad, it's just saying, that's what I've come to expect.
Yeah, and having a baseline is key, because every one of your environments is unique.
Absolutely, and it's a great way to tell, am I performing better today, or worse today, than I have historically?
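The "better or worse than my historical norm" question can be illustrated with a simple z-score against the baseline. This is a hypothetical sketch with made-up data, not Database Performance Analyzer's actual algorithm:

```python
from statistics import mean, stdev

def against_baseline(history, today, threshold=2.0):
    """Compare today's metric to its historical baseline.

    history: per-day totals of, say, database wait time (hypothetical data).
    Returns 'better', 'normal', or 'worse' based on a simple z-score:
    the baseline isn't good or bad, it's just what this environment
    has come to expect.
    """
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return "normal" if today == mu else ("worse" if today > mu else "better")
    z = (today - mu) / sigma
    if z > threshold:
        return "worse"    # far above the norm: more waiting than usual
    if z < -threshold:
        return "better"   # far below the norm
    return "normal"

# e.g. four weeks of daily wait-time totals (seconds), then today's value
history = [120, 130, 125, 128, 122, 131, 127] * 4
print(against_baseline(history, today=126))  # → normal
print(against_baseline(history, today=300))  # → worse
```

The same value of 300 seconds might be perfectly normal in a busier environment, which is why the comparison is always against that environment's own history.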
So in addition to that historical view, I can see that there are multiple dimensions that we monitor in. We see the SQL statements themselves, and we can look at the waits. The waits are a nice one to look at: what are the biggest bottlenecks in my environment? In other words, as queries come into the database engine, we track exactly which activities they spend time on to satisfy that request from the application, and that is called response-time analysis.
Exactly, and to the point you made earlier about the difference between health and performance, this is exactly performance.
This is exactly it, right. So understanding the activities, and how much time is spent in those activities at the database engine, lets you find out where the bottlenecks are, so that's how you can tune your applications to run faster and get that response back to the requester.
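The wait-time idea described above can be sketched as a small aggregation. The wait event names below are real SQL Server wait types, but the sample data and the aggregation itself are a hypothetical illustration of the technique, not DPA's implementation:

```python
from collections import defaultdict

# Hypothetical sampled activity: (sql_id, wait_event, seconds_waited),
# the kind of data a wait-time poller might collect each interval.
samples = [
    ("q1", "PAGEIOLATCH_SH", 4.0),   # waiting on data-file reads
    ("q1", "LCK_M_S",        9.5),   # waiting on a shared lock
    ("q2", "CXPACKET",       2.0),   # parallelism coordination
    ("q2", "LCK_M_S",        1.5),
]

def top_bottlenecks(samples):
    """Sum wait time by (query, wait event) and rank the biggest first.

    The top entries are where tuning effort pays off most, because
    that's where the response time is actually being spent.
    """
    totals = defaultdict(float)
    for sql_id, wait, seconds in samples:
        totals[(sql_id, wait)] += seconds
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

for (sql_id, wait), seconds in top_bottlenecks(samples):
    print(f"{sql_id:4} {wait:16} {seconds:5.1f}s")
```

In this toy data, query q1 blocked on a lock dominates, so lock contention, not I/O, would be the first thing to investigate.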
Yep, so as we dive into a specific day, there are very easy drill-in capabilities. You can see the profile may change depending on the day. If I drill in far enough, it gives me some great information about, again, those activities, where I'm spending all my time. If I want to know the queries that are most impactful to that activity, I just drill in a little bit further, and I get the SQL statements that are involved.
Excellent, and customers are using this to optimize their SQL queries.
That's right, and therefore, deliver faster response time from an application perspective. So this really helps identify the differentiation, also. Like in Server and Application Monitor, you have AppInsight, which can give you the overall health of the systems involved and some of the information about the database instance itself and its health. But to get the performance analytics that we're looking at here, that's where Database Performance Analyzer comes in.
Well, thank you, Rob, for sharing so much information with us on how database performance is key to extending your modern APM approach.
Thanks for having me.
Every modern APM strategy needs a networking perspective. And by that, I mean, the first rule of IT fight club is blame the network. Am I right, Chris?
No, you're wrong.
I'm joined today by Chris O'Brien, Product Manager of Network Performance Monitor, and we're going to talk about the challenges, and why network performance monitoring is needed in every APM strategy. Thank you for joining us, Chris.
Yeah, happy to be here. I think you can't deliver applications well, in most cases, without doing good networking, so that's definitely a piece of it.
Yeah, it connects everything, right? As seen in our 2017 IT Trends Report, the IT world is becoming more hybrid, hybrid IT. So, you have services that internal IT is delivering, and they may be colo, they may be on-premises, but you also have those cloud service providers, and you lose a lot of visibility in that space.
Yeah, absolutely. The cloud service providers as a destination network and then, all the stuff in between like, your ISP, backbone providers, all of that stuff.
Yep, so network monitoring becomes an important part in ensuring that your application is healthy and performing optimally. So Chris, why don't you show us in practice how this is done.
Yeah, sure, so there are a couple of tools in the Orion tool belt that help us with this. One of our most recent ones, which really focuses on the application perspective of your network, is called NetPath. Some of you have probably seen it. Another one to look into would be IP SLA in the VNQM tool, but let's take a look at NetPath. So we've jumped over through Home and NetPath, and in here, I've got all of the applications that I'm providing transit to, right? As a network guy, I'm providing transit to my applications. My applications are sending transactions over the network, and the speed and quality of that connectivity tends, depending on the application, to have a big impact on how the user perceives the application, right? Is this performing well, or slow, or what have you? So you can see already, we've gotten to the NetPath screen. We're listing applications, not network components, so this is definitely an application-first approach. If we drill into one of these, Salesforce for instance, we can see that application's performance in terms of network delivery from a specific location. Here, we're looking from the Austin Lab probe, but one of the key things about application performance as delivered by the network is that it really depends on where you're coming from and where you're going to, right? And so, as you have different branch offices, you would expect the Austin office to have a different experience than your office in India, right, slightly different.
So you want to be able to have that perspective as a network administrator to make sure you're providing good service to all of your applications.
And in this particular case, it's a SaaS-y application, Salesforce.
Sure, it's sassy. I think that's what you're looking for me to say. Okay, great, so yeah. So this is Salesforce from our Austin Lab, right? You can see on the extreme left, we have our node in the Austin Lab, and then that runs through a bunch of different network components, and you can see they're each introducing some amount of delay, some amount of performance loss. That's just how it works. We at least have to pay physics to move our traffic across the wire. So you can see, it goes through our network for a while here. Then it transits over to our service provider's network, to a backbone provider, who we don't pay. We don't have a relationship with them, but now we're dependent on them, right? And finally over to Salesforce's own network.
Yep, and this is the epitome of a hybrid IT scenario, because you've got all your hops within your firewall, with all the latencies that are incurred there, and then you get perspective, visibility into, like you mentioned, the internet service provider, the backbone, and then your cloud service provider.
And it's not always the case, but it is usually the case that most of your delay comes in those long connections, which tend to be the WAN connections, the internet side. So it's really important to have visibility into that piece. Now, another thing to note here: we saw this was to Google, so what does that mean, right? How is it taking the application into consideration there? What NetPath is doing is, from this source on the left-hand side, we are imitating the application. So we'll use the same port number, we'll do a TCP three-way handshake, and we'll act like the application, with the goal of getting the performance metrics and information that tell us how the application is performing across that network, not just your network monitoring traffic.
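The core of that "imitate the application" idea, timing a real TCP three-way handshake on the application's own port instead of sending a plain ping, can be sketched in a few lines of Python. This is a hypothetical illustration of the technique, not NetPath's implementation:

```python
import socket
import time

def probe_tcp(host, port, timeout=3.0):
    """Time the TCP three-way handshake to host:port.

    Connecting on the application's own port means firewalls,
    middleboxes, and policy routing treat the probe like real
    application traffic, which an ICMP ping would not exercise.

    Returns the handshake latency in seconds, or None if the
    connection could not be established.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return None  # unreachable: a connectivity problem toward the app
```

For example, `probe_tcp("login.salesforce.com", 443)` would measure handshake latency along the same path the real HTTPS traffic takes, and comparing results from different probe locations shows how the Austin office and the India office can have very different experiences of the same application.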
Awesome, so NetPath is a way to quickly surface a single point of truth to answer that question, "Do we blame the network first?"
Yes, yes, or as I would like to think of it: Where's the problem? How do you isolate quickly? Because I only want to work on it if it's a network problem and I definitely want to work on it fast if it is a network problem.
Thank you, Chris, and thank you for joining us as we walked and talked you through extending your modern APM approach. For SolarWinds, I'm Kong Yang.
And I'm Chris O'Brien. Thanks for watching.