We are very excited to talk to you about this feature packed Beta of Database Performance Analyzer (DPA). Since there is so much in this Beta, we will be covering the features in multiple blog posts. In this post I get the privilege of introducing.
If you are interested in being a part of the Beta program, please click the link below. The only requirement is that you must be an existing DPA/Ignite product owner under active maintenance.
WHAT IS DPA?
At this point you may be thinking, 'but what is this DPA of which you speak'. Database Performance Analyzer (formerly Confio Ignite) is a database focused product that uses 'Response Time Analysis' to quickly decompose your queries and identify why they are slow - and then helps you fix them. How does this fit in with the SolarWinds products you already own and love? See this post to find out more detail: SQL Server Two Ways - SAM AppInsight for SQL and Database Performance Analyzer
Performance I/O - Drill Down Into I/O Bottlenecks at the File and Drive level
‘It’s slow’ is often all a DBA ever hears when there are performance problems in the database. Slow queries can be attributed to many things in the environment. DBAs in the past have always used wait events to detect I/O problems, now we have another wait to detect these I/O problems!!!
Let's say it is a 'normal' day in the life as a DBA and Performance Tuning. The customers have complained that the system 'Was Slow' on April 7th sometime in their nightly load, from 1am to 2am. You pull up DPA to see if you see anything that jumps out at you drilling into this time period
However nothing jumps out at you, let's look at the wait events during this time frame. By clicking on the bar for the 1am to 2am time frame I can see that there is wait that is I/O related here (PAGEIOLATCH_SH, WRITELOG, LOGBUFFER, and PAGEIOLATCH_EX)
Happily we can now dive further in with Performance I/O. We now have another way to look at this data to let me know if this was an I/O Problem.
Note: From this screen or the main Trend screen you will see that we now have 'Storage I/O'.
At this point, you may be thinking, ‘How are the Top 15 Data Files ranked’. Well we have come up with an 'I/O Rating' that rates the data files for the instance. The I/O Rating behind the scenes of DPA does the same thing you would do when determining which data file should be looked at.
First, DPA will find the files that are in a 'Critical' or 'Warning' state, based on industry standards by default. Then for these files we perform a calculation based on the latency (either Read or Write depending on which tab you are on), the throughput and most importantly the duration the files have been in the 'Critical' or 'Warning' state. This calculation shows you the files you should review first.
If there are no files that show up here, that doesn't mean there was no I/O against your database, it just means that the I/O related activity was never in the 'Critical' or 'Warning' state
As an added troubleshooting feature, you can compare these files with what the wait is in the database outside of I/O Wait. This is important because you want to know what your aggregate I/O wait, however you also will want to know if this wait is the largest type of wait or if it is just a small percentage of wait (could we have high Memory/CPU wait, or high lock wait). In our example, over 90% of the wait occurring in this instance is I/O related wait. Because of this, we will want to drill in further to get more details.
Let's continue our troubleshooting for April 7th from 1am to 2am. Just like with the Trending data, you can drill into the 'Storage I/O' data. Click on April 7th and you will see that day's data, then as we did previously, click on the 1am bar and we will see the detailed hour graph. As we can see, there is a lot of activity in the early morning hours. Specifically I see a lot of activity on the 'C:\..ignite_ignite_depas13.mdf'' file during our 1am to 2am time frame.
Select this file from the drop down list.
Once you select this file, NOW you have hit the MOTHER LOAD of data!!! Not only are there several ways to view I/O Performance for this file, we can also look at the SQL statements being run contributing to the I/O wait (this tab is highlighted on the graph), against this particular file, the Programs this SQL is being generated from, the DB Users that ran the SQL, and the Machines this SQL was sent from.
As you click through the different links at the top of the page,DPA gives you different ways to slice the data to assist in troubleshooting. For our example, we can see a spike that is out of the norm or 'Typical Hour' for our file. This leads us to believe there may have in fact been an issue with this file.
Let's look a little further though.
I click on the 'Comparison over Files' link and I can see that in comparing my file, I see the average for the other files (both on the same drive and for the instance I am on) still have much lower latency. Yep, it sure continues to point to my file is isolated with this high latency.
What if the problem is that I just had higher throughput, more records, than normal. This could be a good thing, maybe more product is being sold!!! Let's click on the last tab to find out.
Clicking on 'Throughput Analysis' I can see clearly, I need to talk to my Storage Admins. My throughput is actually lower than at other times though my latency is high. This indicates a Hardware issue vs a Software (more product clogging up the latency). If throughput were much higher, we might surmise that the high I/O was due to more data. Looks to me the problem happened right after 1am.
I can now confidently take my findings to my customers identifying when the problem occurred specifically. I can then talk to my Storage Admins to figure out the best fix so that this peak is handled in the appropriate manner.
AND.. the BEST part is it just took a couple of clicks!!!
I/O Performance is a great feature, but WAIT, there’s more …..
Mass Registration Wizard - Quickly add all your databases into DPA
If you have a large environment, today you have to go through a tedious process to manually register each database instance. In this beta you can try out the Mass Registration Wizard which will make it much easier to load multiple instances into DPA. What a great feature to use for testing the beta!!!
For either Oracle or MS SQL Server Repositories, the steps are the same. You go to Options and select the Mass Registration button which brings you to the initial screen in a 3 step process.
The templates you will need come with the product and are located where you have DPA installed <install directory>/iwc/tomcat/ignite_config/registration.
Just take this csv file, update it with your information and select to load it into your respository
When you select to load the file, DPA goes through an initial check to make sure all the fields are filled in properly and that we can connect with the credentials provided.
Click on the 'Register Valid Databases'
After clicking on the 'Register Valid Databases' DPA will go through the list and try to connect to the desired monitored instance and load the required information into the repository.
If there are instances that didn't get registered, DPA will show in this screen a red 'Invalid' icon with detailed information on why the instance could not be registered.
If a monitored instance already exists, you will get a red 'Invalid' and a comment in the status column stating it is already registered
GUI Improvements - Its not always just about the look
Many of the screens you’ll see, and have seen in this blog, during the Beta process will just have a cleaner, crisper and much more SolarWinds-ier feel. Yes, we are rebranding the product for SolarWinds, but this is more than skin deep - we have also spent a lot of time improving usability and the speed of the product. We have also spent a lot of making GUI enhancements that improve usability and give it more modern aesthetic.
Is that all you say? Not by a longshot. There will be more blog posts coming to continue describing the new features of DPA including:
Resource Metrics Baselines
Custom Metric Alerts.
So, stay at the edge of your seat, the edge of Thwack and STAY TUNED!!!
SolarWinds solutions are rooted in our deep connection to our user base in the THWACK® online community.
More than 150,000 members are here to solve problems, share technology and best practices, and directly
contribute to our product development process.
Learn more today by joining now.