This discussion has been locked. The information referenced herein may be inaccurate due to age, software updates, or external references.
You can no longer post new replies to this discussion. If you have a similar question you can start a new discussion in this forum.

Is there any way for DPA to show the unlisted wait times (as an "Other" section, perhaps) ?

I use the wait time chart as a general guide to ongoing performance but I have noticed that the early production hours have a habit of changing during the day, long after the hour is finished. I know that this would happen if I change the excluded query list (although I haven't done so) and that it might occasionally happen because of a long running query, but it appears to be mainly due to a design problem.

My suspicion is that this is because DPA is only displaying the top X (or Y%) of SQL Statements and so from 8am to 9am the statements are typically much shorter than the ones during the overnight processing. It is only later in the day, maybe 11am or noon, that those statements take enough proportion of the total activity for them to be displayed.

The solution would be to display either an "other" section or to display an indicator showing the "Total" wait time for each period.

The result of this is that, at 9:05am it looks like the total wait time during the 8-9am hour is maybe 100 minutes while, at 10:30am, it is 400 minutes... Meanwhile, the wait times from 2am to 4am have actually declined;

pastedImage_2.png

A similar effect can happen from one day to the next, although it is usually less obvious. For example, for the 17th October I get these two different charts - one from the next day and one from a week later - the main difference is that the latter one includes SQL Hash 4592265194 between 5255831054 and 3411566237 (note that the scales are different, but the cumulative wait time shown on the latter chart is still longer than the first chart even when the scales are adjusted);

pastedImage_0.png

  • FWIW, here is the latest chart for the 8-9am hour (it is now 3;15pm) - as you can see, it has gone from showing about 100 minutes to 400 minutes to over 700 minutes...

    pastedImage_0.png

  • siggy,

    This looks to me like perhaps DPA's summary tables are not getting the sumarization done in a timely way. The charts you show are all pulled from the tables for sum data. Those tasks run every hour so if the hour for 8 am is getting data in this chart at 3:15 we must have data in detail for that hour that we didn't show in the chart before because the data fell behind.

    DPA has a one main table in detail for wait time on each instance registered consw_id where id is the id of the monitored instance in the table cond. There is a sum table con_sql_sum_id also. The chart you show is made from con_sql_sum_id. We need to check if we are way behind on moving data from the detail table to the sum table.

    Might be best to send in a support case and send along you logs with it. If you do reference this thread and we can make sure to post the solution.

  • Thanks for the reply...

    It looks to me like there are a limited number of queries that are shown on the chart (the list of hashes at the side of the chart varies but only 21 are shown in all of the charts) and so it only displays the activity for the items that are listed.

    So, if it is only displaying the 21 sql statements with the longest wait times, at 9am, many of the activities from 8am to 9am will not be in that top 21 and, instead, it will show lots of activity from earlier than 8am. However, by 3pm, more of the normal production type activity will have accumulated enough wait times to make it into the top 21 for the day and suddenly, they appear, even during the 8-9am window.

    In contrast, during the 3am to 4am period, the number of items shown went from 11 at 9:05am (for a total of 3100 seconds) to 10 at 10:30am (for a total of 2900 seconds) and, at 3:15pm, only 5 items with a total wait time of just 1350 seconds.

    Even if the wait time can increase because it isn't accumulating properly, if surely should never go down.

  • The number of items to show in the charts is 15 by default. This is an item that can be changed in the options menu. Options -> Administration tab -> advanced options -> system options tab. Look for NUMBER_OF_ITEMS_IN_TIMESERIES_CHARTS . You can increase the number to a max of 100 (charts will take more time to render).

    That being said open a case if you so we can look at logs and see what is going on here please.

  • This is really interesting, and something I hadn't before considered. Assume I have the option set to the default of displaying the top 15 and I am looking at the chart for a day. Also assume that from 4am to 5am none of those top 15 were executing, but 16 - 25 (for the day) were executing. Based on what siggy​ is reporting, I would expect to see nothing for that hour, even though I could have significant waits. If we extrapolate that to the opening "last 30 days" view, the chart is only showing stacked columns for each day of the waits for the top 15, hiding waits from everything else during each day.

    If that's true, I would agree that we need an "Other" or "Misc." to display the sum of the rest of the waits for each displayed period. That would let me know that I should drill down to see more detail. Something might be in the top 15 for a day and would show up in a day view that will not currently display at all in the "last 30 days" view. Something might be in the top 15 for an hour and would show up in a hour view that will not currently display at all in the day view. DPA could well be failing to show me something that I would consider to be a significant event or issue.

  • Mark I see what you mean for sure. I am going to ask for a few others to look over the thread here and chime in on it.

  • This was by design to show the top 15 statements for whatever time frame you have set in the product.

    There were a couple reasons for this (business of the chart, but also scaling).

    If we were to show the top 15 + "other" in many systems with something like an ad hoc workload, you'd lose the relative impact of the top 15 (they'd be specs in a sea of gray).

    With limited real estate in the UI, this is a tough one. Couple of thoughts that you can do:

    - Change the number of items in a series (this will allow more than the default 15 SQL statements). This will cause the UI to look more busy...

    - Drill into tighter and tighter time frames to see top 15 specific for that timeframe

    - Click on a day, set the time interval to 1 day, click on the 1 bar, click on the SQL you want to focus on, click back to timeslice and you will see a "cleaned" up chart that only shows that SQL activity throughout the day

    This is a tough one, but the product is working as designed. Might be nice to be able to toggle on "other" to see relative wait times, but then be able to toggle it off to focus on the primary performance hitters. Feature request?

  • >Drill into tighter and tighter time frames to see top 15 specific for that timeframe

    The issue is that without the "other" showing up, I don't know where I need to drill into.

    >If we were to show the top 15 + "other" in many systems with something like an ad hoc workload, you'd lose the relative impact of the top 15 (they'd be specs in a sea of gray).

    >Might be nice to be able to toggle on "other" to see relative wait times, but then be able to toggle it off to focus on the primary performance hitters.

    I can understand the problem, and an easily accessible toggle (like how "View excluded SQL" is handled) would probably be a good way to deal with it.

    New feature request has been submitted:

  • I guessed that it was a design choice.

    I think that there are at least three options for how to handle this;

    1. One option would be for each minimum period (a minute ?) to accumulate the top X items during that period. Then, when any longer period is shown, the top X items for the longer period will be shown individually, but anything that was included in the short period but is not in the top X items for the longer period would be displayed in an "Other" section shown at the top of each chart.

    This would NOT show the total wait time, there would be plenty of queries that would not show because they never made the top X in any time period, but this is probably not a major issue. This would also vary a little with time since the first period would only include the top X items during that period without showing any Other section but, if the second period had a slightly different selection of items, the first period would then have an other section. However, the selected items for each period would not vary, just the contents of the Other section. I think that this would have Other sections that were roughly in scale with the other sections, rather than overwhelming them.

    If 1 minute is the minimum and it were set to only show the top 3 items, it might look like this;

    pastedImage_0.png

    2. The simplest approach is probably to allow the Other section be toggled on and off. My main concern with this is that, if ALL of the other wait times were in the Other category, this might overwhelm the selected items, making it feel like dealing with the selected items is pointless. However, I do not know what proportion of the total wait time is usually displayed in the top 20 item list. If the selected items typically account for 50% of the total wait time, this is much less of an issue than if they only account for 1%

    3. Having a dual scale so that the Total Wait can be shown without conflicting with the Selected Item Waits... something like this;

    pastedImage_2.png

    I am not sure which would be the most practical and useful options so I am curious as to what the community thinks.

  • siggy's line for Total Wait is brilliant! I like his option #1 as well, but that might be complex to implement. I'd prioritize #2 over #1 but would like them both.