cancel
Showing results for 
Search instead for 
Did you mean: 
Create Post

Custom Alert - Average Blocking Time per Session

Description

This alert will calculate the average blocking time per session and helps you understand the severity of the issue. It is similar to the Total Wait Time for Blocking - Repository alert in that it uses the data already in the repository, but instead of returning a total amount of blocking time, it divides that by the number of sessions being blocked. For example, there could be 200 sessions being blocked for 2-3 seconds, and you would get 600 seconds of blocking and be notified with the other alert mentioned. However, the problem is not too bad because each individual session was not blocked very long. However, if the scenario was instead 20 sessions blocked for 600 seconds, this alert would give you 30 seconds on average which may be a different severity.

Also see the Alternative section below if you want to retrieve the session with the maximum blocking time, i.e. the worst case scenario.

Alert Definition

To create the alert, click on Alerts > Manage Alerts tab and create a Custom Alert of type Custom SQL Alert - Single Numeric Return. Configure the alert with values similar to these:

Alert Name: Blocking Time per Session

Execution Interval: <= 10 Minutes down to 1 Minute

Notification Text: This instance just had a blocking problem where the average blocking time per session is below.

Active Database Instances: Add the Ignite Repository DB only

SQL Statement (Pick the proper statement below depending on if your Repository is on SQL Server or Oracle)


SQL Server Repository Statement:

SELECT COALESCE(AVG(timesecs), 0) avg_blocking_timesecs

FROM (

   SELECT sw.vdsi, ISNULL(SUM(sw.qp)/100,0) timesecs

   FROM consw_#DBID# sw

   INNER JOIN conev_#DBID# ev ON sw.keeq = ev.id

   WHERE sw.d >= DATEADD(n, -#FREQUENCY#, CURRENT_TIMESTAMP)

   AND ev.name LIKE 'LCK%'

   GROUP BY sw.vdsi) sw

Oracle Repository Statement:

SELECT NVL(AVG(timesecs), 0) avg_blocking_timesecs

FROM (

   SELECT sw.vdsi, NVL(SUM(sw.qp)/100,0) timesecs

   FROM consw_#DBID# sw

   INNER JOIN conev_#DBID# ev ON sw.keeq = ev.id

   WHERE sw.d >= CURRENT_TIMESTAMP - (#FREQUENCY# / 1440)

   AND ev.name LIKE 'enq%'

   GROUP BY sw.vdsi) sw


Execute Against: Repository

Units: Average Blocking Time (secs)

High Threshold: Min 30, Max empty

Medium Threshold: Min 15, Max 30

Alternative

Instead of summing the time with:

SELECT COALESCE(SUM(timesecs) / COUNT(1), 0) avg_blocking_timesecs

you could also use a MAX function to get the worst case blocking scenario with:

SELECT COALESCE(MAX(timesecs),0) max_blocking_timesecs

Comments

Every night, SSIS runs a procedure called SSISDB.internal.cleanup_server_retention_window, which blocks executions of SSISDB.internal.append_event_message for several minutes. I would like to add a WHERE clause to this custom alert query to filter out these nightly events. Which table and column name would I need to check for that procedure name?

Thanks,

Mark

Mark, is your repo on SQL or Oracle?  If you can grab the SQL hash value that we assign in the tool, then we can exclude that hash by referencing the consw_id.izho column.  So something like where sw.izho <> hash_number

Thanks for the quick response! I'm on SQL Server.

I can add "AND sw.izho <> 6041431186" to the WHERE clause, for example, but if the hash changes due to a restart or memory flush, etc., this won't help. Also, the hash will be different on each server. Is there a join I can add that will get me the procedure name? I don't know the Ignite repository.

You are correct that if the statement changes, it will get a different hash.  Regarding joining for the procedure name, try joining to the const_id table using the consw.izho and const.h columns.  In const, you should be able to pull the pname column for the SP.

That didn't seem to work, but I put in a hack that does. I added "AND CONVERT(TIME, GETDATE()) > CONVERT(TIME, '00:10')" to the query to suppress the alert for this situation, which always occurs between midnight and 12:10am. I'd prefer something more targeted, but this will get me by until I can install the SP2 for SQL Server 2012, which should fix the root cause of the problem.

References:

I followed the instructions when setting up the alert, but it's not working at all -- I set the threshold to 1 second to test, and no alerts. I cut and pasted the Oracle code because that's our repository platform. I tried running the sql provided at the command line and sqlplus doesn't like the "#" symbols -- are we supposed to substitute what's in between the pound symbols with values?  Let me know, thanks.

I think the issue is that the script had like 'LCK%' as the wait and with Oracle, that should have been 'enq%'.

I've edited the alert on Thwack.  Should be like below.  Let me know if this still errors.

If you want to test, you can run a "select id, name from cond" and substitute the id number for #DBID#, example:  conev_1

SELECT NVL(AVG(timesecs), 0) avg_blocking_timesecs

FROM (

   SELECT sw.vdsi, NVL(SUM(sw.qp)/100,0) timesecs

   FROM consw_#DBID# sw

   INNER JOIN conev_#DBID# ev ON sw.keeq = ev.id

   WHERE sw.d >= CURRENT_TIMESTAMP - (#FREQUENCY# / 1440)

   AND ev.name LIKE 'enq%'

   GROUP BY sw.vdsi) sw

Thanks mandevil -- that did  the trick!

Here is an example that alerts on any blocked session greater than 60 seconds for a given day between 6am and 8pm and then alerts on any blocked session above 300 seconds from 8pm to 6am.  If the return value is true the alert will return a 1 so you'll need to make sure your Min high value is set to 1.  Just another example of how you can turn your DPA alerts to be multi-conditional.

DECLARE

@MAX_BLK_TIME INT,

@HOUR INT,

@THRESHOLD INT

BEGIN

SELECT @HOUR = DATEPART(HOUR,CURRENT_TIMESTAMP);

SELECT @MAX_BLK_TIME = COALESCE(MAX(TIMESECS), 0)

FROM (

SELECT SW.VDSI, ISNULL(SUM(SW.QP)/100,0) TIMESECS

FROM CONSW_#DBID# SW

INNER JOIN CONEV_#DBID# EV ON SW.KEEQ = EV.ID

WHERE CAST(SW.D AS DATE) >= CAST(GETDATE() AS DATE)

AND EV.NAME LIKE 'LCK%'

GROUP BY SW.VDSI) SW;

-- SET THE THRESHOLD TO 60 SECONDS BETWEEN 6AM TO 8PM ELSE SET IT TO 300

IF (@HOUR >= 6 AND @HOUR <= 20)

                                SELECT @THRESHOLD = 60

ELSE

                                SELECT @THRESHOLD = 300;

IF @MAX_BLK_TIME >= @THRESHOLD

                                SELECT 1 --RETURN A 1 IF BLOCKING IS GREATER THEN 60 OR 300

                ELSE

SELECT 0

END

ashrecksiedler This is interesting actually I had seen an example where something like this was done with the built-in alert "Total Blocking Wait Time." Different values were needed for some times of the day so two copies of the alert were made with blackouts for the alert done on whatever one had the incorrect amount of blocking for a period.

I like this solution much better.

Hi ,

We have setup the job and found it is useful (we are running SQL2014)

but i have a problem,as this alert is to SQL instance level in configuration,but there is two database(A & B) running on this instance. is there a way to limit this alert to database A only? We do not expect to trigger any alert to database B when it reachs threshold for this blocking alert.

Thanks.

SQL Server Repository Statement:

SELECT COALESCE(AVG(timesecs), 0) avg_blocking_timesecs

FROM (

   SELECT sw.vdsi, ISNULL(SUM(sw.qp)/100,0) timesecs

   FROM consw_#DBID# sw

   INNER JOIN conev_#DBID# ev ON sw.keeq = ev.id

   WHERE sw.d >= DATEADD(n, -#FREQUENCY#, CURRENT_TIMESTAMP)

   AND ev.name LIKE 'LCK%'

   GROUP BY sw.vdsi) sw

The database info is stored in the CONSW_XX.IXOY column, and it joins to the CONO_XX table. Adding that join gives us something like this where you can replace 'master' with whatever you need:

SELECT COALESCE(AVG(timesecs), 0) avg_blocking_timesecs

FROM (

   SELECT sw.VDSI, ISNULL(SUM(sw.QP)/100,0) timesecs

   FROM ignite.CONSW_#DBID# sw

   INNER JOIN ignite.CONEV_#DBID# ev ON sw.KEEQ = ev.ID

   INNER JOIN ignite.CONO_#DBID# db ON sw.IXOY = db.ID

   WHERE sw.D >= DATEADD(n, -#FREQUENCY#, CURRENT_TIMESTAMP)

   AND ev.NAME LIKE 'LCK%'

   AND db.NAME = 'master'

   GROUP BY sw.VDSI) sw

Thank you very much!

CONO_#DBID# is a table with the O/S user name on an Oracle monitored instance and is the Database name in SQL server. Just wanted to mention this as this page has both Oracle and SQL examples.

one more question, is it possible to limit the monitoring scope to schema level in this alert job? like database A has two schemas: schema_a and schema_b , but i only need to monitor blocking performance of schema_a,but ignore to schema_b.

Thanks.

Version history
Revision #:
1 of 1
Last update:
‎12-23-2013 11:51 AM
Updated by: