cancel
Showing results for 
Search instead for 
Did you mean: 
Create Post

Custom Alert - Long Running Queries

DESCRIPTION

This alert is for SQL Server instances only and will tell you when SQL statements have executed more than a defined amount of time. This example uses a 300 second threshold, meaning any SQL statements that have been executing for longer than 300 seconds (5 minutes) will be alerted. The alert query executes against the monitored database directly so caution should be taken on how often the alert is executed, especially if less than the default of every 300 seconds.

ALERT DEFINITION

To create the alert, click on Alerts > Manage Alerts tab and create a Custom Alert of type Custom SQL Alert - Multiple Numeric Return. Configure the alert with values similar to these:

Alert Name: SQL Server - Long Running Queries

Execution Interval: 5 Minutes

Notification Text: The following list of SQL statements have been running for over 5 minutes.

SQL Statement:

SELECT

'SPID='+CONVERT(VARCHAR,a.session_id)+' has been running the following for '+CONVERT(VARCHAR,DATEDIFF(SS, a.start_time, CURRENT_TIMESTAMP))+' seconds: '+convert(varchar,b.text), CONVERT(VARCHAR,DATEDIFF(SS, a.start_time, CURRENT_TIMESTAMP))

FROM sys.dm_exec_requests a

CROSS APPLY sys.dm_exec_sql_text(a.sql_handle) b

WHERE a.status <> 'background'

AND DATEDIFF(SS, a.start_time, CURRENT_TIMESTAMP) > 300 --This value of 300 must match the Execution Interval set in the alert

Execute Against: Monitored Database

Units: Seconds

High Threshold: MIN 300 (This value must match value found in query. Default: 300 ), MAX empty

Labels (1)
Tags (2)
Comments

Very helpful custom alert, thank you for providing that for community!

I just tried to extend this report e-mail output with long running query and hash added into the output:

SELECT 'session_id ' +

CONVERT(VARCHAR,a.session_id) + ' query_hash: '+ convert(varchar, a.query_hash, 2) +' seconds: ' + CONVERT(VARCHAR,DATEDIFF(SS, a.start_time, CURRENT_TIMESTAMP)) +' sql_text: '+ convert(varchar(8000),b.text) +' query_hash: '+ convert(varchar, a.query_hash, 2) +' query_plan_hash: ' + convert(varchar,a.query_plan_hash, 2), CONVERT(VARCHAR,DATEDIFF(SS, a.start_time, CURRENT_TIMESTAMP))

FROM sys.dm_exec_requests a

CROSS APPLY sys.dm_exec_sql_text(a.sql_handle) b

WHERE a.status <> 'background'

AND DATEDIFF(SS, a.start_time, CURRENT_TIMESTAMP) > 1

I tested that query on my database and got full queries text output,  what's happened with my ignite alert output it's just a short begin part of the query, the question - how is it possible to get full long running sql queries output in addition to the current e-mail report?

If you are still running as an alert, unfortunately, we are limited to 100 characters currently for the output.  That's part of our data model and probably not easily addressed.  Have you submitted this as a feature request here on Thwack?

Not yet, it would be great to get this feature in feature, how to submit it?

Thanks!

If you click on the link below, you can submit and it can get voted up.  Our product managers monitor these requests.

Database Performance Analyzer Feature Requests

Great possibility! Thanks!

Hello!

Could you answer to couple of questions,

1. I'm watching statistics of the  highest execution time available queries in Ignite -> Mydatabaseserver -> Current -> Long Running (Average) list. This is a highest average execution time window fixed by Ignite for the last 1 hour. What is the difference between "CURRENT_TIMESTAMP" value configured for checking in your alert and Long Running (Average) one hour monitoring current window check in Ignite, can you reconfigure your alert to the same "Long Running Average" check?

2. How can we build "Long Running Average" queries history log for all time, instead of just one hour real time Current window, do we need to create custom report, or maybe some Ignite features/custom reports already available?



Thanks!

1. The alerts and the calc for the long running list are not kept in sync.  I believe the long running average list is updated every 10 minutes (tied to our stats poll where we collect the number of executions).  This would make sense as we can't do an average execution time until we know how many times it got executed.  If you go to your logs directory, you should see a contime.csv file which you can open in excel.  You can use sorting to look for the activity "SQLSTATS-FETCH" and look at the latest runs to see when they are executing (for example 5:52, 6:02, 6:12, etc.). Then you can try to create your alert or activate it at the same time-ish and use a 10 minute interval.  Now when DPA gets cycled or when monitoring gets interrupted for whatever reason, the statspoll may shift right or left, but the alert likely will not, so there will be drift.  Short answer is that this would be very difficult to keep in sync with our current methodology (IMO).

2. If you have identified the long running average queries you want to focus more on, I'd recommend using the View Historical Chart feature in the tool which will provide average execution times per day for a historical setting.  Take a look at that and see if that meets what you need.

Thanks for so great explanation! I'm going to check it now,

Could you help with converting our alerts sql hash ' query_hash: ' which looks like that from e-mail -"65C0914DFC1B49C4" to ignite web sql hash which looks like that - ""4389872600" for next search in ignite? From our new custom modified alert:

SELECT 'session_id ' +

CONVERT(VARCHAR,a.session_id) + ' query_hash: '+ convert(varchar, a.query_hash, 2) +' query_plan_hash: ' + convert(varchar,a.query_plan_hash, 2) + ' seconds: ' + CONVERT(VARCHAR,DATEDIFF(SS, a.start_time, CURRENT_TIMESTAMP)) +' sql_text: '+ convert(varchar(8000),b.text) , CONVERT(VARCHAR,DATEDIFF(SS, a.start_time, CURRENT_TIMESTAMP))

FROM sys.dm_exec_requests a

CROSS APPLY sys.dm_exec_sql_text(a.sql_handle) b

WHERE a.status <> 'background'

AND DATEDIFF(SS, a.start_time, CURRENT_TIMESTAMP) > 3 -- for children of nature, more than 60 seconds execution time

Thanks!

That we can't do.  The query you are running for the alert as defined runs directly against the monitored instance.  The query_plan_hash being selected is directly from the sys.dm_exec_requests DMV and DPA isn't doing it's own hash as it never actually flows through our app (we only show what SQL shows us).

Sorry about that!

Hello,

this alert - is what I'm looking for, but having "non-standart" sql_hash makes it useless...

Isn't there any way to find the hash value like in a web presentation (I mean 10-digit format)?
Maybe, by making a 'join' (based, for instance, on session_id) with some other table? Or execute completely anoter query, on another tables?
Thanks!

It's on our roadmap to persist/display the SQL handle which would allow the bridge back to SQL Server or from SQL Server to DPA in this case.  However, that hasn't been implemented yet.  I respectfully disagree about the "uselessness" of this alert though. I'd assume you are running this to notify you that a SPID is running a query for longer than 5 minutes (as defined above).  When you get this alert, you can hop into DPA and filter down to the interval or timeframe this occurred, flip over to sessions tab, look for the appropriate SPID, click on it and get all the detail you want.  This is just letting you know there's something you need to take a look at...

Thanks for your answer!

It's good news that you are working on displaying SQL handle.

And your proposition about using sessioin to determine appropriate slow SQL looks great, I think this way problems can be determined fast enough.
By the way, can you tell please, for what period of time session_id is unique? For hour or day, etc.

Looking into the problem from the other side - can't we use query for alert like Ignite use for create daily report (to show "Top waits", for instance)?

SPIDs will get reused.  Remember, we're displaying what SQL Server is telling us.  A SPID could be used by app A between 2:01 and 2:03, then release it and then that same SPID will be used by app B between 2:03:01 and 2:03:05.  If applications are using connection pooling (best practice), the app hopping should not be significant.  So the answer to this one is it depends on your apps.  8 )

Can you clarify the next question.  Are you wanting to create a daily report to show top queries?  Or use the data from the alerts to build a report?

Expected behavior of SPID lifecycle)
About next question:
I'd like to have an alert with report-alike information. Or, in other words, I'd like to use report's query (with more stringent time range - not for all day, but for last 15 minutes, for instance) for alert.

Very possible.  Whatever TSQL you can come up with will work as long as the output is acceptable.

For the single numeric, that one's self-explanatory.  For the multi numeric, that one can have one alpha/numeric and one numeric value (the numeric always second).

In fact, we've had lots of folks grab some of our custom content and tweak for other uses.

That would be avesome.

I need to converse with my teammates to make a final decision what alert to use. Nevertheless, I'd like to work with report's SQL. Could you tell me, where can find it?
Thank you so much for your help!

Go here:  https://thwack.solarwinds.com/community/database-management/database-performance-analyzer and scroll down.  On the left hand side you will see a link to custom SQL content including reports, alerts and even metrics.  You can use them as is and plug them into DPA or you can tweak them per your own requirements.  You're most welcome!

That's very interesting, I just found a custom report Custom Alert - SQL Statements with Abnormal Execution Counts which outputs exactly required hash from Ignite to e-mail, it's maybe possible to use this modified report for our required alerts?

Thanks

For sure.  See constraints in my previous post though about outputs.

We just upgraded to the latest DPA version, many new features appeared, going to review this features which could help us with long running query monitoring.

Hi, Here's my version. Btw I think this should be an out of the box alert for DPA! I requested the feature. You should vote! https://thwack.solarwinds.com/ideas/5522

I added some important info (username/host/program/spid), eliminated duplicates, commented out the text cause it was causing problem and I could see it better on DPA session tab with the spid, excluded multiple thread for same spid for specific user because the loginame was blank on the other threads.

SELECT Distinct

     'LoginName: '+RTRIM(CONVERT(VARCHAR(50),s.loginame))+', Hostname: '+RTRIM(CONVERT(VARCHAR(50),s.hostname))+', ProgramName: '+RTRIM(CONVERT(VARCHAR(200),s.program_name))+', SPID: '+RTRIM(CONVERT(VARCHAR,s.spid))

     --+char(10)+'Duration: '+CONVERT(VARCHAR,DATEDIFF(SS, r.start_time, CURRENT_TIMESTAMP))+' seconds.'+char(10)

     --+CHAR(10)+convert(varchar(MAX),b.text)

     , CONVERT(VARCHAR,DATEDIFF(SS, r.start_time, CURRENT_TIMESTAMP))

FROM master..sysprocesses as s with(NOLOCK)

LEFT OUTER JOIN sys.dm_exec_requests AS r WITH(NOLOCK)

ON (s.spid = r.session_id)

CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) b

WHERE r.status <> 'background'

AND DATEDIFF(SS, r.start_time, CURRENT_TIMESTAMP) > 1

AND s.spid NOT IN (select distinct s.spid

                    from master..sysprocesses as s with(NOLOCK)

                    where s.loginame = 'DOMAIN\SqlServiceAccount')

I think that this is a good idea, but I think it needs refinement:

1. Without the query text, there is no way to know which query for a

SPID is running long. I can see why trying to include the query text would

be a problem, but you could include the handle (r.sql_handle) instead.

2. What is long for one query is not long for another. Perhaps you could

take the approach from the long-running agent jobs query and base the

threshold on, if I recall correctly, 2 standard deviations longer than the

average for the sql_handle. Of course, that won't help for

non-parameterized queries.

3. There is also a hard-coded value for the loginame value to exclude,

which I think would also make it impossible to make a standard part of the

product as is. But I can't speak for SolarWinds, I'm just a customer.

- Mark

I saw hfleitas​ version of this, but I don't understand how the query is defined as long running query without the 300 value of original query.

Anyways, I would like to have the username like "LoginName: '+RTRIM(CONVERT(VARCHAR(50),s.loginame))+'," in the original query, is that something possible?

New to MSSQL here.  this is a great alert and something I was looking for.  One issue I have is it automatically fires an alert for an idle session owned by the OS called 'SP_SERVER_DIAGNOSTICS_SLEEP'.  Nothing wrong with the session as far as I can tell, I think its necessary for AG health.  Any idea how I might exclude this one session from the check?

Version history
Revision #:
1 of 1
Last update:
‎02-06-2014 11:57 AM
Updated by: