cancel
Showing results for 
Search instead for 
Did you mean: 

Custom Alert - Jobs running longer than usual

Custom Alert - Jobs running longer than usual

This computes the average duration of a job from the job history tables, and computes a number representing how many times longer it is currently running than usual.

For example.

Job1 usually takes 30 minutes.  If it takes 90 minutes, it will return 3.

It excludes jobs that take 30 minutes.  In my particular environment, that filters out some noise.

It excludes the Ola Hallengren index maintenance job.

Type: Custom SQL Alert - Multiple Numeric Return

SQL Statement:

SELECT job.name AS 'jobName',

(DATEDIFF(SECOND,ja.start_execution_date,GetDate()) ) / (art.avgRunTime) AS TimesLonger

FROM msdb.dbo.sysjobs AS job

    INNER JOIN

    (

        SELECT j.job_id,

               AVG((run_duration / 10000) * 3600 + (run_duration / 100 % 100) * 60 + run_duration % 100) AS 'avgRunTime' /* convert HHMMSS to seconds */

        FROM msdb.dbo.sysjobhistory jh

            INNER JOIN msdb.dbo.sysjobs j

                ON jh.job_id = j.job_id

        WHERE step_id = 0 -- only grab our total run-time

              AND run_status = 1 -- only grab successful executions

              AND msdb.dbo.agent_datetime(run_date, run_time) >= DATEADD(DAY, -30, GETDATE())

        GROUP BY j.job_id

    ) AS art

        ON job.job_id = art.job_id

    INNER JOIN msdb..sysjobactivity ja

        ON ja.job_id = job.job_id

WHERE job.enabled = 1   /* do not display disabled jobs*/

      AND NOT EXISTS

(    /* make sure this is the most recent run*/

    SELECT 1

    FROM msdb..sysjobactivity new

    WHERE new.job_id = ja.job_id

          AND new.start_execution_date > ja.start_execution_date

)

AND ja.start_execution_date IS NOT NULL

AND ja.stop_execution_date IS NULL /* has not stopped */

AND (DATEDIFF(SECOND,ja.start_execution_date,GetDate())) > (art.avgRunTime )

AND ja.session_id = (SELECT TOP 1 session_id FROM msdb.dbo.syssessions ORDER BY agent_start_date DESC) /* Most recent run */

/* Greater than 30 minutes */

AND (DATEDIFF(SECOND,ja.start_execution_date,GetDate())) > 1800

AND job.Name <> 'Maintenance | IndexOptimize - USER_DATABASES'

Medium level - Min: 3 Max: 5

High level - Min: 5 Max: Empty

Labels (1)
Comments

There is already a long running job alert inside of DPA. It defines "long" as a job that is more than 2 standard deviations from the mean over the past 90 days.

That alert does not allow me to exclude specific jobs.

glutenfreesql Thank you for the query. This is a nice alternative for those of us who do not have DPA.

I guess it makes sense to run this no more then every 30 minutes.

I'm not sure I understand your context for:

Medium level - Min: 3 Max: 5

High level - Min: 5 Max: Empty

Perhaps you mean that if I configure the alert to fire when the "statistic" (TimesLonger) is >= 3 < 5 it might be a Warning and >5 might be Critical; Is that what you mean?

Does it not return a value >5?

Does it not return a value <3?

Version history
Revision #:
1 of 1
Last update:
‎12-28-2017 01:22 PM
Updated by:
 
Contributors