in Part 1, I showed you how to get the data needed to make an Enterprise SQL job manager using DPA. Now that it is out of the way, I will show you the inner workings to making this whole thing worth your while. The following is what the bulk of the work will look like
Truncate JobFailure in DB: pretty simply, don't want to repeat the same data every hour
code: truncate table dbo.JobFailure
Get and Load Data: pretty simply as well. it is a data task flow that goes to the dpa repository and pull it back any job failure within the last hour and store it in the destination database
code:
select
ah.DBNAME
,max(ah.ACTIONDATE) as ActionDate
,ahr.PARAMETERNAME
from CON_ALERT a
inner join CON_ALERT_HISTORY ah
on a.ID = ah.ALERTID
inner join CON_ALERT_HISTORY_RESULTS ahr
on ah.HISTORYID = ahr.HISTORYID
where a.ALERTNAME = 'SQL Server Job Failure'
and ACTIONDATE >= DATEADD(HOUR,-1, dateadd(hour, datediff(hour, 0, getdate()),0))
group by ah.DBNAME
, ahr.PARAMETERNAME
ORDER BY ActionDate desc
Update FailCounter: exactly what it say, update the FailCounter
Code:
update t
set t.FailCounter = t.FailCounter + 1
from Jobs t
join vw_FailedJobs s
on t.InstanceName = s.InstanceName
and t.JobName = s.JobName
Send Email: this one has some logic to it.
code:
DECLARE @failcounter INT
,@failthreshold INT
,@isalertactive CHAR(1)
,@instancename VARCHAR(100)
,@jobname VARCHAR(256)
,@stepname varchar(256)
,@joberr VARCHAR(4000)
,@support VARCHAR(500)
,@subject VARCHAR(500)
,@body VARCHAR(4000)
,@count int
set nocount on
select @count = count(*) from vw_FailedJobs
WHILE (@count != 0)
BEGIN
SELECT @failcounter = j.FailCounter
,@failthreshold = j.FailThreshold
,@isalertactive = j.IsAlertActive
,@instancename = j.InstanceName
,@jobname = v.JobName
,@stepname = v.StepName
,@joberr = v.JobErr
,@support = j.SupportTeam
FROM Jobs j
JOIN vw_FailedJobs v
ON j.InstanceName = v.InstanceName
AND j.JobName = v.JobName
WHERE v.rn = @count
IF (
@failcounter >= @failthreshold
AND @isalertactive = 'Y'
AND @support IS NOT NULL
)
BEGIN
SET @subject = '[Red] - job failure. ' + @jobname + ' failed on ' + @instancename + '.'
SET @body = 'The job failed at step: ' + @stepname + char(13) + char(10) + '
The job failed with the following error: ' + CHAR(13) + char(13) +
@joberr
EXEC msdb.dbo.sp_send_dbmail @from_address = 'no.reply@company_name.com'
,@recipients = @support
,@subject = @subject
,@body = @body
END
set @count -= 1;
END
set nocount off
In the vw_FailedJobs view, there's a function that returns the job information that was stored in dpa. I got the code from someone at solarwinds and threw it into a function.
Update Failed Counter:
code:
update t
set t.failcounter = 0
from jobs t
join vw_FailedJobs s
on t.InstanceName = s.InstanceName
and t.JobName = s.JobName
join jobs t1
on t.FailThreshold = t1.FailCounter
That's pretty much it. This process can be expanded to track whether or not a job should of ran, whether a job should of been enabled but was disabled, run time duration, etc ...