Is there a standard alert that informs the administrator when a DPA monitor is OFF or unresponsive? Or is it possible to create an alert that will effectively monitor the monitors?
I was also attempting to figure out a way to add this alert, and it took me a while to figure it out.
Here is what I had to do in the end to figure this out.
Method 1: I used this first but did not like it.
Run a query against the dpa repo to see if any database is in a state that is not "start". I didn't like this method because you cannot join on the #DBID# since all alerts stop for a db if it is not in a START state. If you did join on #DBID#, the alert would never trigger.
So, for this method, I had to:
This is the query I used here, quite simple: SELECT FailLog.CONTEXT AS Database_Name, COUNT(FailLog.ID) AS Failure_Count FROM ignite.CONLOG FailLog WHERE FailLog.PRIORITY IN ('INFO', 'WARN') AND FailLog.MODULE = 'DatabaseMonitor' AND FailLog.MESSAGE LIKE '%failed to start%' -- Time Logic: Entry is "within past 3 hours" BUT "older than 2 hours" AND FailLog.LOG_DATE BETWEEN DATEADD(HOUR, -3, GETDATE()) AND DATEADD(HOUR, -2, GETDATE()) -- Exclude if it successfully started AFTER the failure AND NOT EXISTS ( SELECT 1 FROM ignite.CONLOG SuccessLog WHERE SuccessLog.CONTEXT = FailLog.CONTEXT AND SuccessLog.MESSAGE LIKE 'Started monitoring database%' AND SuccessLog.LOG_DATE > FailLog.LOG_DATE -- Must be LATER than the failure ) GROUP BY FailLog.CONTEXT HAVING COUNT(FailLog.ID) > 0;
Method 2: I love this method and it works perfectly.
Pre-req note: Solarwinds does not recommend monitoring your dpa databases with DPA.
Pre-req: Since there was no built-in method to do this simple check against the dpa repo, I used 1 license to add my dpa db into DPA. After adding, I disabled as much as I could for that database:
The instance assigned to this alert is the dpa database.
Here is the query I wrote to trigger this alert: SELECT FailLog.CONTEXT AS Database_Name, COUNT(FailLog.ID) AS Failure_Count FROM ignite.CONLOG FailLog INNER JOIN ignite.COND MonitoredDB ON MonitoredDB.NAME = FailLog.CONTEXT INNER JOIN ignite.CONLIC_INSTANCE_ALLOCATION License ON License.CONSUMER_ID = CAST(MonitoredDB.ID AS VARCHAR(50)) + '|A' WHERE FailLog.PRIORITY IN ('INFO', 'WARN') AND FailLog.MODULE = 'DatabaseMonitor' AND ( FailLog.MESSAGE LIKE '%failed to start%' OR FailLog.MESSAGE LIKE '%username and/or password must be updated%' OR FailLog.MESSAGE LIKE 'Monitor for database%stopped%' ) -- AND FailLog.LOG_DATE > DATEADD(MINUTE, -60, GETDATE()) --AND MonitoredDB.ID = #DBID# -- Exclude if it successfully started AFTER the failure/stop event AND NOT EXISTS ( SELECT 1 FROM ignite.CONLOG SuccessLog WHERE SuccessLog.CONTEXT = FailLog.CONTEXT AND SuccessLog.MESSAGE LIKE 'Started monitoring database%' AND SuccessLog.LOG_DATE > FailLog.LOG_DATE ) GROUP BY FailLog.CONTEXT HAVING COUNT(FailLog.ID) > 0;
Ensure you do not add the alert to any alert groups, keep it separate. Otherwise, it'll trigger multiple times; once per db in that group.
Alert Levels:
I also wrote a custom template and assigned it to this alert (for Method 2):
Subject: DPA Alert: [=alert.alertName] - [=alert.status.value]
Body text:
<b>Alert:</b> [=alert.alertName] <b>Database Instance(s):</b> [#list alert.results as result] • [#if result.parameterName??][=result.parameterName][#else]Unknown[/#if] [/#list] <b>Execution Time:</b> [=alert.executionTime]
[#if alert.notificationText?has_content] <b>Remediation Steps: </b> [=alert.notificationText] [/#if]
<i>This message was system-generated. Do not reply to this message.</i> Hope this works for you as well as it has worked for me.