We have a custom SQL alert (well, we actually have a few of them) that does a calculation that uses SAM to capture processor queue length and compare it against the total number of available CPUs plus the actual CPU utilization. (Thanks Leon Adato!) We noticed last week that a bunch of these alerts weren't clearing out when the conditions in trigger were no longer true. Thinking it was a possible app and/or DB issue we opened an incident with support. This was the query that was in the database for our trigger reset (as recorded in the database) when we had selected "Reset when trigger conditions are no longer true". Note the WHERE NOT in the Original Query below.
To fix our problem we changed the reset condition to "Reset this alert when the following conditions are met" which copies the trigger query to the reset tab. We modified the query as follows:
1) Changed the (nodes.CPU_Crit is null AND nodes.CPULoad > 90) to (nodes.CPU_Crit is null AND nodes.CPULoad < 90)
2) Changed the (nodes.CPU_Crit is not null AND nodes.CPULoad > nodes.CPU_Crit) to (nodes.CPU_Crit is not null AND nodes.CPULoad < nodes.CPU_Crit)
3) Although we didn't have to change it when we copied it from the trigger actions, the reset query is now a WHERE not a WHERE NOT.
Question: How do you handle custom SQL alert reset conditions?
Original Query
SELECT Nodes.NodeID AS NetObjectID, Nodes.Caption AS Name
FROM Nodes /*SplitMarker*/inner join APM_AlertsAndReportsData on (Nodes.NodeID = APM_AlertsAndReportsData.NodeId)
INNER join (select c1.NodeID, COUNT(c1.CPUIndex) as CPUCount
from (select DISTINCT CPUMultiLoad.NodeID, CPUMultiLoad.CPUIndex
from CPUMultiLoad) c1
group by c1.NodeID) c2 on Nodes.NodeID = c2.NodeID
WHERE NOT
- Nodes.n_mute <> 1
AND Nodes.Prod_State = 'PROD'
AND APM_AlertsAndReportsData.ComponentName = 'Win_Processor_Queue_Len'
AND APM_AlertsAndReportsData.StatisticData > c2.CPUCount
AND
(
(nodes.CPU_Crit is null
AND nodes.CPULoad > 90)
OR (nodes.CPU_Crit is not null
AND nodes.CPULoad > nodes.CPU_Crit)
)