We would like to create an alert where if a node goes down in a group that was already down for a certain amount of time, it triggers a new action.
A group will be marked as a down if 1 or more nodes go down with in it.
We primarily use group alerting as we do not want to receive notifications for multiple devices in the same group going down at the same time.
A group is down for 20 minutes, a node that was up within the group previously goes down. this should create a new alert.
I would also be happy to switch from groups to custom properties if it would make it easier to action
Any assistance would be greatly appreciated.Thanks!
Solved! Go to Solution.
I would recommend testing this out, but I think this gets you in the right ballpark using a Custom SWQL Alert:
SWQL Query with added comments:
-- Inner Join the Nodes entity with the Container (Groups) entity, aligning on the NodeID and filtering to only Group Members that are actually nodes
JOIN Orion.Container AS Groups ON Groups.Members.MemberPrimaryID = Nodes.NodeID AND Groups.Members.MemberEntityType = 'Orion.Nodes'
-- Inner Join a sub-selection of the Container Status (Group Status) entity with the Container (Group), aligning on the ContainerID and filtering on Groups that have been Down (Status = 2) for over 20 minutes
JOIN ( SELECT ContainerID, MAX(DateTime) AS [LastUp] FROM Orion.ContainerStatus WHERE Status = 2 GROUP BY ContainerID HAVING MAX(DATETIME) < ADDMINUTE(-20,GETUTCDATE()) ) AS [GroupStatus] ON GroupStatus.ContainerID = Groups.ContainerID
-- Filter the Nodes entity to only find Nodes in a Down state (Status = 2)
WHERE Nodes.Status = 2
After much trial and error we couldn't get the query to work in the way we wanted, I have a suspicion that it may be due to our setup. However one of my colleagues did manage to get a different query working and I thought I'd share it.
SELECT Nodes.Uri, Nodes.DisplayName
FROM Orion.Nodes AS Nodes
LEFT JOIN Orion.Container AS Groups ON Groups.Members.MemberPrimaryID = Nodes.NodeID
LEFT JOIN (
SELECT SUBSTRING(Alerts.AlertObjects.EntityNetObjectId,3,10) AS ObjectID, Alerts.AlertObjects.EntityCaption, Alerts.TriggeredDateTime
FROM Orion.AlertActive AS Alerts
Alerts.AlertObjects.EntityType = 'Orion.Groups'
AND Alerts.AlertObjects.EntityCaption NOT LIKE '%-%'
AND Alerts.AlertObjects.AlertID = '446'
) AS ActiveGroupAlerts ON Groups.Name = ActiveGroupAlerts.EntityCaption
SELECT Events.Nodes.NodeID, MAX(Events.EventTime) AS TriggeredTime
EventType = 1
OR EventType = 2
GROUP BY Events.Nodes.NodeID
) AS StatusChange ON Nodes.NodeID = StatusChange.NodeID
Groups.NAME NOT LIKE '%-%'
AND Nodes.Status <> 1
AND Nodes.Status <> 9
AND Nodes.Status <> 11
AND ActiveGroupAlerts.TriggeredDateTime IS NOT NULL
AND MINUTEDIFF(ActiveGroupAlerts.TriggeredDateTime, StatusChange.TriggeredTime) >= 20
Hopefully this helps others!
SolarWinds solutions are rooted in our deep connection to our user base in the THWACK® online community. More than 150,000 members are here to solve problems, share technology and best practices, and directly contribute to our product development process. Learn more today by joining now.