Level 7

Alert for node if group has been down for x minutes


Hi all,

We would like to create an alert that triggers a new action when a node goes down in a group that has already been down for a certain amount of time.

Context:

A group is marked as down if one or more nodes within it go down.

We primarily use group alerting as we do not want to receive notifications for multiple devices in the same group going down at the same time.

Example:

A group has been down for 20 minutes, and a node in that group that was previously up goes down. This should create a new alert.

I would also be happy to switch from groups to custom properties if that would make this easier to implement.

Any assistance would be greatly appreciated. Thanks!


Accepted Solutions
Level 15

Re: Alert for node if group has been down for x minutes


I would recommend testing this out, but I think this gets you in the right ballpark using a Custom SWQL Alert:

[Image: pastedImage_0.png]

SWQL Query with added comments:

-- Inner Join the Nodes entity with the Container (Groups) entity, aligning on the NodeID and filtering to only Group Members that are actually nodes
JOIN Orion.Container AS Groups ON Groups.Members.MemberPrimaryID = Nodes.NodeID AND Groups.Members.MemberEntityType = 'Orion.Nodes'

-- Inner Join a sub-selection of the Container Status (Group Status) entity with the Container (Group) entity, aligning on the ContainerID and keeping only Groups whose most recent Down (Status = 2) record is more than 20 minutes old
JOIN (
    SELECT ContainerID, MAX(DateTime) AS [LastDown]
    FROM Orion.ContainerStatus
    WHERE Status = 2
    GROUP BY ContainerID
    HAVING MAX(DateTime) < ADDMINUTE(-20, GETUTCDATE())
) AS [GroupStatus] ON GroupStatus.ContainerID = Groups.ContainerID

-- Filter the Nodes entity to only find Nodes in a Down state (Status = 2)
WHERE Nodes.Status = 2
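
To make it easier to test what the sub-select actually keeps, here is a rough Python sketch of the same GROUP BY / HAVING logic. The function name, the tuple layout, and the sample rows are made up for illustration; they only stand in for rows of Orion.ContainerStatus and are not part of any SolarWinds API.

```python
from datetime import datetime, timedelta, timezone

def groups_down_longer_than(status_rows, minutes, now):
    """Mimic the sub-select: per ContainerID, take the newest Down (Status = 2)
    timestamp and keep the group only if that timestamp is older than the cutoff.

    status_rows is a hypothetical list of (container_id, status, recorded_at) tuples.
    """
    cutoff = now - timedelta(minutes=minutes)
    latest_down = {}
    for container_id, status, recorded_at in status_rows:
        if status != 2:  # WHERE Status = 2
            continue
        # MAX(DateTime) ... GROUP BY ContainerID
        if container_id not in latest_down or recorded_at > latest_down[container_id]:
            latest_down[container_id] = recorded_at
    # HAVING MAX(DateTime) < ADDMINUTE(-20, GETUTCDATE())
    return {cid for cid, ts in latest_down.items() if ts < cutoff}

# Illustrative data only
now = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
rows = [
    (10, 2, now - timedelta(minutes=45)),  # last Down record 45 minutes ago: kept
    (11, 2, now - timedelta(minutes=5)),   # last Down record 5 minutes ago: dropped
    (12, 1, now - timedelta(minutes=90)),  # no Down record at all: dropped
]
print(groups_down_longer_than(rows, 20, now))  # {10}
```

One thing the sketch makes visible: like the sub-select, it only looks at Down rows, so it does not check whether the group has come back up since its last Down record. That is worth verifying against your own data.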

Level 7

Re: Alert for node if group has been down for x minutes


Thanks, I'll have a look and test and let you know how we go!

Level 7

Re: Alert for node if group has been down for x minutes


After much trial and error, we couldn't get the query to work the way we wanted; I suspect this is due to our setup. However, one of my colleagues did manage to get a different query working, so I thought I'd share it.

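SELECT Nodes.Uri, Nodes.DisplayName
FROM Orion.Nodes AS Nodes
-- Join each node to the group(s) it belongs to
-- (note: unlike the accepted solution, this join does not filter on MemberEntityType = 'Orion.Nodes')
LEFT JOIN Orion.Container AS Groups ON Groups.Members.MemberPrimaryID = Nodes.NodeID
-- Active alerts on groups for one alert definition (AlertID 446 is the ID in our environment; yours will differ)
LEFT JOIN (
  SELECT SUBSTRING(Alerts.AlertObjects.EntityNetObjectId,3,10) AS ObjectID, Alerts.AlertObjects.EntityCaption, Alerts.TriggeredDateTime
  FROM Orion.AlertActive AS Alerts
  WHERE
      Alerts.AlertObjects.EntityType = 'Orion.Groups'
  AND Alerts.AlertObjects.EntityCaption NOT LIKE '%-%'
  AND Alerts.AlertObjects.AlertID = '446'
) AS ActiveGroupAlerts ON Groups.Name = ActiveGroupAlerts.EntityCaption
-- Most recent event of type 1 or 2 for each node
LEFT JOIN (
   SELECT Events.Nodes.NodeID, MAX(Events.EventTime) AS TriggeredTime
   FROM Orion.Events
   WHERE
       EventType = 1
    OR EventType = 2
   GROUP BY Events.Nodes.NodeID
) AS StatusChange ON Nodes.NodeID = StatusChange.NodeID
WHERE
  -- Skip groups with a hyphen in the name
  Groups.Name NOT LIKE '%-%'
-- Exclude nodes that are Up (1), Unmanaged (9), or External (11)
AND Nodes.Status <> 1
AND Nodes.Status <> 9
AND Nodes.Status <> 11
-- Only nodes whose group currently has the group alert active
AND ActiveGroupAlerts.TriggeredDateTime IS NOT NULL
-- The node's latest status-change event came at least 20 minutes after the group alert triggered
AND MINUTEDIFF(ActiveGroupAlerts.TriggeredDateTime, StatusChange.TriggeredTime) >= 20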
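
For anyone who wants to sanity-check the last two conditions outside of Orion, here is a minimal Python sketch of the same rule. The function and parameter names are hypothetical, and it compares elapsed time directly; I have not verified exactly how MINUTEDIFF rounds, so results right at the 20-minute edge may differ from the query.

```python
from datetime import datetime, timedelta

def node_went_down_late(group_alert_triggered, node_last_status_change, threshold_minutes=20):
    """Return True when the node's latest status change happened at least
    threshold_minutes after the group alert triggered.

    Missing values return False, mirroring the IS NOT NULL check in the query.
    """
    if group_alert_triggered is None or node_last_status_change is None:
        return False
    elapsed = node_last_status_change - group_alert_triggered
    return elapsed >= timedelta(minutes=threshold_minutes)

# Illustrative timestamps only
group_alert = datetime(2020, 1, 1, 12, 0)
print(node_went_down_late(group_alert, group_alert + timedelta(minutes=25)))  # True
print(node_went_down_late(group_alert, group_alert + timedelta(minutes=5)))   # False
```

The 20-minute default matches the example in the original question; change it to whatever window suits your groups.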

Hopefully this helps others!
