Alert for node if group has been down for x minutes

Hi all,

We would like to create an alert that triggers a new action when a node goes down in a group that has already been down for a certain amount of time.

Context:

A group is marked as down if one or more nodes within it go down.

We primarily use group alerting as we do not want to receive notifications for multiple devices in the same group going down at the same time.

Example:

A group has been down for 20 minutes, and a node within the group that was previously up goes down. This should create a new alert.

I would also be happy to switch from groups to custom properties if that would make this easier to implement.

Any assistance would be greatly appreciated. Thanks!

  • I would recommend testing this out, but I think this gets you in the right ballpark using a Custom SWQL Alert:

    SWQL Query with added comments:

    -- Inner Join the Nodes entity with the Container (Groups) entity, aligning on the NodeID and filtering to only Group Members that are actually nodes
    JOIN Orion.Container AS Groups ON Groups.Members.MemberPrimaryID = Nodes.NodeID AND Groups.Members.MemberEntityType = 'Orion.Nodes'

    -- Inner Join a sub-selection of the Container Status (Group Status) entity with the Container (Group), aligning on the ContainerID and keeping only Groups whose most recent Down record (Status = 2) is more than 20 minutes old
    JOIN ( SELECT ContainerID, MAX(DateTime) AS [LastDown] FROM Orion.ContainerStatus WHERE Status = 2 GROUP BY ContainerID HAVING MAX(DateTime) < ADDMINUTE(-20, GETUTCDATE()) ) AS [GroupStatus] ON GroupStatus.ContainerID = Groups.ContainerID

    -- Filter the Nodes entity to only find Nodes in a Down state (Status = 2)
    WHERE Nodes.Status = 2
  • Thanks, I'll have a look and test and let you know how we go!

  • After much trial and error, we couldn't get the query to work the way we wanted. I suspect that is due to our setup. However, one of my colleagues did manage to get a different query working, and I thought I'd share it.

    SELECT Nodes.Uri, Nodes.DisplayName
    FROM Orion.Nodes AS Nodes
    -- Attach each node to the group(s) it is a member of
    LEFT JOIN Orion.Container AS Groups ON Groups.Members.MemberPrimaryID = Nodes.NodeID
    -- Active alerts on groups; the alert ID (446) and the '%-%' name filter are specific to our environment
    LEFT JOIN (
      SELECT SUBSTRING(Alerts.AlertObjects.EntityNetObjectId,3,10) AS ObjectID, Alerts.AlertObjects.EntityCaption, Alerts.TriggeredDateTime
      FROM Orion.AlertActive AS Alerts
      WHERE
          Alerts.AlertObjects.EntityType = 'Orion.Groups'
      AND Alerts.AlertObjects.EntityCaption NOT LIKE '%-%'
      AND Alerts.AlertObjects.AlertID = '446'
    ) AS ActiveGroupAlerts ON Groups.Name = ActiveGroupAlerts.EntityCaption
    -- Most recent event of type 1 or 2 for each node
    LEFT JOIN (
       SELECT Events.Nodes.NodeID, MAX(Events.EventTime) AS TriggeredTime
       FROM Orion.Events AS Events
       WHERE
           EventType = 1
        OR EventType = 2
       GROUP BY Events.Nodes.NodeID
    ) AS StatusChange ON Nodes.NodeID = StatusChange.NodeID
    WHERE
      Groups.Name NOT LIKE '%-%'
    -- Skip nodes that are Up (1), Unmanaged (9), or External (11)
    AND Nodes.Status <> 1
    AND Nodes.Status <> 9
    AND Nodes.Status <> 11
    -- The node's group must have an active alert, and the node's latest event must be at least 20 minutes after that alert triggered
    AND ActiveGroupAlerts.TriggeredDateTime IS NOT NULL
    AND MINUTEDIFF(ActiveGroupAlerts.TriggeredDateTime, StatusChange.TriggeredTime) >= 20

    Hopefully this helps others!
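
For anyone who wants to check the final WHERE clause outside Orion, the filter can be modelled in a few lines of Python. This is only a sketch of the logic, not something from the thread: the function and parameter names are hypothetical, it assumes MINUTEDIFF(a, b) returns b minus a in minutes, and it counts whole elapsed minutes, which may round differently from SWQL near a minute boundary.

```python
from datetime import datetime, timedelta
from typing import Optional

# Node statuses the query excludes: Up (1), Unmanaged (9), External (11).
EXCLUDED_NODE_STATUSES = {1, 9, 11}


def minutes_between(start: datetime, end: datetime) -> int:
    """Whole minutes elapsed from start to end (floored)."""
    return int((end - start).total_seconds() // 60)


def should_fire_node_alert(
    node_status: int,
    group_alert_triggered: Optional[datetime],
    node_last_event: Optional[datetime],
    threshold_minutes: int = 20,
) -> bool:
    """Hypothetical stand-in for the WHERE clause of the shared query."""
    # Nodes.Status <> 1 AND <> 9 AND <> 11
    if node_status in EXCLUDED_NODE_STATUSES:
        return False
    # ActiveGroupAlerts.TriggeredDateTime IS NOT NULL; a missing node event
    # makes the MINUTEDIFF comparison NULL, so the row is dropped as well.
    if group_alert_triggered is None or node_last_event is None:
        return False
    # MINUTEDIFF(group alert time, node event time) >= 20
    return minutes_between(group_alert_triggered, node_last_event) >= threshold_minutes


if __name__ == "__main__":
    group_alert = datetime(2024, 1, 1, 12, 0)
    late_node_event = group_alert + timedelta(minutes=25)
    early_node_event = group_alert + timedelta(minutes=5)
    print(should_fire_node_alert(2, group_alert, late_node_event))
    print(should_fire_node_alert(2, group_alert, early_node_event))
```

A node whose last event falls within the first 20 minutes of the group alert is treated as part of the original outage and does not fire. A later event on a node that is still not up does.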