3 Replies Latest reply on Jul 26, 2019 8:50 PM by ff_hernan

    Alert for node if group has been down for x minutes

    ff_hernan

      Hi all,

       

      We would like to create an alert that triggers a new action when a node goes down in a group that has already been down for a certain amount of time.

       

      Context:

      A group is marked as down if one or more nodes within it go down.

      We primarily use group alerting as we do not want to receive notifications for multiple devices in the same group going down at the same time.

       

      Example:

      A group has been down for 20 minutes, and a node within the group that was previously up goes down. This should create a new alert.

       

      I would also be happy to switch from groups to custom properties if that would make this easier to action.

       

      Any assistance would be greatly appreciated. Thanks!

        • Re: Alert for node if group has been down for x minutes
          zackm

          I would recommend testing this out, but I think this gets you in the right ballpark using a Custom SWQL Alert:

          SWQL Query with added comments:

           

          -- In a Custom SWQL Alert on nodes, the fixed part (SELECT Nodes.Uri, Nodes.DisplayName FROM Orion.Nodes AS Nodes) comes first, so the condition starts at the JOIN
          -- Inner join the Nodes entity with the Container (Groups) entity, matching on NodeID and keeping only group members that are actually nodes
          JOIN Orion.Container AS Groups
            ON Groups.Members.MemberPrimaryID = Nodes.NodeID AND Groups.Members.MemberEntityType = 'Orion.Nodes'

          -- Inner join a sub-selection of the Container Status (Group Status) entity with the Container (Group), matching on ContainerID and keeping only groups whose most recent Down record (Status = 2) is more than 20 minutes old
          JOIN (
            SELECT ContainerID, MAX(DateTime) AS [LastDown]
            FROM Orion.ContainerStatus
            WHERE Status = 2
            GROUP BY ContainerID
            HAVING MAX(DateTime) < ADDMINUTE(-20, GETUTCDATE())
          ) AS [GroupStatus] ON GroupStatus.ContainerID = Groups.ContainerID

          -- Filter the Nodes entity to only find nodes in a Down state (Status = 2)
          WHERE Nodes.Status = 2
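
To make the sub-select's condition concrete, here is a minimal Python sketch of the same test outside of SWQL. The function name `group_qualifies` and its arguments are hypothetical, made up for illustration. The only thing taken from the query is the rule itself: find the latest timestamp among a group's Status = 2 rows and check that it is more than 20 minutes old. Whether that really means "down for over 20 minutes" depends on how Orion.ContainerStatus rows are written in your environment, which this thread does not state, so treat it as something to verify.

```python
from datetime import datetime, timedelta, timezone


def group_qualifies(down_timestamps, now, minutes=20):
    """Mirror of: HAVING MAX(DateTime) < ADDMINUTE(-20, GETUTCDATE()).

    down_timestamps: datetimes of the group's rows with Status = 2
                     (hypothetical input, standing in for Orion.ContainerStatus).
    now:             current UTC time.
    """
    if not down_timestamps:
        # No Down rows at all: the inner join would drop this group.
        return False
    latest_down = max(down_timestamps)
    return latest_down < now - timedelta(minutes=minutes)


if __name__ == "__main__":
    now = datetime(2019, 7, 26, 12, 0, tzinfo=timezone.utc)
    old_only = [now - timedelta(minutes=45), now - timedelta(minutes=30)]
    with_recent = old_only + [now - timedelta(minutes=5)]
    print(group_qualifies(old_only, now))
    print(group_qualifies(with_recent, now))
```

Note that a group with a recent Down row does not qualify under this rule, even if it also has older Down rows. If Down rows keep being written while the group stays down, the condition may never be met, so it is worth checking a few rows of Orion.ContainerStatus before relying on it.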
            • Re: Alert for node if group has been down for x minutes
              ff_hernan

              Thanks, I'll have a look and test and let you know how we go!

                • Re: Alert for node if group has been down for x minutes
                  ff_hernan

                  After much trial and error, we couldn't get the query to work the way we wanted; I suspect that may be due to our setup. However, one of my colleagues did manage to get a different query working, so I thought I'd share it.

                  SELECT Nodes.Uri, Nodes.DisplayName
                  FROM Orion.Nodes AS Nodes
                  LEFT JOIN Orion.Container AS Groups ON Groups.Members.MemberPrimaryID = Nodes.NodeID
                  -- Active alerts on groups, restricted to a single alert definition (the AlertID value is installation-specific)
                  LEFT JOIN (
                    SELECT SUBSTRING(Alerts.AlertObjects.EntityNetObjectId,3,10) AS ObjectID, Alerts.AlertObjects.EntityCaption, Alerts.TriggeredDateTime
                    FROM Orion.AlertActive AS Alerts
                    WHERE
                        Alerts.AlertObjects.EntityType = 'Orion.Groups'
                    AND Alerts.AlertObjects.EntityCaption NOT LIKE '%-%'
                    AND Alerts.AlertObjects.AlertID = '446'
                  ) AS ActiveGroupAlerts ON Groups.Name = ActiveGroupAlerts.EntityCaption
                  -- Most recent event of type 1 or 2 for each node
                  LEFT JOIN (
                    SELECT Events.Nodes.NodeID, MAX(Events.EventTime) AS TriggeredTime
                    FROM Orion.Events
                    WHERE
                        EventType = 1
                     OR EventType = 2
                    GROUP BY Events.Nodes.NodeID
                  ) AS StatusChange ON Nodes.NodeID = StatusChange.NodeID
                  -- The filters on Groups and ActiveGroupAlerts discard unmatched rows, so the LEFT JOINs effectively behave as inner joins
                  WHERE
                    Groups.Name NOT LIKE '%-%'
                  AND Nodes.Status <> 1
                  AND Nodes.Status <> 9
                  AND Nodes.Status <> 11
                  AND ActiveGroupAlerts.TriggeredDateTime IS NOT NULL
                  -- The node's latest matching event is at least 20 minutes after the group alert triggered
                  AND MINUTEDIFF(ActiveGroupAlerts.TriggeredDateTime, StatusChange.TriggeredTime) >= 20
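
For anyone adapting this, the final WHERE clause boils down to a per-node decision that can be sketched in plain Python. This is only an illustration of the logic, not something Orion runs: `node_should_alert` and its parameters are hypothetical names. It also assumes that MINUTEDIFF(a, b) measures the minutes from a to b, and it uses an exact time difference where SWQL may round or truncate differently.

```python
from datetime import datetime, timedelta

# Node statuses the query excludes (the three "Nodes.Status <> ..." filters).
EXCLUDED_STATUSES = {1, 9, 11}


def node_should_alert(node_status, group_alert_triggered, node_last_event,
                      threshold_minutes=20):
    """Return True when a node row would pass the WHERE clause.

    node_status:           numeric Nodes.Status value.
    group_alert_triggered: TriggeredDateTime of the active group alert, or None.
    node_last_event:       latest matching event time for the node, or None.
    """
    if node_status in EXCLUDED_STATUSES:
        return False
    if group_alert_triggered is None or node_last_event is None:
        # No active group alert, or no matching node event: no row survives.
        return False
    elapsed = node_last_event - group_alert_triggered
    return elapsed >= timedelta(minutes=threshold_minutes)


if __name__ == "__main__":
    group_alert = datetime(2019, 7, 26, 12, 0)
    print(node_should_alert(2, group_alert, group_alert + timedelta(minutes=25)))
    print(node_should_alert(2, group_alert, group_alert + timedelta(minutes=5)))
```

The threshold is kept as a parameter so that the "x minutes" from the thread title can be changed in one place. In the query itself, the equivalent is the literal 20 on the last line.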

                   

                  Hopefully this helps others!