3 Replies Latest reply on May 7, 2013 2:00 PM by marunderwood

    NodeID is blank in table Containers_AlertsAndReportsData

    marunderwood

      Some group alerts and reports are not working when I use the Containers_AlertsAndReportsData table.  Upon further investigation, I found that the NodeId in the Containers_AlertsAndReportsData is blank.  So for now, I'm having to use a Custom SQL Alert Join to the Nodes table to get the real NodeId so I can get Node information.  Why is this field not populated?

      Is this the best table to identify which members are in a group?

       

      Here is my alternative:

      After choosing Custom SQL Alert w/ trigger query of Group Member.

       

      LEFT JOIN Nodes npm on GroupMemberFullName = npm.Caption

       

      LEFT JOIN Events e on e.NetworkNode = npm.NodeId

       

      LEFT JOIN EventTypes et ON (e.EventType = et.EventType)

       

      WHERE

       

      ( datediff(ss,getdate(), EventTime) < 121) -- Occurred with the last 2 minutes

       

      AND e.EventType = 14 -- Reboot message (from Orion)

       

      AND

       

      (

       

       

      GroupName Like '%Production%'

       

      )

       

      p.s.  This example is to send a message when a production node is rebooted.

        • Re: NodeID is blank in table Containers_AlertsAndReportsData
          RichardLetts

          NetObjectType = 'C' which means this is a container, and you should join with the container, and not node...

           

          As far as I can tell there is no easy way to find the members of a container [Group in the UI] that caused group status to change

          best is containers inner join ContainerMemberSnapShots inner join Nodes where nodes.something=...

           

          How do you build members of the Production group[s]? -> can you use that instead to determine if the node-level alert should fire?

           

          Groups IMHO, are rarely a good thing to trigger alerts off of because you only get the one alert for the group, not one alert per member.

          Where they do add value is where you have VRRP, or other redundant paths. An example is putting all your border routers in a group and triggering an alert via your SMS server if they all are down.

          1 of 1 people found this helpful
            • Re: NodeID is blank in table Containers_AlertsAndReportsData
              marunderwood

              Thank you!!  I'm using the Container and ContainerMemberSnapshots instead.  I also switched to use the Node as the Trigger Query table.  Once I get it tested, I'll repost the query but it's looking promising.

               

              To answer your questions:

              1. To build the groups, there are a couple dynamic queries and several nodes were manually added.  Our naming conventions have changed, so there isn't a good method.
              2. So no, I wouldn't want to have to repeat that logic & maintain it another place.
              3. Also, I realize that triggering at the group level does not provide much detail.  That is why I was using a group member level ... hoping to get at the node level.  For now, the group is the only way I have to determine if a machine is in the production group or not.  [In the future, I might add a custom field or two on the Node to indicate this, rather than using the group, but for now I have a couple different IT groups so they are also divided between these.  And, to complicate things, some servers are used by both IT groups.]

               

              I still think there's a bug because .. when I used the Group Member and switched to the Custom SQL, it was trying to use the NodeID from the Containers_AlertsAndReportsData to join to the NODE table.  Since it was blank, I was never getting any results.  (NetObjectType = 'C' is not in this table.)

                • Re: NodeID is blank in table Containers_AlertsAndReportsData
                  marunderwood

                  Here is the final alert that will detect a reboot of a production node and send an email to the email(s) assigned to the node.  [p.s. I tried to use a variable, instead of "60" seconds, but I couldn't select the interval from the Alert Definition table ("SELECT ...from AlertDefiniition where AlerName = ${AlertName}") (It didn't like the variable name.) ]

                   

                  Choose:  Custom SQL Alert

                       Set up your Trigger Query:  Node

                  -- SELECT Nodes.NodeID As NetObjectID, Nodes.Caption AS Name

                  -- FROM Nodes

                   

                  LEFT JOIN ContainerMemberSnapshots cms on Nodes.NodeId = cms.EntityId

                   

                  LEFT JOIN Containers c on c.ContainerId = cms.ContainerId

                   

                  WHERE

                   

                  ( NodeId in

                   

                       ( select NodeId from AlertValueChanges where TableName = 'Nodes' and FieldName = 'LastBoot' and DateDiff(s, Changed, getdate()) <

                   

                            (60 * 1.2) -- ((select ExecuteInterval from AlertDefinitions where AlertName = ${AlertName}) * 1.2))

                   

                       )

                   

                  )

                   

                  AND

                   

                       (c.Name LIKE '%Prod%')

                   

                   

                   

                   

                   

                  /* Do not remove this comment HAS1101CHANGEDTBL|FLDNodes|LastBootHAS1101CHANGED */

                   

                  RESET CONDITION:

                   

                  WHERE ( NodeId not in

                   

                  ( select NodeId from AlertValueChanges where TableName = 'Nodes' and FieldName = 'LastBoot' and DateDiff(s, Changed, getdate()) <

                   

                  (60 * 3.0) -- ((select ExecuteInterval from AlertDefinitions where AlertName = ${AlertName}) * 1.2))

                   

                  ) -- Been past the trigger interval * 3.0 = Keep the alert "alive" for 3 minutes.

                   

                  )

                   

                  /* Note: You have to use the NOT IN feature because it doesn't work if you use >= on the time because there will probably be another record that IS greater, so the trigger won't fire. It will fire then reset immediately. */

                   

                  E-MAIL

                  To:  ${Node.AdminEmail},${Node.AppAdminEmail}

                  Subject:  Warning:  Production Node ${Caption} was rebooted.

                  Message:

                  Node: ${Caption} was rebooted on ${LastBoot}.

                  IP: ${IP_Address}
                  Location: ${Location}
                  Comments: ${Comments}
                  System Role: ${System_Role}
                  URL: ${NodeDetailsURL}

                   

                  Alert Name: ${AlertName}
                  Triggered:  ${AlertTriggerTime}