Alert Charts for Node Details

Version 4

    In my previous tool tip, Syslog Charts (also alerts, traps, events), I described ways to chart alert, syslog, and trap information.

     

    With some tweaking, we can add this information to the node details page to analyze which nodes are generating the most alerts.

     

    Two general resources are needed.

     

    The first one points to issues with the "node is down" alert.  The second one shows me that one particular node triggered that alert excessively.

    Code for the first resource (today's alert count per alert name):

    SELECT ah.AlertObjects.AlertConfigurations.name, count(*) as [AlertCount]

    from Orion.AlertHistory ah

    where DAYDIFF(ah.TimeStamp,getdate()) = 0 and ah.AlertObjects.AlertConfigurations.name is not NULL

    group by ah.AlertObjects.AlertConfigurations.name

    order by [AlertCount] desc

    Code for the second resource (today's alert count per node and alert name):

    SELECT

    ao.RelatedNodeCaption as [Node]

    ,ao.EntityDetailsUrl as [_LinkFor_Node]

    ,ao.AlertConfigurations.Name as [Alert Name]

    ,count (ao.AlertConfigurations.Name) as [count]

    FROM Orion.AlertObjects ao

    where DAYDIFF(ao.alerthistory.TimeStamp,GETDATE()) = 0

    group by ao.relatednodecaption, ao.AlertConfigurations.Name, ao.EntityDetailsUrl

    order by [count] desc
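To see what the second resource's grouping produces without touching an Orion server, here is a minimal offline sketch using Python's built-in sqlite3 module. The table name `alert_history`, its two columns, and the sample rows are all hypothetical stand-ins, and SQLite is not SWQL, so this only illustrates the group-by-and-count idea rather than the exact query above.

```python
import sqlite3

# Hypothetical sample rows standing in for today's alert history:
# (node caption, alert name). Real data would come from Orion.
rows = [
    ("xyz", "Node is down"),
    ("xyz", "Node is down"),
    ("xyz", "Node is down"),
    ("abc", "Node is down"),
    ("abc", "High packet loss"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alert_history (node TEXT, alert_name TEXT)")
conn.executemany("INSERT INTO alert_history VALUES (?, ?)", rows)

# Same shape as the per-node resource: one row per (node, alert) pair,
# sorted so the noisiest pair comes first.
per_node = conn.execute(
    """
    SELECT node, alert_name, COUNT(*) AS alert_count
    FROM alert_history
    GROUP BY node, alert_name
    ORDER BY alert_count DESC, node, alert_name
    """
).fetchall()

for node, alert_name, alert_count in per_node:
    print(f"{node:5} {alert_name:18} {alert_count}")
```

The first row of the result is the node and alert combination that fired most often, which is the one worth clicking through to.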

     

    Clicking on the node name brings us to the node details with the following two resources.

     

    Code for the first node details resource (today's alerts for this node):

    select AlertHistory.AlertObjects.AlertConfigurations.Name as [Alert Name],

    Message,

    AlertHistory.AlertObjects.EntityCaption as [Triggering Object],

    ToLocal(Timestamp) as [Time],

    AlertHistory.AlertObjects.RelatedNodeCaption as [Related Node],

    'https://insert server here/Orion/NetPerfMon/ActiveAlertDetails.aspx?NetObject=AAT:'+ToString(AlertObjectID) as [_linkfor_Message],

    'https://insert server here/Orion/NetPerfMon/ActiveAlertDetails.aspx?NetObject=AAT:'+ToString(AlertObjectID) as [_linkfor_Alert Name]

    from Orion.AlertHistory

    where AlertHistory.AlertObjects.RelatedNodeID='${NodeID}'

    and daydiff(AlertHistory.TimeStamp,getdate()) = 0

    and EventType = 0

    order by TimeStamp desc

    Code for the second node details resource (SQL rather than SWQL; daily alert counts for this node over the last 30 days):

    select convert(date,ah1.timestamp) [Date1]

    ,ah1.name

    ,count(name) [Count of ]

    ,'total' [total]

    from AlertHistoryView ah1

    where DATEDIFF(day,ah1.timestamp,getdate()) < 30

    and ah1.RelatedNodeId = ${nodeid}

    group by convert(date,ah1.timestamp), ah1.name

     

    It looks like the excessive triggers started on Sept 21 on node xyz.  This gives me a good starting point for diagnosing this issue.
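Reading the start date off the chart can also be done programmatically once the daily counts are exported. The sketch below is only an illustration: the sample events, the year, the threshold of 5, and the helper name `first_spike` are all assumptions and not part of the original resources.

```python
from collections import Counter
from datetime import date

# Hypothetical (day, alert name) pairs for one node; the year and the
# counts are made up for illustration.
events = (
    [(date(2023, 9, 19), "Node is down")] * 1
    + [(date(2023, 9, 20), "Node is down")] * 2
    + [(date(2023, 9, 21), "Node is down")] * 9
    + [(date(2023, 9, 22), "Node is down")] * 11
)

# Count alerts per day, like the per-day grouping in the chart query.
daily = Counter(day for day, _name in events)


def first_spike(daily_counts, threshold):
    """Return the earliest day whose count exceeds threshold, or None."""
    for day in sorted(daily_counts):
        if daily_counts[day] > threshold:
            return day
    return None


spike_day = first_spike(daily, threshold=5)
print(spike_day)
```

A fixed threshold is the simplest choice here; what counts as "excessive" depends on the normal alert volume for the node.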

     

    Looking at the "Average Response time & Packet loss" resource, there seems to be excessive packet loss.


     

    -Thanks

    Amit