
    In my previous tool tip (Syslog Charts (also alerts, traps, events)), I described ways to chart information from alerts, syslog, and traps.


    With some tweaking, we can add this information to the node details page to analyze which nodes are generating the most alerts.


    Two general resources are needed.


    The first one points to issues with the "node is down" alert.  The second one shows me that one particular node triggered it excessively.


    SELECT ah.AlertObjects.AlertConfigurations.Name as [Alert Name]

    ,count(*) as [AlertCount]

    from Orion.AlertHistory ah

    where DAYDIFF(ah.TimeStamp, getdate()) = 0

    and ah.AlertObjects.AlertConfigurations.Name is not NULL

    group by ah.AlertObjects.AlertConfigurations.Name

    order by [AlertCount] desc


    SELECT ao.RelatedNodeCaption as [Node]

    ,ao.EntityDetailsUrl as [_LinkFor_Node]

    ,ao.AlertConfigurations.Name as [Alert Name]

    ,count(ao.AlertConfigurations.Name) as [count]

    FROM Orion.AlertObjects ao

    where DAYDIFF(ao.alerthistory.TimeStamp,GETDATE()) = 0

    group by ao.relatednodecaption, ao.AlertConfigurations.Name, ao.EntityDetailsUrl

    order by [count] desc


    Clicking on a node name brings us to its node details page, where we add the following two resources.  The first lists today's triggered alerts for that node:



    select AlertHistory.AlertObjects.AlertConfigurations.Name as [Alert Name],


    AlertHistory.AlertObjects.EntityCaption as [Triggering Object],

    ToLocal(Timestamp) as [Time],

    AlertHistory.AlertObjects.RelatedNodeCaption as [Related Node],

    'https://insert server here/Orion/NetPerfMon/ActiveAlertDetails.aspx?NetObject=AAT:'+ToString(AlertObjectID) as [_linkfor_Message],

    'https://insert server here/Orion/NetPerfMon/ActiveAlertDetails.aspx?NetObject=AAT:'+ToString(AlertObjectID) as [_linkfor_Alert Name]

    from Orion.AlertHistory

    where AlertHistory.AlertObjects.RelatedNodeID='${NodeID}'

    and daydiff(AlertHistory.TimeStamp,getdate()) = 0

    and EventType = 0

    order by TimeStamp desc

    The second resource is a custom chart built from a SQL query against AlertHistoryView, counting alerts per day for the node over the last 30 days:


    select convert(date, ah1.timestamp) as [Date1]

    ,count(name) as [Count of Alerts]

    ,'total' as [total]

          from AlertHistoryView ah1

      where DATEDIFF(day, ah1.timestamp, getdate()) < 30

            and ah1.RelatedNodeId = ${nodeid}

      group by convert(date, ah1.timestamp)

    It looks like the excessive triggers started on Sept 21 on node xyz.  This gives me a good starting point for diagnosing the issue.


    Looking at the "Average Response Time & Packet Loss" resource on the same node details page, there appears to be excessive packet loss over the same period.
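    If you want that loss data in a custom query resource next to the alert counts, something like the following sketch can pull it per poll for the node.  This is an illustration only; it assumes the standard Orion.ResponseTime entity and its AvgResponseTime and PercentLoss fields, so verify the names against your own schema (e.g. with SWQL Studio) before using it:


    select ToLocal(rt.DateTime) as [Time]

    ,rt.AvgResponseTime as [Avg Response Time (ms)]

    ,rt.PercentLoss as [Percent Loss]

    from Orion.ResponseTime rt

    where rt.NodeID = ${NodeID}

    and DAYDIFF(rt.DateTime, GETDATE()) < 30

    order by rt.DateTime desc


    Sorting descending puts the most recent polls at the top, which makes it easy to compare spikes in loss against the alert timestamps from the resources above.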