
Alert Charts for Node Details

In my previous tool tip, Syslog Charts (also alerts, traps, events), I described ways to chart alert, syslog, and trap information.

With some tweaking, we can add this information to the node details page to analyze which nodes are generating the most alerts.

Two general resources are needed.

The first one points to issues with the "node is down" alert. The second one shows me that one particular node triggered that alert excessively.

pastedImage_1.png

SELECT ah.AlertObjects.AlertConfigurations.name, COUNT(*) AS [AlertCount]
FROM Orion.AlertHistory ah
WHERE DAYDIFF(ah.TimeStamp, GETDATE()) = 0
  AND ah.AlertObjects.AlertConfigurations.name IS NOT NULL
GROUP BY ah.AlertObjects.AlertConfigurations.name
ORDER BY [AlertCount] DESC

pastedImage_8.png

SELECT
    ao.RelatedNodeCaption AS [Node]
    ,ao.EntityDetailsUrl AS [_LinkFor_Node]
    ,ao.AlertConfigurations.Name AS [Alert Name]
    ,COUNT(ao.AlertConfigurations.Name) AS [count]
FROM Orion.AlertObjects ao
WHERE DAYDIFF(ao.AlertHistory.TimeStamp, GETDATE()) = 0
GROUP BY ao.RelatedNodeCaption, ao.AlertConfigurations.Name, ao.EntityDetailsUrl
ORDER BY [count] DESC
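To make the relationship between the two resources concrete, here is a minimal Python sketch of the same two groupings. The rows are hypothetical sample data, not output from a real Orion server: the first count groups by alert name only, the second by node and alert name together.

```python
from collections import Counter

# Hypothetical sample of today's alert history: (node caption, alert name).
rows = [
    ("node-a", "Node is down"),
    ("node-a", "Node is down"),
    ("node-a", "Node is down"),
    ("node-b", "Node is down"),
    ("node-b", "High CPU"),
]

# First resource: how many times each alert fired, regardless of node.
by_alert = Counter(alert for _node, alert in rows)

# Second resource: the same events split by (node, alert name).
by_node_alert = Counter(rows)

# The noisiest alert overall, and the node/alert pair that contributes most.
top_alert, _ = by_alert.most_common(1)[0]
top_pair, top_count = by_node_alert.most_common(1)[0]

print(top_alert, top_pair, top_count)
```

The first count tells you which alert to look at; the second tells you which node is behind most of those triggers.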

Clicking on the node name brings us to the node details with the following two resources.

pastedImage_4.png

SELECT AlertHistory.AlertObjects.AlertConfigurations.Name AS [Alert Name],
    Message,
    AlertHistory.AlertObjects.EntityCaption AS [Triggering Object],
    ToLocal(TimeStamp) AS [Time],
    AlertHistory.AlertObjects.RelatedNodeCaption AS [Related Node],
    'https://insert server here/Orion/NetPerfMon/ActiveAlertDetails.aspx?NetObject=AAT:'+ToString(AlertObjectID) AS [_linkfor_Message],
    'https://insert server here/Orion/NetPerfMon/ActiveAlertDetails.aspx?NetObject=AAT:'+ToString(AlertObjectID) AS [_linkfor_Alert Name]
FROM Orion.AlertHistory
WHERE AlertHistory.AlertObjects.RelatedNodeID='${NodeID}'
  AND DAYDIFF(AlertHistory.TimeStamp, GETDATE()) = 0
  AND EventType = 0
ORDER BY TimeStamp DESC

pastedImage_5.png
pastedImage_11.png

SELECT CONVERT(date, ah1.timestamp) [Date1]
    ,ah1.name
    ,COUNT(name) [Count of ]
    ,'total' [total]
FROM AlertHistoryView ah1
WHERE DATEDIFF(day, ah1.timestamp, GETDATE()) < 30
  AND ah1.RelatedNodeId = ${nodeid}
GROUP BY CONVERT(date, ah1.timestamp), ah1.name

It looks like the excessive triggers started on Sept 21 on node xyz. This gives me a good starting point for diagnosing the issue.
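Reading the start date off the chart can also be done programmatically. The sketch below is one possible approach, not part of the original tip: it counts alerts per day and returns the first day whose count exceeds a multiple of the median daily count. The function name `first_spike_day`, the `factor` threshold, and the sample dates (including the arbitrary year) are all assumptions for illustration.

```python
from collections import Counter
from datetime import date
from statistics import median


def first_spike_day(days, factor=3):
    """Return the earliest day whose alert count exceeds factor * median, or None.

    `days` holds one date per alert event. The median is taken only over
    days that have at least one alert.
    """
    per_day = Counter(days)
    baseline = median(per_day.values())
    for day in sorted(per_day):
        if per_day[day] > factor * baseline:
            return day
    return None


# Hypothetical alert dates for one node; the year is arbitrary.
days = (
    [date(2000, 9, 18)] * 2
    + [date(2000, 9, 19)] * 1
    + [date(2000, 9, 20)] * 2
    + [date(2000, 9, 21)] * 15
    + [date(2000, 9, 22)] * 12
)

print(first_spike_day(days))
```

The median is used as the baseline here because a few very noisy days would pull a mean upward and could hide the spike; a different baseline may suit other environments better.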

Looking at the "Average Response Time & Packet Loss" resource, there seems to be excessive packet loss.


pastedImage_9.png

-Thanks

Amit