There are a couple of ways to approach this. You could write the report to search through every CPU poll looking for the times when the system goes above or below the thresholds, but it will be a lot more computationally efficient to set up an alert at that threshold and then just diff the timestamps between the alert triggers and resets. I've posted this custom SWQL report a couple of times already on Thwack; if you take it and filter it to your CPU load alerts, it should do what you are asking.
--report on alerts triggered
select
ac.Name --assumed first column; the [_linkfor_Name] column below needs a column called Name
,'/Orion/NetPerfMon/ActiveAlertDetails.aspx?NetObject=AAT:'+ToString(ah.AlertObjectID) as [_linkfor_Name]
,EntityCaption as [Trigger Object]
,EntityDetailsUrl as [_linkfor_Trigger Object]
,case
WHEN RelatedNodeCaption=EntityCaption THEN 'Self'
When RelatedNodeCaption!=EntityCaption THEN RelatedNodeCaption
End as [Parent Node]
,RelatedNodeDetailsUrl as [_linkfor_Parent Node]
,'/Orion/images/StatusIcons/Small-' + p.StatusIcon AS [_IconFor_Parent Node]
,tostring(tolocal(ah.TimeStamp)) as [Trigger Time]
,case when ack.timestamp is null then 'N/A'
else tostring(minutediff(ah.timestamp,ack.timestamp))
end as [Minutes Until Acknowledged]
,ack.Message as [Note]
,case when reset.timestamp is null then 'N/A'
else tostring(minutediff(ah.timestamp,reset.timestamp))
end as [Minutes Until Reset]
FROM Orion.AlertHistory ah
left join Orion.AlertObjects ao on ao.alertobjectid=ah.alertobjectid
left join Orion.AlertConfigurations ac on ac.alertid=ao.alertid
left join Orion.Actions a on a.actionid=ah.actionid
left join Orion.Nodes p on p.nodeid=RelatedNodeID
left join (select timestamp, AlertActiveID, AlertObjectID,message from orion.alerthistory ah where eventtype=2) ack on ack.alertactiveid=ah.AlertActiveID and ack.alertobjectid=ah.AlertObjectID
left join (select timestamp, AlertActiveID, AlertObjectID from orion.alerthistory ah where eventtype=1) reset on reset.alertactiveid=ah.AlertActiveID and reset.alertobjectid=ah.AlertObjectID
--assumed filter: eventtype 0 = triggered (the joins above use 2 = acknowledged and 1 = reset)
WHERE ah.eventtype=0
--and (ac.Name like '%YourCPUAlert%')
order by ah.timestamp desc
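If you export the report and want a total per object rather than one row per alert, the trigger/reset diffing can be rolled up outside Orion. This is only a rough Python sketch: the row layout and the names `rows` and `minutes_over_threshold` are made up for illustration and are not part of the report or any SolarWinds API.

```python
from datetime import datetime

# Hypothetical rows, shaped like an export of the report above:
# (trigger object, trigger time, reset time or None if the alert is still active)
rows = [
    ("node-a", datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 45)),
    ("node-a", datetime(2024, 1, 1, 13, 0), datetime(2024, 1, 1, 13, 10)),
    ("node-b", datetime(2024, 1, 1, 10, 0), None),
]

def minutes_over_threshold(rows):
    """Sum trigger-to-reset minutes per object, skipping alerts that have not reset."""
    totals = {}
    for name, triggered, reset in rows:
        if reset is None:
            continue  # still active, so there is no duration to add yet
        minutes = (reset - triggered).total_seconds() / 60
        totals[name] = totals.get(name, 0) + minutes
    return totals

print(minutes_over_threshold(rows))
```

Alerts that are still active are left out here; you could instead diff them against the current time if you want them counted.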
I am learning more every day .... thanks for taking the time to share this information - when I take the time, every little piece of code that I collect will help me achieve something bigger and better down the road. I am really digging this community! I have had more time to take advantage of THWACK, and it is inevitably making me a better employee!
The great thing about SolarWinds is its visual approach to data.
I would simply create a chart-based report for CPU utilization across a group of nodes, with the thresholds marked on it. You will be able to clearly see when nodes breach the threshold and when the CPU utilization drops back below it. Doing it this way saves you having to think about coding when most of the work is already done in the standard SolarWinds reports. You can run this report per node or for a group of nodes.
If it isn't easy to see how many threshold breaches occur, then it's time to reduce the time period displayed on the graph until the information you want is displayed clearly. You could design some fancy script to do this for you, but I wouldn't bother; I'd just send this to myself once a week and look at the peaks to determine where high CPU utilization occurs.
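For anyone who does want to script the breach count, the logic is small. Here is a minimal Python sketch, assuming you already have the CPU samples as a plain list of percentages in time order; the function name `count_breaches` and the sample values are invented for the example.

```python
def count_breaches(samples, threshold):
    """Count how many times the series rises above the threshold.

    A run of consecutive high samples counts as a single breach.
    """
    breaches = 0
    above = False
    for value in samples:
        if value > threshold and not above:
            breaches += 1  # crossed from at-or-below to above
        above = value > threshold
    return breaches

# Hypothetical CPU load samples (percent), one per poll
cpu = [40, 55, 92, 95, 60, 91, 50]
print(count_breaches(cpu, 90))
```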