A super common question I see on Thwack is "I see I had high CPU load at 2am last night, but I don't know what processes caused the issue. How can I find out what caused it?"
I wrote this custom query to be used in conjunction this SAM template, CPU/MEM Process Snapshot
Once you assign that template to the nodes you would add this query to the Node Details view and then you can enter the amount of minutes you want to have it jump back in time and it will show you the top processes at that time. I find that it works best if you set it to show 12 rows (total cpu load, plus top 5 by cpu, plus total memory utilization, plus top 5 by memory).
First box
SELECT tolocal(TimeStamp) as Timestamp
--, cs.Component.Application.Node.Caption
--, cs.Component.Application.Name
, cs.Component.Name
, cs.Component.DetailsUrl as [_linkfor_Name]
, cs.DynamicEvidence.ColumnLabel as Label
, cs.DynamicEvidence.StringData as Message
FROM Orion.APM.ComponentStatus cs
where cs.DynamicEvidence.StringData is not null
and cs.Component.Application.Name like '%snapshot%'
--and cs.Component.Application.Node.NodeID=${NodeID}
and minutediff(timestamp,GETUTCDATE())<7
order by timestamp desc
, cs.Component.Name, label
and for the second search box use this query
SELECT tolocal(TimeStamp) as Timestamp
--, cs.Component.Application.Node.Caption
--, cs.Component.Application.Name
, cs.Component.Name
, cs.Component.DetailsUrl as [_linkfor_Name]
, cs.DynamicEvidence.ColumnLabel as Label
, cs.DynamicEvidence.StringData as Message
FROM Orion.APM.ComponentStatus cs
where cs.DynamicEvidence.StringData is not null
and cs.Component.Application.Name like '%snapshot%'
and cs.Component.Application.Node.NodeID=${NodeID}
and minutediff(timestamp,GETUTCDATE())<${SEARCH_STRING}
and minutediff(timestamp,GETUTCDATE())>(${SEARCH_STRING}-6)
order by timestamp desc--, caption, cs.Component.Application.Name
, cs.Component.Name, label
Loop1 Systems: SolarWinds Training and Professional Services
- LinkedIN: Loop1 Systems
- Facebook: Loop1 Systems
- Twitter: @Loop1Systems