
Ultimate CPU Alert - No Custom SQL

Custom SQL is not your only hope for effective CPU alerts anymore! Updates to NPM and the SolarWinds database have made an easier way possible. Here is a fresh take on the Ultimate CPU Alert, created in hopes of making great CPU alerting easier to maintain and more accessible to SolarWinds users new and old.

First and foremost, you WILL still need to set up the same SAM Perfmon Counter for processor queue length that the original version of the alert requires. Please see @adatole's wonderful post for instructions on how to do this.

Once the SAM template has been applied to the servers you wish to monitor, here's what the bones of your alert should look like:
PercentThresCPUAlert.PNG

That's it! This is a component-based alert. Go ahead and add your n_mute, Production_Environment, and other custom properties as you need. A note for newer users: the second line of the trigger conditions is a double value comparison, meaning it compares two fields against each other rather than one field against a fixed value.
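For readers who cannot see the screenshot, here is a rough Python sketch of the logic this kind of trigger expresses. It assumes the conditions from the original Ultimate CPU Alert: CPU load at or above a percentage threshold, and a processor queue length greater than the node's CPU count (the double value comparison). The field names and the 90% default are illustrative assumptions, not values read from the screenshot.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    """One poll's worth of data for a node (hypothetical field names)."""
    cpu_load_percent: float        # node CPU load
    processor_queue_length: float  # statistic from the SAM Perfmon component
    cpu_count: int                 # number of CPUs on the node


def should_trigger(sample: CpuSample, load_threshold: float = 90.0) -> bool:
    """Return True when the CPU is busy AND work is queuing up behind it."""
    # Line 1 of the trigger: one field compared to a fixed value (assumed 90%).
    cpu_is_high = sample.cpu_load_percent >= load_threshold
    # Line 2 of the trigger: a double value comparison, field against field.
    queue_is_backed_up = sample.processor_queue_length > sample.cpu_count
    return cpu_is_high and queue_is_backed_up
```

The point of the second condition is that a busy CPU alone is not treated as a problem; the alert only fires when threads are also waiting in the queue.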

Our organization runs the variation below, which triggers on the Critical status of CPU instead of a set percentage threshold. Setting up your alert this way lets you fine-tune it to fire on node-specific thresholds:

CritCPUAlert.PNG
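As a rough mental model of why the status-based variant respects per-node tuning, the sketch below treats Critical status as "the last N polls were all at or above the node's critical level". The function name, the `>=` operator, and the sample values are assumptions for illustration; the real evaluation happens inside Orion using whatever operator and poll count you configured on the node.

```python
def is_cpu_critical(recent_loads, critical_level, critical_polls=1):
    """Return True if the most recent `critical_polls` samples are all
    at or above `critical_level` (assumed operator: >=)."""
    if critical_polls < 1:
        raise ValueError("critical_polls must be at least 1")
    if len(recent_loads) < critical_polls:
        # Not enough history yet to satisfy the consecutive-poll requirement.
        return False
    window = recent_loads[-critical_polls:]
    return all(load >= critical_level for load in window)
```

With a critical level of 90 and 3 consecutive polls, loads of 95, 96, 97 would count as Critical, while 95, 50, 96, 97 would not, because the dip resets the streak.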

 

I ran this alert side by side with the custom SQL version for 3 days, and it fired every time the original custom SQL alert did.

For those of you taking advantage of the SolarWinds.APM.RealTimeProcessPoller.exe program to pull the top 10 processes into your notes and emails, getting the output of the program to write to the proper place requires some changes to the parameters passed to it. Use the following configuration:

C:\Program Files (x86)\SolarWinds\Orion\APM\SolarWinds.APM.RealTimeProcessPoller.exe -n=${N=SwisEntity;M=Application.Node.NodeID} -alert=${N=Alerting;M=AlertDefID} -alertID=${N=Alerting;M=AlertID} -activeObject=${N=SwisEntity;M=ComponentAlert.ComponentID} -timeout=120
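To make the macro substitution easier to picture, here is a small Python sketch that assembles the same argument list with concrete values in place of the `${...}` macros. The IDs in the usage note are made up, and the helper name is hypothetical; the executable path and flags are copied from the configuration above.

```python
POLLER_EXE = (
    r"C:\Program Files (x86)\SolarWinds\Orion\APM"
    r"\SolarWinds.APM.RealTimeProcessPoller.exe"
)


def build_poller_command(node_id, alert_def_id, alert_id, component_id, timeout=120):
    """Return the poller command as an argument list, as it would look
    after the alert engine has expanded each macro."""
    return [
        POLLER_EXE,
        f"-n={node_id}",                 # Application.Node.NodeID
        f"-alert={alert_def_id}",        # Alerting AlertDefID
        f"-alertID={alert_id}",          # Alerting AlertID
        f"-activeObject={component_id}", # ComponentAlert.ComponentID
        f"-timeout={timeout}",
    ]
```

For example, `build_poller_command(42, 7, 1001, 555)` yields the executable followed by `-n=42 -alert=7 -alertID=1001 -activeObject=555 -timeout=120`. Note that `-activeObject` receives the component ID here, since this is a component-based alert.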

If you opt to base your alert on CPU critical status rather than a set percentage, I suggest pulling both the CPU critical level (%) and, if used, the number of consecutive high-usage polls required for Critical status into the body of your email with a little SWQL magic, like so:

CPU CRITICAL THRESHOLD: ${N=SWQL;M=SELECT thr.Level2Value
FROM Orion.CpuLoadThreshold thr
WHERE thr.Node.NodeID = ${N=SwisEntity;M=Application.Node.NodeID} }% for ${N=SWQL;M=SELECT thr.CriticalPolls
FROM Orion.CpuLoadThreshold thr
WHERE thr.Node.NodeID = ${N=SwisEntity;M=Application.Node.NodeID} } consecutive poll(s).

Inserting the above into the message section of your emails should produce the following results:


SWQLCPU.PNG
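If you want to sanity-check the two lookups before putting them in an email, one option is to run them by hand (for example in SWQL Studio) with a literal node ID. The sketch below builds those standalone queries and shows how the final line is put together. The helper names and the sample values are hypothetical; the entity and column names come from the macros above.

```python
def threshold_queries(node_id):
    """Build the two SWQL lookups with a literal NodeID instead of the macro."""
    node_id = int(node_id)  # guard against non-numeric input
    base = (
        "SELECT thr.{column} FROM Orion.CpuLoadThreshold thr "
        "WHERE thr.Node.NodeID = {node_id}"
    )
    return {
        "critical_level": base.format(column="Level2Value", node_id=node_id),
        "critical_polls": base.format(column="CriticalPolls", node_id=node_id),
    }


def render_threshold_line(level2_value, critical_polls):
    """Mimic the email line once both SWQL macros have been resolved."""
    return (
        f"CPU CRITICAL THRESHOLD: {level2_value}% "
        f"for {critical_polls} consecutive poll(s)."
    )
```

With sample values of 90 and 3, `render_threshold_line(90, 3)` gives `CPU CRITICAL THRESHOLD: 90% for 3 consecutive poll(s).`, which is the shape of the line you should see in the email.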

And that's all there is to it, folks! I hope you (and your Jr. Admins) enjoy!
