We have several additional polling engines in our environment. Recently one or two of them have stopped polling, and when we check the polling engine in the web console its status is Down. The fix is to shut down all services and restart them.
My question is: is there a way to create an alert on the status of an additional polling engine? It would be nice to get an alert when the polling engine status changes, so I can correct the problem before the customer calls to tell me that data is missing from their graphs. Any hints on how to create the alert are greatly appreciated.
Check each server's logs for issues with either resources or the application itself. Also check your polling rate and job weight in settings, and offload nodes from your highest offender. If the problem poller is not the one carrying the most job weight, there is another underlying issue with it.
select
convert(varchar, round(nodes.systemuptime/60/60, 2, 1))+' hrs' as Uptime,
Engines.Elements as Elmts, Engines.Nodes, Engines.Interfaces as Int, Engines.Volumes as Vol,
c.custpolls as UnDP, a.samct as SAM,
N.Down_node, I.Down_Int, V.Down_vol, A2.Down_sam,
s.failed as noSNMP,
Engines.PollingCompletion as "%complt",
nodes.nodeid, nodes.CPUload as "%CPU", nodes.percentmemoryused as "%RAM",
e1.PropertyValue as NPM_Rate, e2.PropertyValue as SAM_Rate
-- The derived tables aliased c, a, N, I, V, A2 and s are not shown in this post; join them in or drop those columns before running.
from engines
join nodes on engines.ip = nodes.ip_address
left join (select engineproperties.engineid, EngineProperties.PropertyValue from EngineProperties where engineproperties.propertyname = 'Orion.Standard.Polling') e1
on engines.engineid = e1.engineid
left join (select engineproperties.engineid, EngineProperties.PropertyValue from EngineProperties where engineproperties.propertyname = 'APM.Components.Polling') e2
on engines.engineid = e2.engineid
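For the alert itself, a cut-down version of this query might be enough as a starting point. This is only a sketch: it assumes the Engines table has ServerName and KeepAlive columns and that KeepAlive is stored in UTC, neither of which is confirmed in this thread, and the 5-minute threshold is an arbitrary choice.

```sql
-- Sketch: list polling engines whose last keep-alive is older than 5 minutes.
-- Assumptions (verify against your own database): Engines.ServerName and
-- Engines.KeepAlive exist, and KeepAlive is stored in UTC.
select engines.engineid, engines.servername, engines.keepalive,
       datediff(minute, engines.keepalive, getutcdate()) as minutes_since_keepalive
from engines
where datediff(minute, engines.keepalive, getutcdate()) > 5
```

If it returns any rows, that poller has probably stopped checking in. You could run it on a schedule or adapt it as the condition for a custom SQL alert.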
Assuming you have SAM, you can assign the Orion Server template to each poller, and configure an alert to page you if the app monitor shows as down (which it will do if any of the Orion services shows as down).
What this doesn't detect is the case where the Orion services are up but not actually working correctly. I'm working on an alert for this scenario now: I use WPM to play back a transaction that looks up the node details of a specific node on a given poller, and set it to verify that we get back the expected text (a WPM text match on the node name, etc.). Set one of these up for each poller you have.
So yes, you can monitor the pollers automatically, as long as you have paid the bucks for SAM and WPM....
If you're running a newer version of NPM, consider enabling WMI on the servers hosting the pollers, then using admin credentials to add QoE (Quality of Experience) monitoring to them.
Then you'll have a more granular picture of what's actually going on inside them, and be able to alert on that.
If I recall correctly, NPM can alert on services, provided WMI is enabled on the server and NPM is polling it.
A starting place for customizing your pollers is here:
You can create the alert the same way you would for any other node: trigger it when the status is anything other than Up.
I can't send a screenshot of it right now.
The SolarWinds Online Help can give you a more accurate idea of the solution.