Any ideas? I've seen some folks query for the number of failures and, if it exceeds 5, treat that as a failure. Does anyone know which table that information is stored in? I'm currently using NPM 11.5.2.
Hello kbryk
This is the query I use. It checks the database for nodes polled by SNMP, then checks whether those nodes have sent any CPU updates in the last fifteen minutes; if not, the node is considered failed. You can simply paste this into a report and schedule it to run on a regular basis. Also, if you replace 'SNMP' in the LIKE clause with 'WMI', it will check those nodes as well.
SELECT n.Caption,
n.IP_Address,
DATEDIFF(mi,MAX(c.DateTime),GETDATE()) minutes_since
FROM Nodes n
INNER JOIN CPULoad c ON n.NodeID = c.NodeID
WHERE n.Status = 1 AND n.ObjectSubType LIKE 'SNMP'
GROUP BY n.Caption, n.IP_Address
HAVING DATEDIFF(mi,MAX(c.DateTime),GETDATE()) > 15
ORDER BY minutes_since DESC
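One caveat with the query above: the INNER JOIN drops any node that has no rows in CPULoad at all, so a node that has never returned a CPU sample will not show up. A minimal sketch of a variant that also lists those nodes, assuming the same Nodes and CPULoad columns used above (the last_cpu_sample alias is just an illustrative name, and this has not been verified against any particular NPM version):

```sql
-- Sketch only: same tables and columns as the query above.
-- LEFT JOIN keeps nodes that have no CPULoad rows at all.
SELECT n.Caption,
       n.IP_Address,
       MAX(c.DateTime) AS last_cpu_sample   -- NULL means no CPU sample was ever stored
FROM Nodes n
LEFT JOIN CPULoad c ON n.NodeID = c.NodeID
WHERE n.Status = 1
  AND n.ObjectSubType IN ('SNMP', 'WMI')
GROUP BY n.Caption, n.IP_Address
HAVING MAX(c.DateTime) IS NULL
    OR DATEDIFF(mi, MAX(c.DateTime), GETDATE()) > 15
ORDER BY last_cpu_sample
```

Nodes with a NULL last_cpu_sample sort first, so the "never polled" cases appear at the top of the report.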
It would be cool if this was native functionality.
I would use the logic below to exclude unmanaged nodes, since SNMP polling is skipped for nodes marked as UnManaged.
You can also use the same SQL query in an alert to be notified when a node stops being updated by SNMP polling.
//////////////////////////////////////////////////////////////
SELECT * FROM Nodes
WHERE ObjectSubType IN ('SNMP', 'WMI')
AND UnManaged = 0
AND Status = 1
AND DATEDIFF(mi, LastSystemUpTimePollUTC, GETUTCDATE()) > 25
//////////////////////////////////////////////////////////////////////////////////
SELECT Nodes.NodeID AS NetObjectID, Nodes.Caption AS Name
FROM Nodes
WHERE
(
Nodes.Status = '1' AND
ObjectSubType NOT LIKE 'ICMP' AND
Vendor NOT LIKE 'Unknown' AND
-- inner parentheses keep the NULL check subject to the filters above
(
DATEDIFF(mi, Nodes.LastSystemUpTimePollUtc, getutcdate()) > 5
OR
LastSystemUpTimePollUtc IS NULL
)
)
/////////////////////////////////////////////////////////////////////
SELECT Nodes.NodeID AS NetObjectID, Nodes.Caption AS Name, Nodes.Vendor AS Vendor
FROM Nodes
WHERE
Nodes.Status = '1' AND
Vendor NOT LIKE 'Unknown' AND
(DATEDIFF(mi, Nodes.LastSystemUpTimePollUtc, getutcdate()) > 1)
//////////////////////////////////////////////////////////////////
Thanks, have you considered instances where the OID is not polling correctly, or has not polled at all?
jstinson@l1s
Awesome, thanks Malik.
Kicked myself with this one since it is so basic, but here is what I have used. Although it is not entirely accurate, it has given me a more complete list, false positives included. In my eyes, if the machine type is not getting polled, SNMP is probably failing, since most of our devices are common enough to have this OID recognized.
Select Nodes.Caption, Nodes.IP_Address
from Nodes
where Nodes.MachineType = 'Unknown' and Nodes.ObjectSubType = 'SNMP'
I have many nodes that historically worked and then stopped. Their MachineType would not be unknown when SNMP fails, since the database was already populated with a value when the node was first added. Conversely, I have some devices from various vendors where SNMP has always worked fine, yet the 'test' option on the Node Details page fails and the MachineType has always shown unknown. So that query may give a false sense of assurance and some false positives at the same time.
Agreed, it looks like a small modification of Malik's response may work better for you. I haven't dug in to figure out exactly what the LastSystemUpTimePollUtc value captures. I'm guessing it is the last time the SNMP manager was able to poll the agent, but that wouldn't tell you whether the poll has failed or not?
Would still prefer SolarWinds build that functionality natively. Never understood why that isn't already part of the product (via web, not some SWQL or SQL that we cobble together and hope is right).
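On the open question of what LastSystemUpTimePollUtc actually captures, one way to find out empirically is to list the raw value next to its age and compare it against nodes you already know are failing or healthy. A rough sketch, assuming only the Nodes columns already used in this thread (the minutes_since_uptime_poll alias is an illustrative name, and the result says nothing definitive about how the poller sets the column):

```sql
-- Sketch only: inspect the raw timestamp instead of guessing what it means.
SELECT Caption,
       IP_Address,
       ObjectSubType,
       LastSystemUpTimePollUtc,
       DATEDIFF(mi, LastSystemUpTimePollUtc, GETUTCDATE()) AS minutes_since_uptime_poll
FROM Nodes
WHERE ObjectSubType IN ('SNMP', 'WMI')
  AND UnManaged = 0
ORDER BY minutes_since_uptime_poll DESC
```

If the value keeps advancing on a node whose SNMP you have deliberately broken, then the column does not reflect polling success and the thresholds in the queries above would need a different source.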
Does anyone know how to create this kind of report? I also need it to check all our devices that are failing their configured polling method (WMI or SNMP).