Report to determine which nodes are failing snmp polling

FormerMember

Any ideas? I've seen some folks query for the number of polling failures and treat a node as failed once the count exceeds 5. Does anyone know which table that information is stored in? I'm currently using NPM 11.5.2.

  • Hello kbryk

    This is the query I use. It finds nodes that are polled by SNMP, then checks whether each of those nodes has sent any CPU updates in the last fifteen minutes; if not, the node is considered failed. You can simply paste this into a report and schedule it to run on a regular basis. If you change 'SNMP' in the LIKE clause to 'WMI', it will check WMI nodes the same way.

    SELECT n.Caption,
      n.IP_Address,
      DATEDIFF(mi,MAX(c.DateTime),GETDATE()) minutes_since
    FROM Nodes n
    INNER JOIN CPULoad c ON n.NodeID = c.NodeID
    WHERE n.Status = 1 AND n.ObjectSubType LIKE 'SNMP'
    GROUP BY n.Caption, n.IP_Address
    HAVING DATEDIFF(mi,MAX(c.DateTime),GETDATE()) > 15
    ORDER BY minutes_since DESC
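
One gap worth noting in the query above: because it uses an INNER JOIN, a node with no rows in CPULoad at all never shows up in the result. A possible variant, as a sketch only, switches to a LEFT JOIN so those nodes are listed too. It assumes the same Nodes and CPULoad tables and columns used above and has not been checked against any particular NPM schema version.

```sql
-- Sketch: same 15-minute check, but also keeps SNMP nodes that have
-- never written a CPULoad row. Table and column names are copied from
-- the query above (assumption: they match your NPM database).
SELECT n.Caption,
       n.IP_Address,
       -- NULL here means the node has no CPULoad samples at all
       DATEDIFF(mi, MAX(c.DateTime), GETDATE()) AS minutes_since
FROM Nodes n
LEFT JOIN CPULoad c ON n.NodeID = c.NodeID
WHERE n.Status = 1 AND n.ObjectSubType = 'SNMP'
GROUP BY n.Caption, n.IP_Address
HAVING MAX(c.DateTime) IS NULL
    OR DATEDIFF(mi, MAX(c.DateTime), GETDATE()) > 15
ORDER BY minutes_since DESC
```

Rows with a NULL minutes_since are the "never reported" cases; the rest are nodes whose last CPU sample is older than fifteen minutes.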

  • It would be cool if this was native functionality.


  • I would use the logic below to exclude unmanaged nodes, since SNMP polling is skipped for nodes marked as UnManaged.


    You can also use the same SQL query in an alert, so that you are notified when a node stops being updated by SNMP polling.


    [Image: alert.PNG]


    //////////////////////////////////////////////////////////////

    SELECT * FROM Nodes
    WHERE ObjectSubType IN ('SNMP', 'WMI')
    AND UnManaged = 0
    AND Status = 1
    AND DATEDIFF(mi, LastSystemUpTimePollUTC, GETUTCDATE()) > 25



    //////////////////////////////////////////////////////////////////////////////////

    SELECT Nodes.NodeID AS NetObjectID, Nodes.Caption AS Name
    FROM Nodes
    WHERE
    (
      Nodes.Status = '1' AND
      ObjectSubType NOT LIKE 'ICMP' AND
      Vendor NOT LIKE 'Unknown' AND
      (
        DATEDIFF(mi, Nodes.LastSystemUpTimePollUtc, GETUTCDATE()) > 5
        OR
        LastSystemUpTimePollUtc IS NULL
      )
    )

    /////////////////////////////////////////////////////////////////////

    SELECT Nodes.NodeID AS NetObjectID, Nodes.Caption AS Name, Nodes.Vendor AS Vendor
    FROM Nodes
    WHERE
    (
      (DATEDIFF(mi, Nodes.LastSystemUpTimePollUtc, GETUTCDATE()) > 1) AND
      Nodes.Status = '1' AND
      Vendor NOT LIKE 'Unknown'
    )
    OR
    (
      Nodes.Status = '1' AND
      Vendor NOT LIKE 'Unknown' AND
      LastSystemUpTimePollUtc IS NULL
    )

    //////////////////////////////////////////////////////////////////
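
The three variants above differ mainly in the staleness threshold (25, 5, or 1 minute) and in whether a NULL last-poll time counts as a failure. As a rough sketch, they could be folded into one report-style query with the threshold in a variable. The variable name and its value are hypothetical, and the DECLARE is meant for running in a SQL client; an alert or report definition that requires the statement to begin with SELECT would need the threshold inlined instead.

```sql
-- Hypothetical threshold; choose a value comfortably above your polling interval.
DECLARE @ThresholdMinutes INT = 25;

SELECT NodeID,
       Caption,
       IP_Address,
       ObjectSubType,
       LastSystemUpTimePollUtc,
       -- NULL means no uptime poll has ever been recorded for the node
       DATEDIFF(mi, LastSystemUpTimePollUtc, GETUTCDATE()) AS minutes_since_poll
FROM Nodes
WHERE ObjectSubType IN ('SNMP', 'WMI')
  AND UnManaged = 0          -- skip unmanaged nodes, as in the first variant
  AND Status = 1
  AND (LastSystemUpTimePollUtc IS NULL
       OR DATEDIFF(mi, LastSystemUpTimePollUtc, GETUTCDATE()) > @ThresholdMinutes)
ORDER BY minutes_since_poll DESC;
```

Selecting named columns instead of * keeps the report readable and shows how stale each node is.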






  • FormerMember in reply to jstinson1

    Thanks. Have you considered cases where the OID is not polling correctly, or has not been polled at all?

  • FormerMember in reply to FormerMember

    This was directed at jstinson1.

  • FormerMember in reply to GoldTipu

    Awesome, thanks Malik.

  • FormerMember

    I kicked myself over this one since it is so basic, but here is what I have used. It is not entirely accurate, but it has given me a more complete list, false positives aside. In my eyes, if the machine type is not being detected, SNMP is probably failing, since most of our devices are common enough to have this OID recognized.

    SELECT Nodes.Caption, Nodes.IP_Address
    FROM Nodes
    WHERE Nodes.MachineType = 'Unknown' AND Nodes.ObjectSubType = 'SNMP'

  • I have many nodes that historically worked and then stopped. Their MachineType would not be 'Unknown' when SNMP fails, since the database was already populated with a value when the node was first added. Conversely, I have some devices from various vendors where SNMP has always worked fine, but the 'test' option on the Node Details page fails and the MachineType has always shown as unknown. So that query may give a false sense of assurance and produce some false positives at the same time.

  • FormerMember in reply to d09h

    Agreed. It looks like a small modification of Malik's response may work better for you. I haven't dug in to figure out exactly what the LastSystemUpTimePollUtc value captures; I'm guessing it is the last time the SNMP manager was able to poll the agent. But would that mean it doesn't consider whether a poll has actually failed?
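
One way to get a feel for what the last-uptime-poll value tracks is to list it next to the newest CPULoad sample for each SNMP node and watch how the two move over a few polling cycles. This is only a diagnostic sketch that reuses the table and column names from the queries earlier in the thread. Note the assumption about time zones: the uptime column is compared against GETUTCDATE() above, while the CPULoad DateTime column is compared against GETDATE(), so the two timestamps may be offset by the server's UTC offset.

```sql
-- Diagnostic sketch: compare the uptime-poll timestamp with the newest
-- CPU sample per SNMP node. Names are copied from earlier queries in this
-- thread (assumption: they exist unchanged in your NPM database).
SELECT n.Caption,
       n.ObjectSubType,
       n.LastSystemUpTimePollUtc,            -- assumed UTC
       MAX(c.DateTime) AS last_cpu_sample    -- assumed server local time
FROM Nodes n
LEFT JOIN CPULoad c ON n.NodeID = c.NodeID
WHERE n.ObjectSubType = 'SNMP'
GROUP BY n.Caption, n.ObjectSubType, n.LastSystemUpTimePollUtc
ORDER BY n.Caption
```

If a node's uptime timestamp keeps advancing while its CPU samples stop (or the reverse), that suggests the two values are fed by different polls and should not be treated as interchangeable failure signals.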

  • Would still prefer SolarWinds build that functionality natively.  Never understood why that isn't already part of the product (via web, not some SWQL or SQL that we cobble together and hope is right).