4 Replies Latest reply on Apr 7, 2015 10:04 AM by designerfx

    How to find nodes that could report hardware status but are not?

    jbiggley

      This feels like an on-going series of questions on how to prove that we checked a box in list resources. 

       

      I'm trying to build a report that finds all physical devices in our environment, checks to see if they could be polled for hardware status and then checks to see if they are actually reporting any hardware statuses.  I have 2 of the 3 criteria via the query below:

       

      SELECT Caption, IP_Address, ObjectSubType, MachineType, Nodes.IP_Address, vm1.IPAddress, hw1.NodeID

      FROM Nodes WITH (NOLOCK)

       

      LEFT JOIN (SELECT DISTINCT IPAddress FROM VIM_VirtualMachines WITH (NOLOCK)) vm1 ON Nodes.IP_Address = vm1.IPAddress

      LEFT JOIN (SELECT DISTINCT NodeID FROM APM_HardwareInfo WITH (NOLOCK)) hw1 ON Nodes.NodeID = hw1.NodeID

       

      WHERE VM1.IPAddress IS NULL AND hw1.NodeID IS NULL

       

      The VIM_VirtualMachines table contains all of the known virtual machines in our environment.  Why the JOIN on the IP address and not the NodeID?  In order for VIM_VirtualMachines to have a valid NodeID the virtual machine has to be a managed node in NPM.  That isn't always the case.  Joining on the IP address is just easier and more clean for our purposes.

       

      The APM_HardwareInfo table contains all of the hardware statuses for all of the nodes.  In this case we are joining on the NodeID because the node has to be a managed node in order for us to collect hardware statistics.

       

      A comment about the WITH (NOLOCK) hint.  We are using that in our reporting as we are starting to leverage more and more custom SQL reporting and some of it gets a little heavy in terms of the queries that they run.  We certainly don't want to cause blocking on the DB and force core statistics gathering processes to wait because we were running a hefty report.  As the data isn't critical in reports (no alerts triggered, not small woodland creatures harmed, etc.) executing queries using WITH (NOLOCK) is fine for us.

       

      Question -- How do I find all nodes that actually respond via List Resources with an option to do hardware monitoring?  Right now my query returns all of the nodes that are not virtual machines and that are not reporting a hardware status -- even if that status is unknown or critical.  What I want to find is nodes that support hardware monitoring but do not have it enabled.  We did something similar for nodes that were not collecting CPU & Memory -- but that flag is expressed in the Pollers table.  Is the Hardware Monitoring option expressed in a similar table?

       

      Here is the query we came up with to check for nodes that could have node CPU polling (and thus CPU & Memory checked) but did not in case anyone else needs a similar query.

       

      SELECT DISTINCT Nodes.Caption, Nodes.IP_Address, Nodes.ObjectSubType

        FROM Nodes WITH (NOLOCK)

        JOIN Pollers WITH (NOLOCK) ON Pollers.NetObjectID = Nodes.NodeID

        LEFT JOIN (select DISTINCT Pollers.NetObjectID FROM Pollers WITH (NOLOCK) WHERE PollerType LIKE 'N.CPU%') c2 on Nodes.NodeID = c2.NetObjectID

        WHERE Nodes.ObjectSubType IN('SNMP','WMI') AND c2.NetObjectID IS NULL

        ORDER BY Caption