Hardware monitoring of node not reporting errors

Hi,

We are testing hardware monitoring of network devices in SolarWinds NPM, but we have an issue where hardware simply disappears from SolarWinds if we, for instance, unplug a power supply. An SNMP walk shows the following:

18-12-2018 09:33:31 (2 ms) : SNMP V2c

18-12-2018 09:33:31 (2 ms) : Walk 1.3.6.1.4.1.9.9.13.1.5.1.3

18-12-2018 09:33:31 (6 ms) : 1.3.6.1.4.1.9.9.13.1.5.1.3.1016 = "4" [ASN_INTEGER]

18-12-2018 09:33:31 (8 ms) : 1.3.6.1.4.1.9.9.13.1.5.1.3.1017 = "1" [ASN_INTEGER]

So the switch is actually reporting correctly over SNMP, but SolarWinds shows the following:

[Screenshot: pastedImage_1.png]

So the faulty power supply is no longer present. The same goes for the fans: as you can see in the screenshot, one of the fans was pulled from the switch, but again the fan simply disappears in SolarWinds.

Any good ideas on how to fix this?

  • If you go to Edit nodes and change the Hardware health MIB, then wait for a few polls, do you see the values in the hardware health (HWH) resource change?

  • That did not help; I had tried that as well.

    I had to completely remove the node and re-add it. This, however, raises a concern: how many of the other 300-500 devices have the same issue?

    So even though the device was reporting correctly through SNMP, SolarWinds simply removed hardware from the device.

    I'm hoping it's a one-time thing, but who knows...

  • Can't say for sure. One thing I'm certain of is that SolarWinds won't and can't "simply remove hardware from a device" for no reason; it does not have that capability. I suggest opening a support case so they can dig deeper into your issue.

  • Removing a power supply is not a fault or failure; it's as if the device never had that fan or power supply. If the power supply fails (yank the power cord, not the entire power supply) or a fan fails (stick a pen or paperclip into the fan blade to stop it from spinning), SAM's hardware health will alert you to these types of failures. It will not, however, alert you if someone is stealing your power supplies.

  • As you can see from the snmpwalk, the switch still saw the power supply in a failed state, but NPM simply stopped showing the power supply altogether. Same story for the fan.

    So I still don't have full trust in the hardware monitoring of our devices, I must say.

    .1.3.6.1.4.1.9.9.13.1.5.1.3 ciscoEnvMonSupplyState OBJECT-TYPE

    -- FROM CISCO-ENVMON-MIB

    -- TEXTUAL CONVENTION CiscoEnvMonState

    SYNTAX Integer { normal(1), warning(2), critical(3), shutdown(4), notPresent(5) }

  • Did removing the node and re-adding it fix the power supply disappearing from the hardware sensors, so that it is reported as down rather than not showing up at all when it is merely unplugged?
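
For anyone wanting to check other devices for the same symptom, the walk output from the original post can be decoded against the CiscoEnvMonState values quoted in the thread. The following is only a minimal sketch in Python, under a few assumptions: the walk text has the same line format as the output pasted above, the state table contains only the five values quoted in the thread (anything else is reported as unknown), and `monitored_indexes` is a hypothetical, hand-collected set of the sensor indexes that the monitoring tool currently displays. It does not call any SolarWinds API.

```python
import re

# State names as quoted in the thread from CISCO-ENVMON-MIB (CiscoEnvMonState).
# Only the values quoted there are listed; anything else is reported as unknown.
ENVMON_STATES = {
    1: "normal",
    2: "warning",
    3: "critical",
    4: "shutdown",
    5: "notPresent",
}

# ciscoEnvMonSupplyState, the column walked in the original post.
SUPPLY_STATE_OID = "1.3.6.1.4.1.9.9.13.1.5.1.3"

# Matches result lines such as:
#   18-12-2018 09:33:31 (6 ms) : 1.3.6.1.4.1.9.9.13.1.5.1.3.1016 = "4" [ASN_INTEGER]
WALK_LINE = re.compile(
    r'(?P<oid>\d+(?:\.\d+)+)\s*=\s*"(?P<value>\d+)"\s*\[ASN_INTEGER\]'
)


def parse_supply_states(walk_text):
    """Return {sensor index: state name} for every supply-state row in the walk."""
    prefix = SUPPLY_STATE_OID + "."
    states = {}
    for line in walk_text.splitlines():
        match = WALK_LINE.search(line)
        if not match or not match.group("oid").startswith(prefix):
            continue  # header lines, blank lines, or other OIDs
        index = match.group("oid")[len(prefix):]
        value = int(match.group("value"))
        states[index] = ENVMON_STATES.get(value, f"unknown({value})")
    return states


def missing_from_monitoring(snmp_states, monitored_indexes):
    """Sensor indexes the device reports over SNMP but the monitoring view lacks.

    monitored_indexes is a hypothetical input: a collection of index strings
    gathered by hand (or from an export) from the monitoring tool.
    """
    return sorted(set(snmp_states) - set(monitored_indexes))


# The walk output pasted in the original post.
SAMPLE_WALK = """\
18-12-2018 09:33:31 (2 ms) : SNMP V2c
18-12-2018 09:33:31 (2 ms) : Walk 1.3.6.1.4.1.9.9.13.1.5.1.3
18-12-2018 09:33:31 (6 ms) : 1.3.6.1.4.1.9.9.13.1.5.1.3.1016 = "4" [ASN_INTEGER]
18-12-2018 09:33:31 (8 ms) : 1.3.6.1.4.1.9.9.13.1.5.1.3.1017 = "1" [ASN_INTEGER]
"""

states = parse_supply_states(SAMPLE_WALK)
print(states)
# Assume the monitoring view only shows sensor 1017, as in the situation described.
print(missing_from_monitoring(states, {"1017"}))
```

With the walk from the original post, sensor 1016 decodes to shutdown and 1017 to normal, and 1016 is the one flagged as reported by the switch but absent from the monitoring view.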