When selecting the Hardware Health sensor, allow us to choose which hardware components we want to monitor.
I've just had a support case on this issue (Solarwinds, see case #459876), where the current hardware monitoring feature in NPM is generating a lot of false HW failures, and it would be VERY beneficial to be able to select, for each node, which HW types we want to monitor.
I'm thinking that a good way to implement this would be for each Node to have HW-monitoring enabled for certain categories like:
- Fan
- Power-supply
- Interface/Modules
- Temperature
If the feature should be even more granular, it would be nice if monitoring could be enabled/disabled per sensor: in the same way that you disable monitoring for an interface, you would be able to disable monitoring for a certain power supply, interface, or temperature sensor.
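To make the request concrete, here is a rough sketch of what "per category, then per sensor" could look like as a data model. This is purely illustrative: the class, the category keys, and the sensor names are all made up for this post and are not anything NPM actually exposes.

```python
from dataclasses import dataclass, field

# Hypothetical category keys based on the list above; not an NPM API.
CATEGORIES = {"fan", "power_supply", "interface_module", "temperature"}


@dataclass
class NodeHardwarePolicy:
    """Per-node monitoring policy (illustrative data model only)."""
    # Every category starts enabled; an admin can switch whole categories off.
    enabled_categories: set = field(default_factory=lambda: set(CATEGORIES))
    # Individual sensors that have been switched off by name.
    disabled_sensors: set = field(default_factory=set)

    def is_monitored(self, category: str, sensor_name: str) -> bool:
        # A sensor is monitored only if its category is enabled
        # and the sensor itself has not been disabled.
        return (category in self.enabled_categories
                and sensor_name not in self.disabled_sensors)


policy = NodeHardwarePolicy()
policy.enabled_categories.discard("temperature")  # turn off a whole category
policy.disabled_sensors.add("PSU 2")              # turn off a single sensor

print(policy.is_monitored("fan", "Fan 1"))               # True
print(policy.is_monitored("power_supply", "PSU 2"))      # False
print(policy.is_monitored("temperature", "Inlet Temp"))  # False
```

The point of the two-level check is that a category switch covers the common case, while the per-sensor list handles the one misreporting power supply or transceiver without losing the rest of the category.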
Yes. Yes, Yes, Yes, a thousand times Yes.
Can you spell granularity? I knew that you could. Can you add it as a feature? Please? Let the administrators decide what is monitored, and what is not. Great feature, great capability, but let me decide what I want monitored and displayed.
This definitely needs to be an option. In the past I have always monitored power supplies, fans, etc., as UnDPs, but not sensors on GBICs.
Also, when a node is unmanaged, its hardware sensors show up as undefined. This needs to be cleaned up. Just a parameter for status = '1'
Would anyone be interested in checking out some UX designs around this topic?
Rob, I am interested in participating.
I'd like to take a look too Rob.
Yes please!
In the hardware health alert widget, I'd like to see hardware alerts for APC UPSs, so we can see when there is a faulty battery, etc., as well as the other alerts that already appear, like disk and fan failures.
Is this topic still alive? Just ran across this and very interested and hopeful we'll soon see some granularity. Also would have liked to see some UX designs, but I might be a little late chiming in.
Thanks!
Optical transceiver levels need to be put into their own category. Currently they show up under "Other".
Is this issue still being looked at? We see the same issue on multiple switches and would like to have a checkbox to remove monitoring on the misreporting sensors.
Hardware Health Monitoring has been an issue for us. NPM thinks a 10G interface that is administratively down is in alarm:
Hardware sensor Te11/7 Receive Power Sensor of hardware health monitoring on inhs-dcb-spk.netops.inhs.org is up
Have you seen the new "Manage Hardware Sensors" functionality in NPM 11.5?
Is the fix as simple as adding "Interface Status not equal to Shutdown", or am I missing something?
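For what it's worth, the logic I have in mind is roughly the following. This is only a sketch of the filtering idea, not how NPM alerts are actually configured: the dictionary fields (`status`, `interface`, `admin_status`) and the function name are assumptions I made up to illustrate it.

```python
def should_alert(sensor: dict, interfaces: dict) -> bool:
    """Return True if a sensor reading should raise a hardware alert.

    Hypothetical field names; this only illustrates the
    "interface status not equal to shutdown" condition.
    """
    # A healthy sensor never alerts.
    if sensor["status"] == "up":
        return False
    # Look up the interface the sensor belongs to, if any.
    parent = interfaces.get(sensor.get("interface"))
    # Suppress alerts from sensors on administratively shut down interfaces.
    if parent is not None and parent["admin_status"] == "shutdown":
        return False
    return True


interfaces = {
    "Te11/7": {"admin_status": "shutdown"},
    "Te11/8": {"admin_status": "up"},
}

rx_shut = {"name": "Te11/7 Receive Power Sensor",
           "interface": "Te11/7", "status": "critical"}
rx_live = {"name": "Te11/8 Receive Power Sensor",
           "interface": "Te11/8", "status": "critical"}
fan = {"name": "Fan 1", "status": "critical"}  # not tied to an interface

print(should_alert(rx_shut, interfaces))  # False
print(should_alert(rx_live, interfaces))  # True
print(should_alert(fan, interfaces))      # True
```

Sensors that are not tied to an interface (fans, power supplies) fall through to the normal alert, so the extra condition only quiets transceiver sensors on ports that were shut down on purpose.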