Implemented

FEATURE REQUEST - Arista Networks Hardware Health Monitoring

Product Managers,

In our network we run many Arista switches, but because NPM does not currently support polling Arista devices for hardware health, we cannot easily see our overall hardware health in the Hardware Health Summary widget. Unfortunately, a power supply failed in one of our Arista switches and went undetected because of this lack of integration, which led to a major outage for us.

Please add Hardware Health support to NPM for Arista Networks hardware. I am aware that Universal Device Pollers (UnDP) can be added for the product, but UnDP pollers do not integrate with the built-in hardware health features.

Best,

Mitch Mahan

P.S. Here are the MIB objects and OIDs you'll need to poll for the various hardware health statuses. I received these from my Arista SE (josh@arista.com).

Description of components:
ENTITY-MIB::entPhysicalDescr

.1.3.6.1.2.1.47.1.1.1.1.2.100711101 - Power supply fan sensor (PS1)
.1.3.6.1.2.1.47.1.1.1.1.2.100711102 - Power supply current sensor (PS1)
.1.3.6.1.2.1.47.1.1.1.1.2.100711211 - PowerSupply1 Fan 1 Sensor 1
.1.3.6.1.2.1.47.1.1.1.1.2.100721101 - Power supply sensor (PS2)
.1.3.6.1.2.1.47.1.1.1.1.2.100721102 - Power supply current sensor (PS2)
.1.3.6.1.2.1.47.1.1.1.1.2.100721211 - PowerSupply2 Fan 1 Sensor 1
.1.3.6.1.2.1.47.1.1.1.1.2.100601111 - Fan Tray 1 Fan 1 Sensor 1
.1.3.6.1.2.1.47.1.1.1.1.2.100602111 - Fan Tray 2 Fan 1 Sensor 1
.1.3.6.1.2.1.47.1.1.1.1.2.100603111 - Fan Tray 3 Fan 1 Sensor 1
.1.3.6.1.2.1.47.1.1.1.1.2.100604111 - Fan Tray 4 Fan 1 Sensor 1
.1.3.6.1.2.1.47.1.1.1.1.2.100006001 - Cpu temp sensor

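Then, for each component above, you can get the sensor type, scale, precision, value, and status. The value is a fixed-point integer: entPhySensorPrecision gives the number of decimal places, so a value of 294 with a precision of 1 reads as 29.4.
Here is the output for the CPU: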
s7153#show snmp mib walk SNMPv2-SMI::mib-2.99.1.1.1 | grep 100006001
ENTITY-SENSOR-MIB::entPhySensorType[100006001] = INTEGER: celsius(8)
ENTITY-SENSOR-MIB::entPhySensorScale[100006001] = INTEGER: units(9)
ENTITY-SENSOR-MIB::entPhySensorPrecision[100006001] = INTEGER: 1
ENTITY-SENSOR-MIB::entPhySensorValue[100006001] = INTEGER: 294
ENTITY-SENSOR-MIB::entPhySensorOperStatus[100006001] = INTEGER: ok(1)
ENTITY-SENSOR-MIB::entPhySensorUnitsDisplay[100006001] = STRING: Celsius
ENTITY-SENSOR-MIB::entPhySensorValueTimeStamp[100006001] = Timeticks: (234258) 0:39:02.58
ENTITY-SENSOR-MIB::entPhySensorValueUpdateRate[100006001] = Gauge32: 5000 milliseconds

Here is the output for Fan Tray 1 Fan 1:
s7153#show snmp mib walk SNMPv2-SMI::mib-2.99.1.1.1 | grep 100601111
ENTITY-SENSOR-MIB::entPhySensorType[100601111] = INTEGER: rpm(10)
ENTITY-SENSOR-MIB::entPhySensorScale[100601111] = INTEGER: units(9)
ENTITY-SENSOR-MIB::entPhySensorPrecision[100601111] = INTEGER: 0
ENTITY-SENSOR-MIB::entPhySensorValue[100601111] = INTEGER: 10800
ENTITY-SENSOR-MIB::entPhySensorOperStatus[100601111] = INTEGER: ok(1)
ENTITY-SENSOR-MIB::entPhySensorUnitsDisplay[100601111] = STRING: RPM
ENTITY-SENSOR-MIB::entPhySensorValueTimeStamp[100601111] = Timeticks: (318495) 0:53:04.95
ENTITY-SENSOR-MIB::entPhySensorValueUpdateRate[100601111] = Gauge32: 2000 milliseconds
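
To make the scaling concrete, here is a minimal Python sketch that parses walk output in the format shown above and turns the raw entPhySensorValue into a real reading. The function names (parse_walk, reading) and the SAMPLE text are my own for illustration, not anything from NPM or Arista, and the scale table is deliberately partial (only milli, units and kilo), so treat it as a starting point rather than a finished poller.

```python
import re

# Matches lines such as:
#   ENTITY-SENSOR-MIB::entPhySensorValue[100006001] = INTEGER: 294
LINE_RE = re.compile(r"ENTITY-SENSOR-MIB::(\w+)\[(\d+)\] = (\w+): (.*)")

# Partial mapping of entPhySensorScale enum values to powers of ten.
# Assumption: only the scales listed here are needed; others raise KeyError.
SCALE_EXPONENT = {8: -3, 9: 0, 10: 3}  # milli, units, kilo

# Sample input copied from the CPU walk output in this post.
SAMPLE = """\
ENTITY-SENSOR-MIB::entPhySensorType[100006001] = INTEGER: celsius(8)
ENTITY-SENSOR-MIB::entPhySensorScale[100006001] = INTEGER: units(9)
ENTITY-SENSOR-MIB::entPhySensorPrecision[100006001] = INTEGER: 1
ENTITY-SENSOR-MIB::entPhySensorValue[100006001] = INTEGER: 294
ENTITY-SENSOR-MIB::entPhySensorOperStatus[100006001] = INTEGER: ok(1)
ENTITY-SENSOR-MIB::entPhySensorUnitsDisplay[100006001] = STRING: Celsius
"""


def parse_walk(text):
    """Group walk lines by sensor index: {index: {object_name: raw_string}}."""
    sensors = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip prompts, blank lines, anything unrecognised
        name, index, _snmp_type, value = match.groups()
        sensors.setdefault(int(index), {})[name] = value
    return sensors


def enum_number(value):
    """Extract the number from an enum string such as 'units(9)'."""
    match = re.search(r"\((\d+)\)", value)
    return int(match.group(1)) if match else int(value)


def reading(sensor):
    """Return (scaled value, units string) for one parsed sensor."""
    raw = int(sensor["entPhySensorValue"])
    precision = int(sensor["entPhySensorPrecision"])
    scale = enum_number(sensor["entPhySensorScale"])
    value = raw * 10 ** SCALE_EXPONENT[scale]
    # A positive precision is the number of decimal places in the raw value.
    if precision > 0:
        value = value / 10 ** precision
    return value, sensor["entPhySensorUnitsDisplay"]


if __name__ == "__main__":
    cpu = parse_walk(SAMPLE)[100006001]
    print(reading(cpu))
```

With the CPU sample, the raw value 294 and precision 1 come out as 29.4 Celsius. The fan sensor has a precision of 0, so its value of 10800 is already in RPM.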

From the above, the OIDs for just the CPU sensor:

.1.3.6.1.2.1.99.1.1.1.1.100006001 - Cpu sensor type
.1.3.6.1.2.1.99.1.1.1.4.100006001 - Cpu temp value (an example value of 300 is really 30.0 degrees Celsius)
.1.3.6.1.2.1.99.1.1.1.5.100006001 - Cpu status
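
The same pattern should apply to every component in the list, since the last number in each OID is the entPhysicalIndex. As a rough sketch (sensor_oids is a name I made up for illustration), the OIDs to poll for any sensor could be generated like this, using the standard ENTITY-SENSOR-MIB column numbers:

```python
# Table bases taken from the OIDs listed in this post.
ENT_PHYSICAL_DESCR = ".1.3.6.1.2.1.47.1.1.1.1.2"   # ENTITY-MIB::entPhysicalDescr
ENT_PHY_SENSOR_ENTRY = ".1.3.6.1.2.1.99.1.1.1"     # ENTITY-SENSOR-MIB sensor table entry

# Column numbers within the sensor table entry.
SENSOR_COLUMNS = {
    "type": 1,       # entPhySensorType
    "scale": 2,      # entPhySensorScale
    "precision": 3,  # entPhySensorPrecision
    "value": 4,      # entPhySensorValue
    "status": 5,     # entPhySensorOperStatus
}

# entPhySensorOperStatus values.
OPER_STATUS = {1: "ok", 2: "unavailable", 3: "nonoperational"}


def sensor_oids(index):
    """Build the OIDs to poll for one sensor, keyed by a short name."""
    oids = {"descr": f"{ENT_PHYSICAL_DESCR}.{index}"}
    for name, column in SENSOR_COLUMNS.items():
        oids[name] = f"{ENT_PHY_SENSOR_ENTRY}.{column}.{index}"
    return oids


if __name__ == "__main__":
    # 100006001 is the Cpu temp sensor index from the list in this post.
    for name, oid in sensor_oids(100006001).items():
        print(f"{name:10} {oid}")
```

For hardware health purposes, a status other than ok(1) on a power supply or fan sensor is the condition we would have wanted an alert on.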