UPS Standardization & Alerting

It seems that every UPS manufacturer has its own mess of MIB OIDs that we have to wade through. It would be great if there were an integrated monitor that standardized all those OIDs into a single alertable field. Maybe we could make something like that work today, but it seems to take a lot of extra custom pollers with different names, which makes alerting more difficult than I would like.

My thought is that we really only care about a few fields: (On Battery), (Running Self-Test), (Overall Load) and (Remaining Run-time). Extras would include phase input and output, and voltage per phase (if it's a 3-phase UPS).

It would be nice to have a standardized page just for UPS monitoring that could display the proper bars, graphs, etc. If the page could also monitor for failed batteries, that would be icing on the cake. Really, it's like taking the hardware monitor and making it work for all UPS devices.

Really, I just need to know when a UPS goes onto battery, plus a critical alert when the batteries are below some percentage (say 50%) or the remaining run-time is below X minutes.

8 Comments
Level 13

I totally agree with you, JustinY; this would be awesome. I would add battery lifetime management. Some UPSes report the last time the battery was changed, so we could plan battery replacements. I would also make sure this feature is adapted for PDUs.

Level 11

I had to set up three alerts, each with two groups of conditions, because each UPS vendor has different OIDs and each poller needs a unique name. As far as I know, I cannot map vendor-specific OIDs to a common name. Maybe that's another feature request: "generic" pollers with conditions for particular vendors. The easiest way would be to provide a list of known OIDs on which the value exists, but then we have to deal with differences in the raw data and how to standardize it. APC reports the remaining runtime in ticks and Powerware reports it in seconds, so in my alert I had to do some math on the alert threshold values.
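
The "generic poller" idea could be sketched roughly as follows, using the two on-battery triggers from the first alert in this comment. This is only an illustration in Python, not something Orion provides; the table and the function name `is_on_battery` are hypothetical, and the poller names are simply the ones used in this thread.

```python
# Hypothetical lookup table: vendor-specific poller name -> raw value that
# means "on battery". Names and enum values are taken from the triggers
# listed in this comment; nothing here is a built-in Orion feature.
ON_BATTERY_VALUES = {
    "PWxupsBatteryAbmStatus": 2,  # Eaton: batteryDischarging(2)
    "upsBasicOutputStatus": 3,    # APC: onBattery(3)
}


def is_on_battery(poller_name: str, raw_value: int) -> bool:
    """Map a vendor-specific status value onto one common On Battery flag."""
    if poller_name not in ON_BATTERY_VALUES:
        raise ValueError(f"no on-battery mapping for poller {poller_name!r}")
    return raw_value == ON_BATTERY_VALUES[poller_name]
```

With a table like this, a single alert condition could test the common flag instead of one condition group per vendor.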

Alert:

Alert me when a UPS goes (On Battery)

Triggers:

Eaton PWxupsBatteryAbmStatus = batteryDischarging(2)

APC upsBasicOutputStatus = onBattery(3)

Message:

The UPS ${NodeName} is now (On Batteries).

Alert:

Alert me when a UPS has less than 50% (Battery Remaining)

Triggers:

Eaton PWxupsBatCapacity = percentage of battery capacity remaining.

APC upsAdvBatteryCapacity = percentage of battery capacity remaining.

Message:

The UPS ${NodeName} has less than 50% (Battery Remaining)

Alert:

Alert me when a UPS has less than 20 min (Remaining Run-time)

Triggers:

Eaton PWxupsBatTimeRemaining = remaining runtime in seconds.

APC upsAdvBatteryRunTimeRemaining = remaining runtime in ticks (6000 ticks = 1 min).

Message:

The UPS ${NodeName} has less than 20 min (Remaining Run-time)
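
The threshold math for the run-time alert can be illustrated with a short Python sketch. It assumes the units stated in the triggers (seconds for Eaton, ticks of 1/100 second for APC); the function names `runtime_minutes` and `runtime_is_low` are hypothetical.

```python
SECONDS_PER_MINUTE = 60
TICKS_PER_MINUTE = 6000  # 1 tick = 1/100 second, so 6000 ticks = 1 minute


def runtime_minutes(poller_name: str, raw_value: float) -> float:
    """Convert a vendor-specific remaining-runtime value to minutes."""
    if poller_name == "PWxupsBatTimeRemaining":         # Eaton: seconds
        return raw_value / SECONDS_PER_MINUTE
    if poller_name == "upsAdvBatteryRunTimeRemaining":  # APC: ticks
        return raw_value / TICKS_PER_MINUTE
    raise ValueError(f"unknown runtime poller {poller_name!r}")


def runtime_is_low(poller_name: str, raw_value: float,
                   threshold_minutes: float = 20) -> bool:
    """True when the remaining runtime is below the threshold in minutes."""
    return runtime_minutes(poller_name, raw_value) < threshold_minutes
```

In raw units, the 20-minute threshold works out to 1200 seconds for the Eaton poller and 120000 ticks for the APC poller.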

One other thing is needed: the warning message (when a device goes to battery) needs to wait 5 to 10 minutes before being sent. Otherwise you could get a lot of invalid messages (noise), e.g. a battery self-test where the UPS goes to battery for one to two minutes and then goes back to normal.
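
The delay logic could look something like the sketch below. It is a hypothetical Python illustration, not how the alert engine is implemented: the class name `OnBatteryDebouncer` and the 300-second default hold time are assumptions for the example.

```python
from typing import Optional


class OnBatteryDebouncer:
    """Fire one alert only after the UPS has stayed on battery for hold_seconds."""

    def __init__(self, hold_seconds: float = 300.0):
        self.hold_seconds = hold_seconds
        self._since: Optional[float] = None  # time the current outage started
        self._alerted = False                # alert already sent for this outage

    def update(self, now: float, on_battery: bool) -> bool:
        """Feed one poll result; return True when the alert should be sent."""
        if not on_battery:
            # Back on mains power (e.g. a self-test finished): reset state.
            self._since = None
            self._alerted = False
            return False
        if self._since is None:
            self._since = now
        if not self._alerted and now - self._since >= self.hold_seconds:
            self._alerted = True
            return True
        return False
```

A two-minute self-test never reaches the hold time, so it produces no message, while a real outage triggers exactly one alert once the hold time has elapsed.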

Level 11

I found that the Hardware Sensors built into Cisco gear and other servers are very handy to use as temperature probes. It sure would be nice if the UPS hardware/temperature/humidity sensors could be used as easily.

Level 11

So apparently this has been implemented.  I noticed that our APC gear now has a resource option of UPS.

APC UPS Monitoring with ORION PLATFORM 2018.2

It does not work with our Eaton units, but maybe sometime in the future?

jreves cobrien

Level 15

Can you send me an SNMP walk using our tool (see the Success Center)?

I'll get a ticket created so we can track it.

Level 11

Sure thing. I uploaded the file and shared it with you.

Level 9

I recently put in a feature request asking for a similar thing, based on the APC SAM app referenced above. It is currently only for APC, but it should be fairly easy to add a pulldown selection for the UPS OEM that would change the root OID/MIB and the MIB calls behind each gauge or panel shown in the app. Please see and vote for this feature request.

https://thwack.solarwinds.com/ideas/11279