2 Replies Latest reply on Feb 10, 2009 9:18 AM by cameramonkey

    Delayed trap alerts?

    cameramonkey

      I am trying to simplify my power alerting and was hoping to use traps to achieve a more streamlined system.

      Presently, all of my APC UPSes are rather precariously configured to use email alerts to notify us of problems such as power loss, bad batteries, imminent low-battery shutdowns, etc. It's a pain to make sure all the configs are correctly copied to each and every unit.

      I am working towards using traps and trap alerts to notify me of these changes, so that instead of sweating the config of each new UPS, I can just plug in the trap server and I am done. I can then manage the necessary alerts on one box (Orion) instead.

      The problem is I don't need to know about every power hiccup that my UPS sees. I only care if it's been on battery for more than 30 seconds. As far as I can tell, the UPS only sends a trap on a change in status and does not repeat it at an interval. I unplugged one of my UPSes and waited 2 minutes, and I received only one on-battery trap and one off-battery trap.

      Is there a way to set up a trap alert that fires if one condition is met and another is NOT met within a span of time?

      For instance, if it gets a PowerNet-MIB:apc.0.5 trap (on battery) and does NOT receive a PowerNet-MIB:apc.0.9 trap (off battery) within 30 seconds, it should fire an alert that the UPS is on battery. If it DOES receive the apc.0.9 trap within the 30-second window, it should not fire the alert.
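The rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not an Orion feature or API: the class name `DelayedTrapAlerter`, its methods, and the idea of feeding it trap names and timestamps by hand are all assumptions made for the example. The trap names are the ones quoted in the post.

```python
ON_BATTERY = "PowerNet-MIB:apc.0.5"   # trap names as quoted in the post
OFF_BATTERY = "PowerNet-MIB:apc.0.9"


class DelayedTrapAlerter:
    """Hypothetical helper: alert only if an on-battery trap is not
    followed by an off-battery trap within hold_seconds."""

    def __init__(self, hold_seconds=30.0):
        self.hold_seconds = hold_seconds
        self.pending = {}  # UPS name -> time its on-battery trap arrived

    def on_trap(self, ups, trap, now):
        """Record a trap received from `ups` at time `now` (in seconds)."""
        if trap == ON_BATTERY:
            # Keep the earliest timestamp if a duplicate trap arrives.
            self.pending.setdefault(ups, now)
        elif trap == OFF_BATTERY:
            # Power came back: cancel any pending alert for this UPS.
            self.pending.pop(ups, None)

    def due_alerts(self, now):
        """Return the UPSes that have been on battery for at least
        hold_seconds. Each one is reported once, then forgotten."""
        due = sorted(
            ups for ups, started in self.pending.items()
            if now - started >= self.hold_seconds
        )
        for ups in due:
            del self.pending[ups]
        return due
```

A short flicker (on battery at t=0, off battery at t=2) never shows up in `due_alerts`, while a UPS that stays on battery is reported the first time `due_alerts` is called 30 or more seconds after its on-battery trap. Something still has to call `due_alerts` periodically, for example once a second, which is far cheaper than polling every UPS.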

      My goal is to alert only on a major power hit, and to avoid alerts caused by dirty power that makes a UPS switch to battery for a second or two, often several times a day at some sites.

      Ideas?

       

       

          • Re: Delayed trap alerts?
            cameramonkey

            I considered that as well and already have that for several other things like temperature, battery status, etc.  The problem I found is that it takes a while for the polling to pick up the change.  I was hoping for a more instantaneous alert.

            Worst case scenario: power drops immediately after a poll, so it will take another minute or two before the unit is polled again. Once it is polled, the change is picked up and the trigger delay clock starts ticking. Power has now been out for 90-120 seconds, and no alert has been issued because of the trigger delay. By the time the polling engine gets back to checking it, the power has been out for up to 4 minutes. Some UPSes only have enough capacity to keep the site up for 5 minutes before they start gracefully shutting things down. I had hoped to have more than 60 seconds of lead time before getting the "servers are shutting down" warning from the UPS.
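The worst case above can be written out as a small calculation. This is a rough sketch under stated assumptions: the alert condition is only re-evaluated when the UPS is polled, the poll interval is 120 seconds (the upper end of "a minute or two"), and the trigger delay is 30 seconds. The function name is made up for the example.

```python
import math


def worst_case_alert_delay(poll_interval, trigger_delay):
    """Seconds from power loss to alert, assuming the outage starts just
    after a poll and the alert condition is only checked at poll time."""
    # One full interval passes before the first poll notices the outage.
    # The trigger delay then has to elapse, and the alert can only fire
    # at the next poll after that.
    polls_to_confirm = math.ceil(trigger_delay / poll_interval)
    return poll_interval * (1 + polls_to_confirm)


delay = worst_case_alert_delay(poll_interval=120, trigger_delay=30)
runtime = 5 * 60  # a UPS sized for about 5 minutes of runtime
lead_time = runtime - delay
```

With these numbers the alert arrives 240 seconds (4 minutes) into the outage, which leaves only 60 seconds of lead time on a UPS with 5 minutes of runtime.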

            And yeah, this isn't a problem with Orion, just a weakness in how the whole technology works.

            Thanks for the suggestion though.