I think the main challenge you're going to have is making SolarWinds aware of an issue with mains power at a remote site. Monitoring only a router will not give you this, as the device has no information on the state of the mains, whether it is available or not.
What you're going to need is some way to have a separate device tell you when mains power is lost. Most of the time, this would be via an SNMP trap received from some form of UPS. When the power fails and the UPS has to start providing power, most vendors support sending SNMP traps saying "Hey! Power has gone down!". For the rest of my post I'm going to assume you have this in place.
What you'd need to do is this:
- Create a new alert, naming it accordingly and setting the relevant properties in the 'Properties' tab.
- In trigger conditions, select I want to alert on "Custom SQL Alert".
- In the box, you need to specify the relevant SQL query to search for the SNMP trap from your remote site UPS. The table you need to point the query at is "Traps". It's easier if you only have traps for power failures coming from the remote device, as you then only need to search for the IP address of the device.
- At the bottom of the trigger window, expand 'advanced conditions' and then click the box to enable them.
- Add a secondary section, selecting 'and then after' for the logic.
- In the secondary section, specify nodes as the search, and put in the specific node that will go down when the battery power of the UPS dies.
- Complete the rest of the alert definition as needed.
With this in place, you should get an alert firing when the node goes down after the UPS has said it has a problem. The trigger time will be the time the power failed, and when the node comes back up the alert resets (by default, if you've left the reset condition as 'when trigger conditions are no longer true'), giving you the time power was restored.
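To make the logic of that two-stage alert concrete, here is a minimal Python sketch. It is not SolarWinds code: the function name `power_outage_window` and the `max_battery` cut-off are made up for illustration, and it assumes the UPS trap time is a fair stand-in for the moment mains was lost.

```python
from datetime import datetime, timedelta

def power_outage_window(trap_time, node_down_time, node_up_time,
                        max_battery=timedelta(hours=4)):
    """Return (power_lost, power_restored), or None if the events don't correlate.

    max_battery is a hypothetical cut-off: a node that goes down long after
    the trap is treated as unrelated to that power failure.
    """
    # 'and then after' logic: the node must go down after the UPS trap.
    if node_down_time < trap_time:
        return None
    # Ignore node-down events too far from the trap to be battery exhaustion.
    if node_down_time - trap_time > max_battery:
        return None
    # Trap time approximates power loss; node recovery approximates restoration.
    return trap_time, node_up_time
```

The tuple it returns corresponds to the trigger and reset times you would later pull into the report.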
Then, all you need to do is create a report specifically looking for the trigger and reset times of this alert.
I'd love to be able to give you the exact code for the first step, but I'm a neophyte DBA myself. I hope it at least helps you get to where you need to be!
Well, @Silverbacksays, that's a wonderful answer. It is more in line with what I was considering as my next step. Wow, now I know what to do next. Thanks a million bucks.
I've lots of geographically dispersed sites without a UPS.
Cisco routers report why they reloaded, so after the fact you can tell if they rebooted due to a power outage:
$ snmpwalk ...... 1.3.6.1.4.1.9.2.1.2.0
SNMPv2-SMI::enterprises.9.2.1.2.0 = STRING: "reload"
$ snmpwalk ....... 1.3.6.1.4.1.9.2.1.2.0
SNMPv2-SMI::enterprises.9.2.1.2.0 = STRING: "power-on"
You could grab that with a Universal Device Poller.
Now, after a node restores, you can check:
sysUpTime longer than the time since the outage started == the router never rebooted, so suspect the circuit provider
otherwise check whyReload = 'power-on' -> probably a power loss
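A rough Python sketch of that check, in case it helps. The function name `classify_outage` and the result strings are made up for illustration. It assumes you have already converted sysUpTime from hundredths of a second to seconds and have the whyReload string from the poller.

```python
def classify_outage(uptime_s, since_outage_start_s, why_reload):
    """Guess the cause of an outage once the node is reachable again.

    uptime_s:             router sysUpTime, in seconds
    since_outage_start_s: seconds elapsed since the node first went down
    why_reload:           string polled from whyReload, e.g. "power-on"
    """
    # Router has been up since before the outage began: it never rebooted.
    if uptime_s > since_outage_start_s:
        return "circuit provider"
    # Router rebooted; accept both 'power on' and 'power-on' spellings.
    if why_reload.strip().lower().replace(" ", "-") == "power-on":
        return "probable power loss"
    return "reloaded for another reason"
```

For example, a router with a day of uptime after a one-hour outage points at the circuit, while ten minutes of uptime with "power-on" points at a power loss.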