I'm trying to set up an alert on SolarWinds. Very basic: alert me when an interface on a specific node goes down, and reset the alert when it's back up.
Firstly, I can't believe there isn't a template for this already. So I set the alert up using the rather disorganised GUI, which wasn't exactly a walk in the park, since it's not intuitive at all!
Here's the alert trigger condition:
And reset condition:
and the trigger action:
Reset action is pretty much the same.
This alert is enabled, but doesn't seem to trigger. Is something set up wrong?
Thanks for any help.
Thanks everyone. There are many ways to skin this cat. I opted for the marked solution, which was to trigger based on the interface ID, and set the qualifier to "not equal UP", as DOWN could appear as many different things depending on what caused the interface to drop!
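For anyone reading later, the difference between the two qualifiers can be sketched in a few lines of Python. This is only an illustration of the logic, not how SolarWinds evaluates alerts internally; the status labels and function names below are assumptions made up for the example, and the real set of statuses depends on your install.

```python
# Illustrative status labels only (assumed, not taken from SolarWinds).
NON_UP_STATES = ["Down", "Unplugged", "Shutdown", "Unknown"]


def triggers_on_equal_down(status: str) -> bool:
    """Original style of condition: fire only when status is exactly 'Down'."""
    return status == "Down"


def triggers_on_not_up(status: str) -> bool:
    """Accepted style of condition: fire on anything that is not 'Up'."""
    return status != "Up"


if __name__ == "__main__":
    # Compare what each condition does for every example status.
    for status in ["Up", *NON_UP_STATES]:
        print(
            f"{status:<10} "
            f"equal-Down: {triggers_on_equal_down(status)!s:<6} "
            f"not-Up: {triggers_on_not_up(status)}"
        )
```

With an "equal Down" condition, a port that reports some other non-up state never matches, which is consistent with the original alert staying silent.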
Thanks - configured as suggested:
Still doesn't trigger
Logically, my original trigger should have worked, but then I don't know what manner of database lookup shenanigans is going on under the hood!
Disregard previous post - found it on the URL link to the NPM page for the interface.
Updated trigger condition to this now:
EDIT: Looks like it's triggering now! Just need to figure out why the variables are coming out as plain-text variable strings instead of the actual values. Example email:
Ok, I think you need to validate the conditions you need to see in order to trigger the Alert. If you go to Settings > Manage Nodes - Then find the Interface in question. Select it and you will browse to the Details Page for that specific Interface. It will show you a summary screen:
My example shows my desk port, which is currently down as I'm not in the office. Because of this I have my Interface checked as "Unpluggable", which means my Orion doesn't show the Interface as down. Do you know if you have this set as well? If you don't, your screen would look more like this:
Either way, your switch will still show the port as "notconnect" in sh int status.
So once you have confirmed the status you want to Alert on, jump back to the Alert and you should be good:
Let me know how you get on. If you're still struggling, post a shot of your Interface Details page (blank out any sensitive data or PM it to me).
Keep in mind that this hard coded interface ID in your alert could bite you down the line. For example, if you ever ran List Resources and unchecked the interface, then later added it back (as I might do as a last resort if stats stopped on it and nothing else worked), the ID in the alert may no longer match. A reference to a self-documenting custom property would be a little easier to maintain, and harder to forget is there. The custom property approach certainly has its overhead in terms of your time and number of properties, but might be less likely to be overlooked down the line.
Because I'm a huge fan of custom properties, I feel obligated to show you an alternative to this hard coded interface index:
Not sure that will ever be a problem, as I'd just update the alert if that happened. Besides which, we generally don't go around moving interfaces or changing the listed resources SW looks at, unless there's a need to replace the switch. In this specific example, the interface I want to report on is our primary point-to-point 1Gb dark fibre to our datacenter, and the node is our core switch stack. This will NEVER be moved unless we replace the switch, hence the need for alerting.
I'm a huge fan of simple. In my experience SW is not simple and my experience today has certainly not changed this viewpoint lol. Who'd have thunk it that monitoring an interface on a network monitoring platform could be so complicated!
To be fair to them, it's one of those things where once you get the hang of it you'll be fine.
To be fair to you, different parts of the whole system have been upgraded at different rates, so some of the code is older and hasn't been revamped in a while. And as pointed out previously, it should be easier to set up and find out that down != unplugged.
Why don't you open up the interface in question and check that the Status shows as Down in the Interface Details page? If the interface is down, then there is no reason the standard "Interface Down" alert shouldn't pick it up. Did you try the alert without the Node filter and see if the Trigger list shows your interface in question? Depending on the number of down interfaces that might not be practical.
Well, this is the thing: I don't care about alerting on any other interface on this node or any other node. I also don't want it to query all 1200 nodes in our network just to check if this one interface is down! I just want to get an email from SW when this specific interface on this specific node goes down. Surely it can't be this hard?
The interface in question at the moment is a spare port that I'm testing on. The actual status is "notconnect" according to "show interface status". I assume (likely incorrectly) that Solarwinds translates this as "Down" when comparing alert conditions. I was testing by either unplugging the device from the port, or doing an admin shutdown on the port.
Sounds like this isn't going to be easy!
Have you considered setting a custom property (new or leverage an existing) that would match only this interface? And make that part of your condition.
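To make the idea concrete, here is a rough Python sketch of the two ways a condition could pick out the interface. It is only a model of the matching logic, not SolarWinds code: the `Interface` class, the IDs, the caption, and the `Alert_On_Down` property name are all hypothetical. It also assumes the interface gets a different ID if it is removed and added back, and that the property is set again on the re-added interface.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    """Minimal stand-in for a monitored interface (hypothetical fields)."""
    interface_id: int
    caption: str
    custom_properties: dict = field(default_factory=dict)


def matches_by_id(iface: Interface, wanted_id: int) -> bool:
    """Condition scoped with a hard coded interface ID."""
    return iface.interface_id == wanted_id


def matches_by_property(iface: Interface, name: str = "Alert_On_Down") -> bool:
    """Condition scoped with a custom property flag (name is made up)."""
    return iface.custom_properties.get(name) is True


# The same physical port before and after being removed and re-added.
before = Interface(1234, "Gi1/0/1 uplink", {"Alert_On_Down": True})
after = Interface(5678, "Gi1/0/1 uplink", {"Alert_On_Down": True})

if __name__ == "__main__":
    print("by ID, before:", matches_by_id(before, 1234))
    print("by ID, after:", matches_by_id(after, 1234))
    print("by property, after:", matches_by_property(after))
```

The property-based condition also documents itself: anyone looking at the interface can see it is flagged for alerting, without having to know which ID an alert happens to reference.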
This alert already exists out of the box. It is turned off by default because, in our experience, users get inundated with alerts for down interfaces when they first do a discovery. There is a slider to turn these alerts on or off: go to Alerts > Manage Alerts and search for "interface". http://oriondemo.solarwinds.com/Orion/Alerts/Default.aspx