Seems like you could set up an individual alert for MS1: if it is DOWN, then use MS2 as the SMTP server in the Trigger Actions\Email section.
I too am struggling with the ability to apply alerts to individual nodes/interfaces without having to create a single alert for each one. I've used other products that have this functionality and one as robust as NPM should have it.
I don't think NPM possesses that type of logic. It simply sees that a node is down and asks if any of the conditions apply to it and then activates the trigger. It doesn't have the ability to see that a node is down and then ask if another node is Up before sending. What I've done is keep my simple alert which alerts me via SMTP1 if any node is down and a separate alert that sends an alert via SMTP2 if SMTP1 server is 'Not Up'. It's a bit silly to not have this ability in an otherwise stellar product, but perhaps I'm overlooking something.
I think you want to look at the Alert Suppression tab for the alert. This will allow you to have a condition that looks at a different node's stats, and it should suppress the alert if the mail server is down.
In the example below, this alert will get suppressed if MyMailServer is down.
Hope this helps
By setting a suppression condition, no alerts were sent at all. I did have the order of the conditions reversed from yours, but that is what I initially tried. I also tried 'Not Up' instead of Down and that didn't work. Once I removed the suppression, alerts started again.
I have used this method before and it worked as designed. Look under "Suppress Alert when ALL of the following apply" and make sure you are using ALL, not ANY; otherwise it will suppress on any single condition in the suppression section. I am assuming your SMTP server is a node that NPM is monitoring.
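To illustrate why ALL versus ANY matters here, below is a minimal Python sketch. It is a toy model, not NPM's actual evaluation logic: it assumes suppression is active whenever some monitored node satisfies the suppression conditions, and the names `suppression_active`, `conditions`, and the node dictionaries are made up for this example.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-ins for monitored nodes and suppression conditions.
Node = Dict[str, str]
Condition = Callable[[Node], bool]


def suppression_active(nodes: Iterable[Node],
                       conditions: List[Condition],
                       mode: str = "ALL") -> bool:
    """Return True if some node satisfies the conditions under the given mode.

    Assumption: this mirrors the intent of the suppression settings discussed
    in this thread; it is not taken from NPM itself.
    """
    if mode not in ("ALL", "ANY"):
        raise ValueError("mode must be 'ALL' or 'ANY'")
    combine = all if mode == "ALL" else any
    return any(combine(cond(node) for cond in conditions) for node in nodes)


# Illustrative conditions: the mail server node is Down.
conditions: List[Condition] = [
    lambda node: node["status"] == "Down",
    lambda node: node["name"] == "MailServer",
]
```

With ALL, only a node that is both named MailServer and Down turns suppression on. With ANY, every Down node matches the first condition on its own, so in this model any outage would suppress its own alert.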
As I posted in my first comment, yes, it was set to 'alert when All apply'. Maybe the order of the conditions is important? I tested using...
node status is equal to Down
node name is equal to MailServer
...and it definitely suppressed all alerts: I tested by rebooting a few servers many times and got bupkis until I removed the suppression conditions. I also tried creating a group under the suppression event, with no luck. I thought maybe my second mail server was the culprit, but SMTP works fine when tested through it.
I'll try this method again, this time with the mailserver condition before the status and see what I get. Thanks for the posts!
I think this is getting a little more complicated than it needs to be. What if you create your primary alert for "if server is down send email through primary mail" and leave that one alone, but set up another alert that is "if primary mail server is down send mail through secondary mail"? Why suppress the other alert because that mail server is down? Is there another motive/objective than just to get you an alert through another mail server when the primary is not available? (I always try to make things as simple as possible - less troubleshooting.)
One moment -- are you wanting to check the status of your mail servers, or simply send an alert email through whichever mail server is up?
If the latter, then it seems to me that this doubles the work for every alert, whereas a small amount of programming in the alert manager would have fixed this.
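For what it's worth, the "send through whichever mail server is up" behaviour is only a few lines when sketched outside NPM. The following is an illustration in Python using the standard `smtplib` module, not anything the alert manager exposes; `send_with_fallback`, `send_via_smtp`, and the hostnames in the usage note are hypothetical.

```python
import smtplib
from email.message import EmailMessage
from typing import Callable, Iterable


def send_via_smtp(host: str, message: EmailMessage,
                  port: int = 25, timeout: float = 10.0) -> None:
    """Send one message through one SMTP server (raises on failure)."""
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.send_message(message)


def send_with_fallback(message: EmailMessage,
                       hosts: Iterable[str],
                       send: Callable[[str, EmailMessage], None] = send_via_smtp) -> str:
    """Try each SMTP host in order; return the first host that accepted the mail."""
    errors = []
    for host in hosts:
        try:
            send(host, message)
            return host
        except (OSError, smtplib.SMTPException) as exc:
            # Remember the failure and move on to the next server.
            errors.append((host, exc))
    raise RuntimeError(f"all SMTP servers failed: {errors}")
```

A call might look like `send_with_fallback(msg, ["smtp1.example.com", "smtp2.example.com"])`, where `msg` is an `EmailMessage` with its From, To, and Subject headers set. The `send` parameter exists only so the fallback order can be exercised without a real mail server.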
Do you have control over your DNS? Can you try putting a new hostname in it with two address records (those of the two mail servers)?
$ host appsubmit
appsubmit has address 10.64.39.203
appsubmit has address 10.64.72.169
i.e. create a new hostname (appsubmit is what we call it here) containing the address records for the two SMTP servers. If SolarWinds has coded the application correctly, it should try to connect to one IP address first and then, if that fails, to the second. There's no control over which one it tries first, but it should try all the A records until one succeeds.
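The "try every A record until one succeeds" behaviour described above can be sketched in Python as follows. This shows what a well-behaved client could do; whether SolarWinds actually does it is not something this thread confirms. The helper names `resolve_ipv4` and `connect_first_available` are made up for the example.

```python
import socket
from typing import Callable, Iterable, List, Tuple


def resolve_ipv4(hostname: str, port: int) -> List[str]:
    """Return the distinct IPv4 addresses for hostname, in resolver order."""
    infos = socket.getaddrinfo(hostname, port,
                               family=socket.AF_INET, type=socket.SOCK_STREAM)
    addresses: List[str] = []
    for info in infos:
        address = info[4][0]  # sockaddr is (ip, port) for AF_INET
        if address not in addresses:
            addresses.append(address)
    return addresses


def connect_first_available(addresses: Iterable[str], port: int,
                            timeout: float = 5.0,
                            connect: Callable = socket.create_connection) -> Tuple[str, object]:
    """Try each address in turn; return (address, connection) for the first success."""
    last_error = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout=timeout)
        except OSError as exc:
            last_error = exc  # this address failed; try the next record
    raise ConnectionError(f"no address accepted a connection on port {port}") from last_error
```

For instance, `connect_first_available(resolve_ipv4("appsubmit", 25), 25)` would walk both records from the `host` output above. Note that this only detects servers that refuse or time out at the TCP level; a host that accepts the connection but cannot deliver mail would still be picked.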
So, it turns out that I somehow got sidetracked from the KISS methodology and maybe was just fatigued. I set the primary alert to send all alerts through my primary SMTP server, and a second alert that sends through my secondary SMTP server but is suppressed while the primary is up. This seems to be working, although SolarWinds will be sending double alerts (the primary SMTP won't actually receive anything) if the primary is down. If for some reason the primary does not ping but is still up and accepting email, we will get double alerts. That is why I would like the primary alert to be suppressed if the mail server is down.
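The cases in that setup can be enumerated with a small Python sketch. It is a simplified model resting on stated assumptions: the secondary SMTP server is always healthy, the first alert always fires, and the second alert is suppressed only while the primary answers ping. The function name `delivered_copies` is made up for this example.

```python
def delivered_copies(primary_pings: bool, primary_accepts_mail: bool) -> int:
    """Count how many copies of one alert reach the inbox in the two-alert setup.

    Assumptions (not confirmed NPM behaviour): alert 1 always sends via the
    primary SMTP server; alert 2 sends via the secondary (assumed healthy)
    unless it is suppressed because the primary node responds to ping.
    """
    copies = 0
    if primary_accepts_mail:
        copies += 1  # alert 1 is delivered through the primary
    if not primary_pings:
        copies += 1  # alert 2 is not suppressed and goes out via the secondary
    return copies


for pings in (True, False):
    for accepts in (True, False):
        print(f"pings={pings} accepts_mail={accepts} -> {delivered_copies(pings, accepts)} copies")
```

Under these assumptions the "does not ping but still accepts email" case yields two copies, matching the double alerts described above. The model also shows a fourth case: a primary that pings but does not accept mail would yield no copies at all.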
The problem with internal DNS records (which I've tested for our mail system) is that even if the server is down, the SMTP send succeeds and the actual mail fails. Adding DNS records in round robin isn't load balancing. DNS, to my knowledge, simply sends requests to either one or the other record whether it is down or not. I've not been able to fully test our load balanced SMTP via our Netscalers, so I don't want to trust it with SolarWinds alerts just yet. Thanks, guys!