Hi All,
I'm really struggling to set up an alert for firewall failover on our Check Point firewalls. I've located the UnDPs and set these up, and made sure we are polling them, but I don't really know how to set up alerting on them - can any of you help?
What condition is it that indicates to you that there has been an HA Failover? Is there a specific value in one of those UnDP Pollers you expect to see?
To be honest, I'm kinda clutching at straws, as I'm fairly new to SolarWinds. I have set up basic alerts but nothing like this.
I was hoping someone would have set up an alert for Check Point failovers and could show me a screenshot of what they have done.
I've tried the alerts below, but they do not seem to fire when the Check Point fails over.
OK, so you've done the hard part (in my opinion) and found the OID and created the UnDP.
What you now need to do is identify the differences between the node in service and the one that isn't. So if Node 1 (for example) shows a result of 0 for your HA OID, the failover node shows a different result, and those results change on failing over, then you should be able to use that result as part of your trigger.
We do a very similar thing for Fortinets, and it took me a while to get there, but this is what we do:
Set up your trigger condition as below. Once you select the two dropdowns as per the image, you'll get a bit of pre-built SWQL - a la:
Then underneath is where the 'magic' happens. Take this code:
INNER JOIN (
    SELECT CustomPollerAssignmentID, Count(DISTINCT RawStatus) AS Statuses
    FROM Orion.NPM.CustomPollerStatistics
    WHERE Datetime>Addminute(-10,GETUTCDATE())
    GROUP BY CustomPollerAssignmentID
) AS History
ON CustomPollerAssignmentOnNode.CustomPollerAssignmentID=History.CustomPollerAssignmentID
WHERE CustomPollerAssignmentOnNode.CustomPollername='Fortigate_HA_State_Change'
AND History.Statuses>1
You will need to change the 2nd WHERE line to reflect your UnDP name (see highlighted).
WHERE CustomPollerAssignmentOnNode.CustomPollername='Fortigate_HA_State_Change'
What this bit of SWQL is doing is counting how many distinct values the poller has returned over the lookback window (that's the AND History.Statuses>1 line). In other words, if the node has failed over, the OID query will have returned a different result from before, so there will be more than one distinct value and the alert will trigger.
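To make the logic easier to follow, here is a rough Python sketch of what the subquery is doing: count the distinct raw values seen inside the lookback window and trigger when there is more than one. This is only an illustration, not something SolarWinds runs; the function names, the sample data and the timestamps are all made up for the example.

```python
from datetime import datetime, timedelta, timezone


def distinct_statuses(samples, now, lookback_minutes=10):
    """Count distinct raw values polled inside the lookback window.

    `samples` is a list of (polled_at, raw_status) tuples; this shape is
    an assumption for the sketch, not the Orion schema.
    """
    cutoff = now - timedelta(minutes=lookback_minutes)
    return len({status for polled_at, status in samples if polled_at > cutoff})


def should_trigger(samples, now, lookback_minutes=10):
    # Mirrors "History.Statuses > 1": more than one distinct value in the
    # window means the polled value changed at some point.
    return distinct_statuses(samples, now, lookback_minutes) > 1


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # made-up timestamp

# Same value on every poll: nothing changed.
stable = [(now - timedelta(minutes=m), 1) for m in (8, 6, 4, 2)]
# Value went from 1 to 2 inside the window: a failover.
failed_over = [(now - timedelta(minutes=8), 1), (now - timedelta(minutes=2), 2)]
# The change happened 30 minutes ago, outside the 10-minute window.
old_change = [(now - timedelta(minutes=30), 2), (now - timedelta(minutes=5), 1)]

print(should_trigger(stable, now))       # False
print(should_trigger(failed_over, now))  # True
print(should_trigger(old_change, now))   # False
```

Note that a change older than the window no longer counts, which is why the length of the lookback matters.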
The other bit you need to know about is this line: WHERE Datetime>Addminute(-10,GETUTCDATE())
This is basically saying check back 10m for any differences. Adapt that to suit your environment. Our whole trigger condition tab looks like this:
We have the condition evaluated every minute. If a change is found, it has to have been in that state for at least two minutes and have occurred within the last 10 minutes.
In the case of a Fortinet we check the OID 1.3.6.1.4.1.12356.101.3.2.1.1.4, which is also known as fgVdEntHaState (the High Availability state). Its possible states are: 1 = Primary, 2 = Secondary, 3 = Standalone.
We started to do this with Check Points, but have by and large moved away from them. The few clients that remain with them use them in an Active/Active state, so failover detection is not necessary.
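If you want the alert message to say something readable instead of a bare number, the Fortinet state values listed above can be mapped to names. A minimal Python sketch, assuming you have the previous and current polled values to hand; the function name and message format are hypothetical:

```python
# fgVdEntHaState values as listed in the post above.
HA_STATES = {1: "Primary", 2: "Secondary", 3: "Standalone"}


def describe_change(previous, current):
    """Return a readable message for a role change, or None if nothing changed."""
    if previous == current:
        return None
    old = HA_STATES.get(previous, f"Unknown({previous})")
    new = HA_STATES.get(current, f"Unknown({current})")
    return f"HA state changed: {old} -> {new}"


print(describe_change(1, 2))  # HA state changed: Primary -> Secondary
print(describe_change(2, 2))  # None
```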
brilliant !!!!! thank you so much, very well explained. I will give this a go and keep you posted.
Beaut of a post
Set up the alert using this OID:
1.3.6.1.4.1.2620.1.5.6 ("haState"), which is what Check Point uses; its value is "Active" on the active member
Configure alert like:
You can change the scope of the alert to suit, e.g. Node name and custom poller = HaState etc.
Then for the trigger condition use CurrentValue like below:
This will generate SQL query like: