
Trigger Actions randomly not firing.

I have a trap alert and reset trigger to fire SNMP Traps on all alerts and clears.

I am starting to see issues where traps are not being sent. If I look in the database, I see that the reset is being logged, but randomly it seems the reset trigger is not followed by the trap being sent. Any thoughts on why Orion would trigger the reset but not attempt the reset action?

8 Replies

This is what I found out from support. Apparently, if a node goes into the unmanaged state, Orion ignores any reset trigger actions for it. I would call this a serious design flaw, but since it does not appear to have caused issues for anyone else, maybe not.

To summarize the conversation: even if you have a reset trigger that says to reset when the node is unmanaged, there is a hard-coded override that resets the trigger and ignores the alert actions for the reset. I have submitted a feature request to correct this.


See my post here from when I experienced the same issue...

Clarification of alert behaviour

Hello

When a node goes into the unmanaged state, the alert engine stops executing actions on that node.

This can be turned off:

Settings -> Polling Settings, then check this option:

Allow alert actions for unmanaged objects: Alerting Engine will execute actions for network objects in unmanaged state.

After that, if someone unmanages something, the alert engine will trigger/reset as usual.


The way I read that, it will not only enable the reset trigger to work, but it will also allow the alert trigger to fire for an unmanaged node.


Yes, you are right. You can probably check this option and add a clause to the trigger condition so that it does not trigger if the node is in unmanaged status (node status is not equal to unmanaged).
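To make that combination concrete, here is a rough sketch in Python of how the two pieces would interact. It is only a model of the logic described above, not Orion code: the `Node` class, the `condition_met` field and both function names are made up for illustration, and the flag simply stands in for the "Allow alert actions for unmanaged objects" setting.

```python
from dataclasses import dataclass

UNMANAGED = "Unmanaged"  # assumed status label, for illustration only


@dataclass
class Node:
    name: str
    status: str          # e.g. "Up", "Down", "Unmanaged"
    condition_met: bool  # result of the alert's original trigger condition


def should_trigger(node: Node) -> bool:
    """Trigger condition with the extra guard: never trigger on an unmanaged node."""
    return node.condition_met and node.status != UNMANAGED


def actions_allowed(node: Node, allow_actions_for_unmanaged: bool) -> bool:
    """Whether trigger/reset actions would be executed for this node."""
    return allow_actions_for_unmanaged or node.status != UNMANAGED
```

With the setting enabled, `actions_allowed` is true for an unmanaged node, so a reset action can still send its trap, while the guard in `should_trigger` keeps new alerts from firing on that node.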


I'd turn up logging on the Alert manager process and read the logfiles from it to see what the process is doing.

Three ideas:

Idea 1:

It might be as straightforward as this: the process gets the list of nodes that the reset condition matches, and it then

   resets the alert state on those nodes.

   then subtracts the sets of nodes where the alert was not active (which may also include the nodes that were unmanaged) and the list of nodes that were unmanaged.

   then it fires the alert actions

(From what you have shown in the database this does not seem likely, but it might be that the sending is suppressed after the action is logged, giving idea 2.)

Idea 2:

so we might be getting:

   resets the alert state on those nodes.

   then subtracts the sets of nodes where the alert was not active

   then it fires the alert actions

   the action checks the unmanage state on the object alerting and returns without executing the action
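Ideas 1 and 2 can be put side by side in a small Python sketch. This is purely a model of the two guesses above, not how the alert engine is actually implemented; the function names and the use of plain sets of node names are my own assumptions.

```python
def reset_idea_1(matching, not_active, unmanaged):
    """Idea 1: unmanaged nodes are removed before any action is fired."""
    attempted = matching - not_active - unmanaged
    sent = sorted(attempted)  # every attempted action sends its trap
    return attempted, sent


def reset_idea_2(matching, not_active, unmanaged):
    """Idea 2: the action is fired, then returns early for unmanaged nodes."""
    attempted = matching - not_active
    sent = sorted(node for node in attempted if node not in unmanaged)
    return attempted, sent
```

Both variants end up sending the same traps. The only difference is whether an action is attempted at all for the unmanaged node, which is the kind of thing the alert manager logfiles should be able to tell apart.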

Idea 3:

An SNMP trap is unacknowledged: just because a trap was not received does not mean that it was not sent.
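Traps normally travel over UDP, so the sender gets no confirmation either way. A minimal Python illustration of that fire-and-forget behaviour follows; it sends arbitrary bytes, not a real SNMP trap PDU, and both helper names are made up for this sketch.

```python
import socket


def free_udp_port() -> int:
    """Return a local UDP port that (most likely) has no listener on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        probe.bind(("127.0.0.1", 0))  # let the OS pick a free port
        return probe.getsockname()[1]  # socket is closed when the block exits


def send_datagram(payload: bytes, host: str, port: int) -> int:
    """Send one UDP datagram and return the number of bytes handed to the OS."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))
```

Calling `send_datagram(b"test", "127.0.0.1", free_udp_port())` reports the payload as sent even though nothing is listening, which is why a missing trap at the receiver does not by itself prove the sender skipped it.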


I would think that if it was sent, NPM would write it to the database. I am noticing that it never fails to send a trap on a trigger, but it does on a reset. The rules are not missing the reset, so I know the rules are good. I have also noticed that when it does not log sending a trap, it does not log the NPMEventLog action either.
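For anyone wanting to run the same check over an export of the event data, here is a rough Python sketch of the comparison I am doing by eye. It assumes the rows have already been pulled out as `(timestamp, node, kind)` tuples with `kind` being `"reset"` or `"action"`; that layout, the function name and the 60-second window are my own assumptions, not the Orion schema.

```python
from datetime import datetime, timedelta


def resets_without_action(events, window=timedelta(seconds=60)):
    """Return (timestamp, node) for each reset with no logged action in the window."""
    actions = [(ts, node) for ts, node, kind in events if kind == "action"]
    missing = []
    for ts, node, kind in events:
        if kind != "reset":
            continue
        # A reset counts as handled if the same node logged an action soon after.
        followed = any(
            a_node == node and ts <= a_ts <= ts + window
            for a_ts, a_node in actions
        )
        if not followed:
            missing.append((ts, node))
    return missing
```

Any reset it returns is one where the engine logged the state change but no action, which is the pattern described above.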


Capture.JPG

The node went down at 8:20. At 8:22, someone put the device into the unmanaged state. As you can see, at 8:23 Orion saw the alert reset but did not send the trap.
