Event Log Monitor Keeps Reverting To Up

Hi all,

I am attempting to monitor Windows event logs on a group of servers for a specific event ID. I have created an application template to monitor event logs and specified the relevant event IDs. That part is working properly: a DOWN event is generated when this reproducible event ID is logged on the nodes. However, after about 5-15 minutes the node reverts to an UP state. This is what I am trying to prevent. I would like this event to function like most others, whereby the node only returns to an UP state when the condition no longer exists. I do not want to have to manually acknowledge the alert to reset it to green.

So my thought is that, when monitoring for an event ID, I would need a "reset trigger" that watches for a different event ID indicating the service has returned to an UP state (monitoring the service alone is not a valid indication of availability). Would I have to create a second application template (for example "Template UP") and specify the corresponding event ID(s) like I did for the down application template? That is, one template for the down state event ID and a separate template for the UP state event ID?

I want to assign this to a group of servers but do NOT want multiple application templates showing up under the All Applications widget. 

Looking for some guidance and suggestions on this one.

Thanks everyone.


  • The Event Log component monitor functions by only looking at events within a specific time frame based on your polling interval and a multiplier.

    It is detecting your event, and after a period of time that event no longer falls within that time frame, so the component goes back up. It needs to function this way; otherwise the component would stay down for as long as the event appears in the event log, whether it was logged 5 minutes ago or 5 days ago.

    You can see this setting when editing the Match Definition - it is the "Number of past polling intervals to search for events" and the default is 1.5. So if you have a 5 minute polling interval, it is only looking within the last 7.5 minutes of events.

    There are a few workarounds you could use with alerts or a second component. E.g., add a second component monitor that goes into a warning state when the Up event is found, and create an alert that triggers when the Down event component monitor goes down and resets when the Up event component goes into warning. Or put the Up one in a second template and edit the All Applications widget to filter out that application. That way the application monitoring would collect your up/down event data, but you'd rely on the alert to track the event status.

    Not sure of exact configuration specifics but that's where I would start.
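
To make the behaviour above concrete, here is a minimal Python sketch of the two ideas in this reply: the look-back window (polling interval times the multiplier) and an alert that latches on the Down event and only resets on the Up event. It is only a simulation for illustration, not SolarWinds code; the function names, the `LatchedAlert` class, the event times, and the rule that a Down event wins if both events fall in the same window are all assumptions.

```python
from dataclasses import dataclass


def lookback_minutes(polling_interval_min: float, multiplier: float = 1.5) -> float:
    """Window searched for events: polling interval x multiplier (default 1.5)."""
    return polling_interval_min * multiplier


def event_in_window(now_min: float, event_times_min: list, window_min: float) -> bool:
    """True if any event was logged within the last `window_min` minutes."""
    return any(now_min - window_min <= t <= now_min for t in event_times_min)


@dataclass
class LatchedAlert:
    """Hypothetical alert: triggers on the Down event, resets only on the Up event."""
    active: bool = False

    def update(self, down_seen: bool, up_seen: bool) -> bool:
        if down_seen:      # assumption: Down takes priority if both are seen
            self.active = True
        elif up_seen:
            self.active = False
        return self.active


def simulate(down_events, up_events, polling_interval_min=5, multiplier=1.5, end_min=40):
    """Return (poll time, component is down, alert is active) for each poll."""
    window = lookback_minutes(polling_interval_min, multiplier)
    alert = LatchedAlert()
    timeline = []
    for now in range(0, end_min + 1, polling_interval_min):
        down_seen = event_in_window(now, down_events, window)
        up_seen = event_in_window(now, up_events, window)
        timeline.append((now, down_seen, alert.update(down_seen, up_seen)))
    return timeline


if __name__ == "__main__":
    # Down event logged at minute 2, Up event logged at minute 31 (made-up times).
    for now, component_down, alert_active in simulate([2], [31]):
        print(f"t={now:>2} min  component_down={component_down}  alert_active={alert_active}")
```

With a 5 minute polling interval the window is 7.5 minutes, so in this run the component is down at the minute-5 poll and back up by the minute-10 poll, while the latched alert stays active until the Up event is seen at the minute-35 poll.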
