How can I stop an application alert from being triggered when a service is disabled manually?
Hello.
You could put the affected server (node) into maintenance mode to mute alerts when you know you are going to stop a service/process.
Best regards,
Steffen
No, that is not a viable option. I need to configure the alert correctly so this happens automatically. adatole, any ideas?
Let's start with my second-favorite question in monitoring: how do YOU know the service was disabled manually? What I mean is, imagine this scenario:
You get a frantic call from a user, because "it" is down.
You log onto the server.
You type a command or two.
AHA! You see that the service has been manually disabled.
So what commands did you just type?
That will help us figure out what type of monitor to set up and use, and how to configure the alert.
The backstory - our dev team routinely needs to disable services for updates/maintenance. When they do, a notification is posted in the appropriate Slack channel so we know to expect the alerts. While this arrangement works, I have been asked to rewrite the alert so that it can recognize (and ignore) when a service is intentionally disabled vs actually down.
To build on Leon's point, we would have to see the scope of the alert and the trigger condition to better answer your question.
Lord only knows how many times I've had an incorrect trigger condition that sent off way too many false-positive alert notifications.
The reason I say this is that the mute option should do the job, but if it doesn't, then odds are there is a bad trigger condition.
As stated, the objective is to remove the need to manually intervene. That is the reason that muting the alert will not accomplish the desired result.
Use an event rule to pick up the disable event log entry, then a PowerShell script trigger to engage the Unmanage Utility scheduling option.
But this sounds kludgy.
I don't see any way to do this efficiently, but I'm also not an MVP, lol.
If a service is down, it's down. Not much else you can do.
The only logical solution I can think of is writing a PowerShell script that looks at the Windows Event Log and has enough AI to recognize if the service was stopped manually or not. Then somehow leveraging this knowledge to mute and/or prevent the alert notification from being sent out.
If this is the case, then I would have to leave your solution in the hands of the PowerShell gurus.
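For what it's worth, the "recognize whether it was stopped manually" part may not need much intelligence. Here is a minimal sketch in Python (not PowerShell, and not tied to any SolarWinds API) of the classification logic only. It assumes the events have already been pulled from the Windows System log, and it leans on the Service Control Manager event IDs 7040 (start type changed), 7036 (service entered running/stopped state) and 7031/7034 (service terminated unexpectedly). The `ScmEvent` type, the `classify_outage` function and the English message matching are hypothetical names and assumptions for illustration; message wording is localized and should be checked on your own systems.

```python
from dataclasses import dataclass
from typing import Iterable

# Service Control Manager event IDs as they appear in the Windows System log.
SCM_START_TYPE_CHANGED = 7040              # start type changed (e.g. to disabled)
SCM_STATE_CHANGED = 7036                   # service entered running/stopped state
SCM_UNEXPECTED_TERMINATION = {7031, 7034}  # service terminated unexpectedly


@dataclass
class ScmEvent:
    """Hypothetical, already-parsed event record (not a real library type)."""
    event_id: int
    service: str
    message: str


def classify_outage(events: Iterable[ScmEvent], service: str) -> str:
    """Return 'disabled', 'crashed', 'stopped' or 'unknown' for one service.

    Events are expected oldest first. Assumes English message text.
    """
    verdict = "unknown"
    for ev in events:
        if ev.service != service:
            continue
        text = ev.message.lower().rstrip(". ")
        if ev.event_id in SCM_UNEXPECTED_TERMINATION:
            verdict = "crashed"
        elif ev.event_id == SCM_START_TYPE_CHANGED:
            if text.endswith("to disabled"):
                verdict = "disabled"
            elif verdict == "disabled":
                verdict = "unknown"  # start type changed away from disabled
        elif ev.event_id == SCM_STATE_CHANGED:
            if "running state" in text:
                verdict = "unknown"  # service is back up, nothing to explain
            elif "stopped state" in text and verdict == "unknown":
                verdict = "stopped"  # clean stop, but no proof of who did it
    return verdict
```

Note that a clean stop on its own only yields `stopped`: the log entry shows the service stopped cleanly, not that a person did it on purpose. Only the start type change to disabled is treated as evidence of intent in this sketch.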
This helps, but I am still curious - if you just walked up to the machine without any other information, could YOU tell if it was manually disabled versus actually honest-to-goodness down?
It sounds like "no", but please tell me if that's a false presumption.
That said, here's another option:
Alert #1: update custom property when service is manually disabled.
1) create custom property called "service-disabled" (a y/n field)
2) create a SAM monitor that JUST looks for the eventlog message logged when someone manually turns off the service in question, and assign it to the systems you want to watch. This message SHOULD be different from the one logged when the service crashes.
3) when THAT alert triggers, update the custom property to "yes"
4) the reset trigger is when the service comes back up. That triggers to change the custom property to "no".
NOTE: Changing a custom property is a built-in trigger action.
Alert #2: the one you care about
NOW... update the alert you already have so that the scope (not the actual trigger) includes "service-disabled" = "no"
You'll also want to create a daily report that shows all the systems where "service-disabled" = yes just so you can catch systems that are in that state for an extended period of time. But that's just basic blocking and tackling.
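To make the interaction between the two alerts easier to follow, here is a small Python simulation of the logic described above. It does not call any SolarWinds API; the `Node` class and the function names are hypothetical stand-ins for the custom property, the two alerts and the daily report.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a monitored system."""
    name: str
    service_up: bool = True
    service_disabled: str = "no"  # the y/n custom property from step 1


def alert1_trigger(node: Node) -> None:
    """Alert #1 trigger action: the 'manually disabled' event was seen."""
    node.service_disabled = "yes"


def alert1_reset(node: Node) -> None:
    """Alert #1 reset action: runs once the service is back up."""
    if node.service_up:
        node.service_disabled = "no"


def alert2_should_fire(node: Node) -> bool:
    """Alert #2: the property sits in the scope, not in the trigger."""
    in_scope = node.service_disabled == "no"
    return in_scope and not node.service_up


def daily_report(nodes: List[Node]) -> List[str]:
    """Names of all systems still flagged as manually disabled."""
    return [n.name for n in nodes if n.service_disabled == "yes"]


if __name__ == "__main__":
    app = Node("app01")
    app.service_up = False
    print(alert2_should_fire(app))  # down and in scope, so the alert fires
    alert1_trigger(app)
    print(alert2_should_fire(app))  # flagged as disabled, so out of scope
    print(daily_report([app]))
```

Keeping the property in the scope rather than the trigger means the "service down" condition itself stays untouched; the flagged systems simply drop out of the set the alert evaluates.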