Does anyone know of a way to override the default behavior of APMServiceControl? It seems that if the node is being monitored with an agent, APMServiceControl will default to scheduling alert trigger actions via agent communication. This is fine in most instances; however, I'm attempting to create a fail-safe action for agent problems.
We have a large number of agents and each day a couple of them will stop communicating with the polling engine even though the service is running and no error conditions are logged. A simple service restart always fixes the issue. I've created a SAM template and alert that checks that the service is running via WMI polling, and compares that to the Agent status of "unknown".
The last step is to trigger a service restart for the Agent. When I create a test condition to stop the service and the agent is in "up" status, it works fine. However, when the service is down (hence the agent is in "unknown" status), the trigger action fails. APMServiceControl logs the following event:
2018-11-16 12:07:39,320 [1] ERROR APMServiceControl.ServiceControl - Action service failed. Unable to schedule job on agent node 1 - agent is not connected
The documentation seems to indicate that APMServiceControl defaults to communicating using whichever method NPM is configured to use for polling the node. I would like to override this so that it will communicate directly with the target Windows instance using WMI or RPC.
Has anyone found a configuration option that can override this default?
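To illustrate the kind of direct restart described above (bypassing the agent entirely), here is a minimal sketch in Python that shells out to the standard Windows sc.exe tool, which talks to the remote Service Control Manager over RPC. This is not an APMServiceControl option, only a possible workaround to run as an external program from the alert action. The service name "ExampleAgentService", the host name, and the function names are hypothetical placeholders, and the account running it is assumed to have administrative rights on the target.

```python
import subprocess
import time

# Hypothetical placeholder: substitute the real Windows service name of the agent.
AGENT_SERVICE = "ExampleAgentService"


def build_sc_command(host, action, service=AGENT_SERVICE):
    """Return the sc.exe argument list for a remote stop, start, or query."""
    if action not in ("stop", "start", "query"):
        raise ValueError(f"unsupported action: {action}")
    # sc.exe addresses a remote machine as \\hostname
    return ["sc.exe", rf"\\{host}", action, service]


def restart_remote_service(host, service=AGENT_SERVICE, wait_seconds=5,
                           run=subprocess.run):
    """Stop, wait, then start the service on a remote host.

    Returns True if the final start command exits with code 0.
    """
    # The stop may fail if the service is already stopped; not treated as fatal.
    run(build_sc_command(host, "stop", service), check=False)
    # Fixed pause to let the service finish stopping (an assumption; polling
    # with "query" would be more robust).
    time.sleep(wait_seconds)
    result = run(build_sc_command(host, "start", service), check=False)
    return result.returncode == 0
```

It would be called as restart_remote_service("target-host"), with the host name passed in from the alert (for example via an alert variable on the command line).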