Pairing Monitoring and PowerShell Alert Actions for Auto-Remediation of Issues

Occasionally we run into situations where we need to restart a service to deal with a recurring issue. When the condition is clearly definable and the fix is always the same, automated remediation reduces the work required of the responsible party (in this case, me).

Example case:

Problem: I have found that my Windows agents occasionally stop communicating with the polling server even though the agent service is running.

Solution: Restart the agent service

Discussion:

A service monitor with an alert action that restarts the service will not work here, because the service is, in fact, still running. A different and more general solution is needed.

Solution:

I wrote a PowerShell script that restarts the agent service and logs each action for review (see below). It is set up as an alert action that triggers when the agent communication status is in one of the selected states (Unknown, Unable to Connect).

This approach can be paired with any alert trigger, including triggers driven by PowerShell-based monitoring.

PowerShell script:

Note: My initial alert was paired with an email notification, and I needed to alert on all agents, not just Windows ones, so I moved the OS check into the script rather than filtering in the trigger condition. The alert action passes the target server name and the OS family into the script.

<<<<script start>>>>

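#AlertActionTest.ps1
# Restarts the agent service on a remote Windows node and logs the outcome.
param(
   [string]$svrName,   # target server name, passed in by the alert action
   [string]$osFamily   # OS family of the target, passed in by the alert action
)

$outFile = "<PATH TO LOG FILE>"

$svcName = "SolarWindsAgent64"
$time = Get-Date

if ($osFamily -eq "Windows"){
   try {
      # Restart the service on the target and return its resulting state.
      # -ErrorAction Stop turns remoting or restart failures into catchable errors.
      $results = Invoke-Command -ComputerName $svrName -ArgumentList $svcName -ErrorAction Stop -ScriptBlock {
         param(
            $serviceName
         )
         Restart-Service -Name $serviceName -Force -ErrorAction Stop
         Get-Service -Name $serviceName
      }

      $message = "$time : Attempted to restart SolarWinds agent on: $svrName as $env:USERNAME : Final status $($results.Status)"
   } catch {
      # Log the failure instead of an empty status
      $message = "$time : Failed to restart SolarWinds agent on: $svrName as $env:USERNAME : $($_.Exception.Message)"
   }
}else{
   $message = "$time : Unable to restart agent on: $svrName since server is not a Windows machine"
}
Out-File -FilePath $outFile -Append -InputObject $message

exit 0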

<<<</script end>>>>
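
Since the script above appends one line per attempt to the log file, a small helper can make the "review" part easier. The following is only a sketch: the function name Get-AgentRestartLog and its parameters are my own invention for illustration, and it assumes the log lines keep the "on: <server name>" wording used in the script.

```powershell
# Hypothetical helper (not part of the original script): return the most recent
# log lines, optionally limited to one server.
function Get-AgentRestartLog {
    param(
        [Parameter(Mandatory)][string]$Path,   # the same file the alert script appends to
        [string]$ServerName = '',              # optional filter; empty returns all entries
        [int]$Last = 20                        # maximum number of matching lines to return
    )

    if (-not (Test-Path -LiteralPath $Path)) {
        return @()   # nothing has been logged yet
    }

    # Both message formats in the script contain "on: <server name>".
    # This is a prefix match, so "web1" also matches "web10".
    $pattern = [regex]::Escape("on: $ServerName")

    Get-Content -LiteralPath $Path |
        Where-Object { $_ -match $pattern } |
        Select-Object -Last $Last
}
```

For example, Get-AgentRestartLog -Path "<PATH TO LOG FILE>" -ServerName "server01" -Last 5 would show the last five entries for a node called server01 (a placeholder name).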

Alert action setup:

Alert Action: Execute an external program

Network path to external program: C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -ExecutionPolicy Unrestricted -NoProfile -File "<PATH TO SCRIPT on POLLING SERVER>" -svrName ${NodeName} -osFamily ${N=SwisEntity;M=Vendor}

Windows Authentication: Since this script uses Invoke-Command against the target server, you will probably need to explicitly provide credentials with sufficient permissions on the target to run the remote command and restart the service.
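
Before relying on the alert action, it may be worth confirming that the account you plan to use can actually reach the target over PowerShell remoting. The sketch below is an assumption-laden example, not part of my setup: Test-AgentRemoting is a hypothetical function name, and it is meant to be run by hand on the polling server while logged in as that account.

```powershell
# Hypothetical pre-check (illustrative only): can the current account reach the
# target over WinRM and read the agent service?
function Test-AgentRemoting {
    param(
        [Parameter(Mandatory)][string]$ComputerName,
        [string]$ServiceName = 'SolarWindsAgent64'   # service name used in the script above
    )

    try {
        # Confirms that the WinRM service on the target answers at all
        Test-WSMan -ComputerName $ComputerName -ErrorAction Stop | Out-Null

        # Confirms that the current account can run a remote command and see the service
        $svc = Invoke-Command -ComputerName $ComputerName -ArgumentList $ServiceName -ErrorAction Stop -ScriptBlock {
            param($name)
            Get-Service -Name $name -ErrorAction Stop
        }

        Write-Host "Remoting to $ComputerName works; $ServiceName is $($svc.Status)"
        return $true
    } catch {
        Write-Warning "Remoting check failed for ${ComputerName}: $($_.Exception.Message)"
        return $false
    }
}
```

Note that being able to read the service does not prove the account may restart it, so a successful check narrows the problem down rather than guaranteeing the alert action will work.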

Hope this helps someone.