Help on Alert for an "Active/Passive" Pair of Servers

I'm pretty darn good with NPM and SAM.  I've even gotten really (really) creative over the years in running 64-bit PowerShell scripts by wrapping them in a 32-bit VBScript.  Needless to say, I'm pretty well versed.  However, I've hit a conundrum in configuring either the SAM Template or the Advanced Alerting Engine.

Here's what I've got:  I've got two servers - call them EAST01 and WEST01.  They are in opposite data centers and they run the same software.  The software is made up of a set of services.  Some of those services (4) are in an Automatic state and I monitor them all the time.  The other services (3) are in a Manual state and should ONLY be running on one or the other server at any time.  I've written a fairly simple PowerShell SAM Script to watch that this is running on one or the other.  I'm including the script at the bottom.

This isn't a hard set of things to monitor, but I'd like to alert when the returned value is "0" (the service isn't running on either endpoint) and when there is a change (Server1 takes over for Server2 or vice versa).  To compound things, I've built a pretty decent custom view in Orion to show this stuff, and I want to be able to "see" that these things are in place.

Yeah - it's more than my simple "here's a script to do..." type of posting, but I'm asking for some assistance.

Script Summary Follows:

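# $ServiceName, $Server1IP/$Server2IP and $Server1Name/$Server2Name are
# assumed to be set earlier in the full script (this is only a summary).

# Query the same service on both servers.
$Server1Service = Get-Service -ComputerName $Server1IP -DisplayName $ServiceName
$Server2Service = Get-Service -ComputerName $Server2IP -DisplayName $ServiceName

$Server1OK = $Server1Service.Status -eq "Running"
$Server2OK = $Server2Service.Status -eq "Running"

# Per-node statistics: 1 = Running, 0 = anything else.
$Server1Stat = $Server2Stat = 0
if ( $Server1OK )
{
    $Server1Stat = 1
}
if ( $Server2OK )
{
    $Server2Stat = 1
}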

# The pair counts as online when the service runs on at least one server.
if ( $Server1OK -or $Server2OK )
{
    Write-Host ( "Message.Summary: " + $ServiceName + " on " + $Server1Name + "/" + $Server2Name + " Pair is Online" )
    Write-Host ( "Statistic.Summary: 1" )
    Write-Host ( "Message.Node1: " + $ServiceName + " is " + $Server1Service.Status + " on " + $Server1Name )
    Write-Host ( "Statistic.Node1: " + $Server1Stat )
    Write-Host ( "Message.Node2: " + $ServiceName + " is " + $Server2Service.Status + " on " + $Server2Name )
    Write-Host ( "Statistic.Node2: " + $Server2Stat )
    $ExitCode = 0
}
else
{
    Write-Host ( "Message.Summary: " + $ServiceName + " on " + $Server1Name + "/" + $Server2Name + " Pair is Down" )
    # Was "Message.Statistic: 0", which never set the Summary statistic.
    Write-Host ( "Statistic.Summary: 0" )
    $ExitCode = 1
}
Exit ($ExitCode)
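
One possible way to make a takeover visible as a single number is to fold the two per-node results into one statistic. This is only a sketch: Get-PairState and the "ActiveNode" statistic name are hypothetical and not part of the script above, and the example values stand in for the real Get-Service checks.

```powershell
# Hypothetical helper: collapse the two per-node results into one number.
# 0 = running on neither, 1 = Server1 only, 2 = Server2 only, 3 = both.
function Get-PairState {
    param(
        [bool]$Server1OK,
        [bool]$Server2OK
    )
    $state = 0
    if ($Server1OK) { $state += 1 }
    if ($Server2OK) { $state += 2 }
    return $state
}

# Example values; in the real script these come from the Get-Service checks.
$Server1OK = $true
$Server2OK = $false

$ActiveNode = Get-PairState -Server1OK $Server1OK -Server2OK $Server2OK
Write-Host ( "Statistic.ActiveNode: " + $ActiveNode )
```

With a statistic like this, 0 means the service is down everywhere, 3 means it is running on both servers at once, and a move between 1 and 2 marks a takeover.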

  • It looks like you're writing out different values whenever something interesting happens and you're changing the exit code based on a general pass/fail. Is the issue creating the alert definition then? You provided a lot of good information but I'm not seeing where you're having a problem.

  • Hi

    I guess your problem is that you get an OK if one or both services are running.

    That's the normal -or behavior.

    To get $true only when exactly one side of the comparison is true, you need to use the -xor logical operator.

    1 -xor 1 = 0

    1 -xor 0 = 1

    0 -xor 1 = 1

    0 -xor 0 = 0
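
    The same table can be checked directly in PowerShell. A minimal sketch, assuming two boolean flags like $Server1OK and $Server2OK from the original script; Test-ExactlyOneRunning is a hypothetical name. Note that PowerShell prints True/False here rather than 1/0.

    ```powershell
    # Hypothetical helper: true only when exactly one of the two flags is true.
    function Test-ExactlyOneRunning {
        param(
            [bool]$Server1OK,
            [bool]$Server2OK
        )
        return ($Server1OK -xor $Server2OK)
    }

    # Walk all four combinations of the two flags and print each result.
    foreach ($a in $false, $true) {
        foreach ($b in $false, $true) {
            $result = Test-ExactlyOneRunning -Server1OK $a -Server2OK $b
            Write-Host ( "$a -xor $b = $result" )
        }
    }
    ```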

    If I have understood you right (I couldn't find a specific question in your post), that is your solution.   Hmm, it's like 42  :-)

    Regards

  • This is a very common issue that is hard to wrap your head around with SolarWinds because agent-based solutions deal with it so differently.

    Here's the skinny: For any active-passive cluster, you are going to monitor 3 servers:

    1. member server 1
    2. member server 2
    3. virtual/cluster/fake server A

    On the two real member servers, you are going to monitor only that which does not move: the C: and D: drives, CPU, RAM, up/down, interfaces, etc.

    On the "pretend" server (the one that represents whichever real server is running the services you want at that moment) you monitor everything that DOES move from active to passive in a failover situation. That would include:

    • Services like SQLServer, Exchange, Sharepoint, Apache, etc.
    • Virtual mount points
    • Log files
    • ...etc...

    There is some duplication: In the case that both member servers die, you *will* get 3 tickets instead of two for example. But if you are smart about it you can minimize this overlap significantly. Then you don't need to do any brain-constipating and-or-ing logic in code across two servers (not to mention space and time).
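
    As a rough sketch of what a script assigned to the "pretend" server might look like: it only ever looks at one target, so the cross-server logic disappears. Everything here is an assumption for illustration. Get-VirtualNodeResult and the name CLUSTER01 are hypothetical, and the statuses are hard-coded where a real script would read them from the virtual node (for example with Get-Service).

    ```powershell
    # Hypothetical helper: given the statuses of the moving services as seen
    # on the virtual node, build the output lines and the exit code.
    function Get-VirtualNodeResult {
        param(
            [string]$NodeName,
            [string[]]$Statuses
        )
        # Count how many of the services report "Running".
        $running = @($Statuses | Where-Object { $_ -eq "Running" }).Count
        $total = $Statuses.Count
        # Exit code 0 only when every service is running, otherwise 1.
        $exitCode = 1
        if ($running -eq $total) { $exitCode = 0 }
        return [pscustomobject]@{
            Lines = @(
                "Message.Summary: $running of $total moving services Running on $NodeName",
                "Statistic.Summary: $running"
            )
            ExitCode = $exitCode
        }
    }

    # Example statuses for three services on a hypothetical virtual node.
    $result = Get-VirtualNodeResult -NodeName "CLUSTER01" -Statuses "Running", "Running", "Stopped"
    $result.Lines | ForEach-Object { Write-Host $_ }
    # A real script would finish with: Exit $result.ExitCode
    ```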