Hi All,
We have successfully implemented Monitoring of various DB services by integrating Solarwinds and powershell.
Issues:
- Each SQL Server had different set of services that needs to be monitored.
- In terms of monitoring, no template could be created because:
- There was no service common on the servers.
- No self discovery in terms of service is possible on the Solarwinds.
- Manual selection of service for more than 100 servers would take up more than 2 weeks to be done (if the person is doing only this thing).
- For Database Servers, for each servers All services needs to be monitored.
- In case the node is removed, all the services will have to be on-boarded all again with complete manual work n checking with teams regarding the services.
- Nodes where multiple SQL instances are working in active active state, unless both the instances(services) are down, until then no alert should be triggered.
Benefits of integrate using script:
- No Need to check any of the service that which all services needs to be monitored.
- Irrespective of the number of services and servers, the script can easily detect the services.
- Keywords can be used in service name, to get the required service.
- Even if the server is mistakenly removed from solarwinds, as n when it is added back properly in Solarwinds, the service monitoring will be enabled again. (Using groups, we can on-board a group in APM, rather than individual servers).
- If the cluster node is in active active state, at that time unless both the services are down, the alert wont be triggered.
- No confirmation needed from respective teams regarding which all service needs to be monitored.
NOTE: 1. Status of the Application Monitoring will be shown as Critical always, as if we provide the status as down, we wont get the information of which service is in Down state.2. The common thing in the monitoring of services for database is that, all the service names in services.msc starts with "SQL Server (*"
#1 Below Code is for getting all services that are in down
#################Exit Codes for Service Status######################
# 0 - Up
# 1 - Down
# 2 - Warning
# 3 - Critical
# Any other Value - Unknown (Example -4)
$Error1 =0
#Try-Catch to check if Solarwinds is able to reach the Server or not
try
{
$services = Get-WmiObject -Class win32_service –credential ${CREDENTIAL} -computername $args[0] | select *; #To get all services from the server mentioned as args
$svcs = $services | Where-Object {$_.DisplayName -like "SQL Server (*" -and $_.startMode -ne "Disabled"} #To filter services starting from "SQL Server ("
if($svcs) #If values in svcs is received, will enter in loop
{
$stopped = @();
$svcs | Where-Object {$_.State -ne "Running"} | ForEach-Object{ #to get all servers that are not running
$stopped += [String]::Format("{0} ({1})", $_.Name, $_.DisplayName);
}
Write-Host "Statistic.StoppedServices: $($stopped.Length)";
Write-Host "Message.StoppedServices: $([String]::Join(", ", $stopped))";
if($stopped.Length -ne 0) # if any of the service is in stopped/stopping state
{
exit 3
}
else # all services are up
{
exit 0
}
}
else # if no service is starting with "SQL Server ("
{
Write-Host "Statistic.Error: $($Error1.Length)";
Write-Host "Message.Error: $([String]::Join(", ", "No Service Found"))";
exit 5
}
}
catch [exception] #If not able to connect to the server
{
Write-Host "Statistic.Error: $($Error1.Length)";
Write-Host "Message.Error: $([String]::Join(", ", "$_"))";
exit 5
}
#2 Below Code is for getting all services that are in the server other than which are in disabled state:
#################Exit Codes for Service Status######################
# 0 - Up
# 1 - Down
# 2 - Warning
# 3 - Critical
# Any other Value - Unknown (Example -4)
$Error1 =0
#Try-Catch to check if Solarwinds is able to reach the Server or not
try
{
$services =Get-WmiObject -Class win32_service –credential ${CREDENTIAL} -computername $args[0] | select *; #To get all services from the server mentioned as args
$svcs = $services | Where-Object {$_.DisplayName -like "SQL Server (*" -and $_.startMode -ne "Disabled"} #To filter services starting from "SQL Server (" and startuptype is not equal to Disabled.
if($svcs) #If values in svcs is received, will enter in loop
{
$running = @();
$svcs | ForEach-Object{ #to get all services
$running += [String]::Format("{0} ({1})", $_.Name, $_.DisplayName);
}
Write-Host "Statistic.RunningServices: $($running.Length)";
Write-Host "Message.RunningServices: $([String]::Join(", ", $running))";
exit 0
}
else # if no service is starting with "SQL Server ("
{
Write-Host "Statistic.Error: $($Error1.Length)";
Write-Host "Message.Error: $([String]::Join(", ", "No Service Found"))";
exit 5
}
}
catch [exception] #If not able to connect to the server
{
Write-Host "Statistic.Error: $($Error1.Length)";
Write-Host "Message.Error: $([String]::Join(", ", "$_"))";
exit 5
}
#3 Below Code is for getting all services that are in the Cluster:
1. Monitoring of clusters is a tricky task, as you have to monitor the cluster directly and not the cluster nodes:
This is because, at the time of failover the cluster's services wont change its state, whereas the cluster nodes will do change its state which will give us a false alert.
The code written in #1 can be used to monitor the cluster.
NOTE:
- You are in need of Admin access on each of the DB server.
- Required Ports needs to be opened for WMI query.
- Powershell must be installed on the DB servers.
- Groups can be used for automating the complete process of on-boarding the service Monitoring, wherein just make a group in custom property as DBname in which various values can be added like:
- SQL
- Oracle
- DB2
- Cluster-SQL - to provide only clusters
- Physical-SQL - to provide the cluster nodes which needs to be excempted from service monitoring as cluster-SQL is already getting monitored
- Categorizing the DB servers under these category can easily help you out in on-boarding the nodes in template, n as n when you provide the value to DBname, the service Monitoring will automatically be set.