There have been a few occasions where some MPLS interfaces were experiencing high utilisation and the network team asked me to provide more stats during the peak period. The default interval of 9 minutes was not working for them, but we didn't want to drop every interface in question down to 1 minute, as this would have an impact on the database.
I came up with a solution: an alert that triggers when the utilization percentage is over the critical threshold and, as a trigger action, executes a PowerShell script to dynamically change the interface's polling interval. Once the alert is reset, the same script sets the polling interval back to the interface's previous setting. I found this task educational for working with alerting and external scripting, so I thought it would be a nice idea to share it on THWACK.
We begin by importing the alert and setting the appropriate trigger and reset conditions for our environment. I prefer working with objects' thresholds rather than setting static values in the alerting conditions, as this makes for a more reliable and robust Orion platform. To keep it simple, the following conditions apply to Ethernet interfaces only and trigger when either the Tx or the Rx Utilization critical threshold has been reached:
The Alert
Reset conditions are similar, but we have to apply AND boolean logic: we want both Tx and Rx Utilization to calm down, as we have no idea which one triggered the alert in the first place. Give it a 5-minute 'grace' period to get a few more data points in the database:
I like to add the "Log the Alert to the NetPerfMon Event Log" action to all of my alerts, as this is an easy way to filter/report on alerts raised/cleared. The second action is the actual execution of the PowerShell script, which takes the following 3 arguments:
- InterfaceID (need to know which interface we're changing)
- AlertObjectID (required to append notes in the active alert)
- Interval (the polling interval to set in the Interface)
The actual arguments within the action are as per below, with the last argument ('1') setting the polling interval to every 1 minute. Of course, you can set it to whatever value you want!
C:\Windows\SysWOW64\WindowsPowerShell\v1.0\powershell.exe -Command "C:\PSScripts\ChangeInterfacePollingInterval\ChangeInterfacePollingInterval.ps1 '${N=SwisEntity;M=InterfaceID}' '${N=Alerting;M=AlertObjectID}' '1'"
The reset actions are quite similar but pass different parameters to the PowerShell script: this time we're setting the polling interval to '0'. Of course, a 0-minute interval is not possible, but the script interprets it as a signal to set the polling interval back to its previous setting. This way we can use a single script for both actions, which makes it a little bit easier to maintain:
C:\Windows\SysWOW64\WindowsPowerShell\v1.0\powershell.exe -Command "C:\PSScripts\ChangeInterfacePollingInterval\ChangeInterfacePollingInterval.ps1 '${N=SwisEntity;M=InterfaceID}' '${N=Alerting;M=AlertObjectID}' '0'"
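The script's own parameter handling is not shown in this post, so the following is only a sketch of how the three positional arguments might be received and classified. The function name Get-PollingRequest and the 0 to 1440 validation range are assumptions for illustration, not part of the published script.

```powershell
# Hypothetical helper: receives the three positional arguments from the alert action
# and works out whether this is the trigger (Interval > 0) or the reset (Interval = 0).
function Get-PollingRequest {
    param(
        [Parameter(Mandatory, Position = 0)][int]$InterfaceID,
        [Parameter(Mandatory, Position = 1)][int]$AlertObjectID,
        # Assumed sanity range: 0 means "restore", anything up to one day is accepted.
        [Parameter(Mandatory, Position = 2)][ValidateRange(0, 1440)][int]$Interval
    )
    [pscustomobject]@{
        InterfaceID   = $InterfaceID
        AlertObjectID = $AlertObjectID
        Interval      = $Interval
        Action        = if ($Interval -eq 0) { 'Reset' } else { 'Trigger' }
    }
}

# The alert action passes strings; PowerShell converts them to [int] on binding.
Get-PollingRequest '1234' '56' '0'
```

Typing the parameters as [int] means a malformed macro expansion fails at binding time, before any change is made in Orion.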
Starting with Orion Platform 2023.2, all new actions to execute an external program/VB script must be approved by an Orion Administrator as described here: https://documentation.solarwinds.com/en/success_center/orionplatform/content/core-approve-execute-alert-action.htm?cshid=orioncoreapprovealertaction
The Script
There are a few steps required in this script; below is a breakdown of each step and how it works. Please note that the use of unencrypted credentials within the script is highly discouraged; however, it kept things simple during testing. There are a few methods out there to safely store credentials, and some are discussed in this post: Get-SwisData powershell script password safety
For the script to run, SwisPowerShell must already be installed, and we need an Orion Platform user to log in with that has Node management rights and the ability to Allow Account to Clear/Acknowledge messages.
When the trigger action is executed, the script will reduce the polling interval of the interface down to 1 minute by default, and when the alert is reset it will set it back to its previous value. But wait, how does the script know what the previous value was? This is where we make use of the 'notes' section of the active alert, by temporarily storing the previous value within the note! Another way would be to create a custom property dedicated to this purpose, but it felt unnecessary to add one more custom property (CP), as best practice is to have just the right number of CPs, no more, no less.
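One commonly used option is to keep a PSCredential in a file written by Export-Clixml, which on Windows protects the password with DPAPI so that only the same user on the same machine can read it back. A minimal sketch follows; the helper name Get-StoredCredential and the file path are assumptions, and the file has to be created by whichever account actually runs the alert action.

```powershell
# One-time setup, run interactively as the account that executes the alert action:
#   Get-Credential | Export-Clixml -Path 'C:\PSScripts\orion-cred.xml'   # hypothetical path

# Hypothetical helper: loads the stored PSCredential or fails loudly.
function Get-StoredCredential {
    param([Parameter(Mandatory)][string]$Path)
    if (-not (Test-Path -LiteralPath $Path)) {
        throw "Credential file not found: $Path"
    }
    Import-Clixml -LiteralPath $Path
}

# In the script (assumes SwisPowerShell is installed and $orionhost is defined):
#   $cred = Get-StoredCredential -Path 'C:\PSScripts\orion-cred.xml'
#   $swis = Connect-Swis -Hostname $orionhost -Credential $cred
```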
Step 1: Connect to Orion and Validate connection
This is quite standard and straightforward: import the SwisPowerShell module, initialize $swis with Connect-Swis and finally run a simple SELECT TOP 1 against Orion.Nodes to ensure we get some data back. If not, terminate the script:
Import-Module SwisPowerShell
$swis = connect-swis -host $orionhost -username $orionusername -password $orionpassword
$orioncheck = Get-SwisData -SwisConnection $swis -Query 'SELECT TOP 1 NodeID FROM Orion.Nodes'
if(!$orioncheck){
    Write-Output "$(Get-LogOutput) Failed to connect to Orion host $orionhost" | Out-file $($LogFile) -append
    exit -2;
}else{
    Write-Output "$(Get-LogOutput) Connected to Orion host $orionhost" | Out-file $($LogFile) -append
}
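The excerpts in this post call Get-LogOutput and write to $LogFile, but neither definition appears here. A minimal sketch of what such a helper might look like, assuming it simply returns a timestamp prefix; the timestamp format and the log location below are assumptions, not taken from the published script.

```powershell
# Assumed helper: returns a timestamp prefix for each log line.
function Get-LogOutput {
    "[{0}]" -f (Get-Date -Format 'yyyy-MM-dd HH:mm:ss')
}

# Hypothetical log location; adjust to your environment.
$LogFile = Join-Path ([System.IO.Path]::GetTempPath()) 'ChangeInterfacePollingInterval.log'

# Same logging pattern as used throughout the script excerpts.
Write-Output "$(Get-LogOutput) Script started" | Out-File $LogFile -Append
```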
Step 2: Get the interface Uri and current polling interval
Grab the interface's Uri and its current polling interval. The Uri is required to invoke Set-SwisObject later, while the polling interval will be stored in the alert's notes.
$Uri=Get-SwisData -SwisConnection $swis -Query 'SELECT Uri FROM Orion.NPM.Interfaces WHERE InterfaceID=@InterfaceID' @{InterfaceID=$($InterfaceID)}
$CurrentInterval=Get-SwisData -SwisConnection $swis -Query 'SELECT StatCollection FROM Orion.NPM.Interfaces WHERE InterfaceID=@InterfaceID' @{InterfaceID=$($InterfaceID)}
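Both values can also be fetched in a single query, which additionally makes it easy to stop early when the interface does not exist. This is a sketch under the same assumptions as the script (an open SWIS connection); the function name Get-InterfacePollingInfo is hypothetical.

```powershell
# Hypothetical helper: returns one object with Uri and StatCollection properties.
function Get-InterfacePollingInfo {
    param(
        [Parameter(Mandatory)]$SwisConnection,
        [Parameter(Mandatory)][int]$InterfaceID
    )
    $query = 'SELECT Uri, StatCollection FROM Orion.NPM.Interfaces WHERE InterfaceID=@InterfaceID'
    $row = Get-SwisData -SwisConnection $SwisConnection -Query $query -Parameters @{ InterfaceID = $InterfaceID }
    if (-not $row) {
        # Nothing came back: the macro may have expanded to an unknown or stale ID.
        throw "InterfaceID $InterfaceID not found"
    }
    $row
}

# Usage (requires a live connection):
#   $info = Get-InterfacePollingInfo -SwisConnection $swis -InterfaceID $InterfaceID
#   $Uri = $info.Uri; $CurrentInterval = $info.StatCollection
```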
Step 3: Determine the polling interval to set
Using the arguments passed to the script, we now determine the value of the $IntervalSet variable. If the $Interval argument is 0, this is the reset action and we therefore need to look up the previous value in the alert notes. If the notes have been altered for any reason and no value is found, the script falls back to the global default defined in Polling Settings (Settings -> All Settings -> Polling Settings -> Default Interface Statistics Poll Interval).
If the $Interval argument is anything but 0, then it's the trigger action and $IntervalSet gets this value:
if($Interval -eq 0) {
    # Reset action: read the previous interval back from the alert note.
    # The ID is passed as a SWQL parameter because variables do not expand inside single quotes.
    $SwisResults=Get-SwisData -SwisConnection $swis -Query 'SELECT AlertNote FROM Orion.AlertObjects WHERE AlertObjectID=@AlertObjectID' @{AlertObjectID=$($AlertObjectID)}
    # Matches a number that is preceded by a space and followed by " t" (as in "from 9 to every")
    $regexpattern = '(?<= )([0-9]+)(?= t)'
    $PreviousInterval=[regex]::Matches($SwisResults, $regexpattern).Value
    if(!$PreviousInterval){
        # Something went wrong (maybe the note was changed?), set the polling interval to the global default
        Write-Output "$(Get-LogOutput) PreviousInterval not found, setting polling to global default" | Out-file $($LogFile) -append
        $IntervalSet=Get-SwisData -SwisConnection $swis -Query 'SELECT CurrentValue FROM Orion.Settings WHERE SettingID=''SWNetPerfMon-Settings-Default Interface Stat Poll Interval'''
    }else{
        $IntervalSet=$PreviousInterval
    }
}else{
    # Interval not set to 0, this is a trigger action
    $IntervalSet=$Interval
}
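The decision logic can be isolated into a small function that needs no Orion connection, which makes it easy to test against sample notes. The sketch below is a variation rather than the script's own code: the name Resolve-PollingInterval is hypothetical, and it swaps in a stricter pattern anchored on the "polling from N to every" wording of the trigger note, taking the most recent match if the note contains several.

```powershell
# Hypothetical helper: decides which interval to apply.
function Resolve-PollingInterval {
    param(
        [Parameter(Mandatory)][int]$Interval,        # 0 = reset, anything else = trigger
        [string]$AlertNote = '',                     # note text read from the active alert
        [Parameter(Mandatory)][int]$GlobalDefault    # fallback from Polling Settings
    )
    if ($Interval -ne 0) { return $Interval }        # trigger action: use the value as given

    # Reset action: look for the value stored by the trigger note.
    $found = [regex]::Matches($AlertNote, 'polling from (\d+) to every')
    if ($found.Count -gt 0) {
        return [int]$found[$found.Count - 1].Groups[1].Value
    }
    return $GlobalDefault                            # note missing or edited
}

$note = 'Alert trigger: Changed the Interface polling from 5 to every 1 minutes on 01/01/2024 10:00:00'
Resolve-PollingInterval -Interval 0 -AlertNote $note -GlobalDefault 9
```

Anchoring on the surrounding words avoids picking up unrelated numbers if someone appends their own comment to the alert note.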
Step 4: Set the new Interval
Using the Uri retrieved in step 2, we call the Set-SwisObject cmdlet to set the new interval:
Write-Output "$(Get-LogOutput) Setting InterfaceID: $($InterfaceID) to poll every $($IntervalSet) minutes" | Out-file $($LogFile) -append
$result=Set-SwisObject $swis -Uri $Uri -Properties @{ StatCollection = $($IntervalSet) }
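If the update should not fail silently, the call can be wrapped so that a bad value or a SWIS error is caught and reported. A sketch only: the wrapper name Set-InterfacePollingInterval and the 1 to 1440 minute range are assumptions, not limits taken from Orion.

```powershell
# Hypothetical wrapper around Set-SwisObject; returns $true on success, $false on failure.
function Set-InterfacePollingInterval {
    param(
        [Parameter(Mandatory)]$SwisConnection,
        [Parameter(Mandatory)][string]$Uri,
        # Assumed sanity range; rejects 0 or negative values before anything is written.
        [Parameter(Mandatory)][ValidateRange(1, 1440)][int]$Minutes
    )
    try {
        Set-SwisObject -SwisConnection $SwisConnection -Uri $Uri -Properties @{ StatCollection = $Minutes } -ErrorAction Stop | Out-Null
        return $true
    } catch {
        Write-Warning "Failed to update $Uri : $_"
        return $false
    }
}

# Usage (requires a live connection):
#   if (-not (Set-InterfacePollingInterval -SwisConnection $swis -Uri $Uri -Minutes $IntervalSet)) { exit -3 }
```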
Step 5: Active Alert modification (Ack, Alert Notes)
The last step is to edit the alert notes and acknowledge the alert (this is useful for dashboards and widgets that output only the un-acked alerts).
Once again we make use of the $Interval variable: if it is set to 0, the script only appends the note, but if it is set to anything else, it acknowledges the alert and appends a note:
if($Interval -eq 0) {
    Write-Output "$(Get-LogOutput) Set Alert note with new interval of $($IntervalSet) minutes" | Out-file $($LogFile) -append
    Invoke-SwisVerb $swis Orion.AlertActive AppendNote @(@($($AlertObjectID)), "Alert reset: Changed the Interface polling back to every $($IntervalSet) minutes on $(Get-Date) ")
}else{
    Write-Output "$(Get-LogOutput) Acknowledging Alert with AlertObjectID: $($AlertObjectID)" | Out-file $($LogFile) -append
    Invoke-SwisVerb $swis Orion.AlertActive Acknowledge @(,($AlertObjectID), "" )
    Invoke-SwisVerb $swis Orion.AlertActive AppendNote @(@($($AlertObjectID)), "Alert trigger: Changed the Interface polling from $($CurrentInterval) to every $($IntervalSet) minutes on $(Get-Date)")
}
You can download both the alert and the script in the Content Exchange section of THWACK below:
Alert: Automatically Change Interface Polling Interval (Automation)
Script: Script to change an Interface's polling interval (Used with alerting)
I hope you find this post useful. Let me know in the comments below if you have any questions or feedback to improve it.