
Monitor ping reachability in NetPath

Hello Thwack users,

I am facing an issue with one of my site nodes where the devices show frequent packet drops. To suppress the bulk alarms I implemented parent-child dependencies between the switch and the other devices, but I am still getting frequent drops, which sometimes creates false alarm noise.

I want to know if anyone has implemented ping in NetPath to capture the issue between source and destination. Simple question: can we monitor a ping trace with the SolarWinds NPM NetPath feature?

  • Hi there, 

     has done a good job explaining the way that NetPath works and why it's not the right answer here. 

    Looking at your requirements from a higher level, it seems that you want a way to stop the devices that have regular packet drops from creating too many alerts. We can do this in two ways: through thresholds and through alert configurations.

    Thresholds

    A node only goes from 'Up' (Green) to 'Down' (Dark Red) when packet loss reaches 100%, but between 0% and 100% there can still be points at which packet loss becomes service-affecting. We can use the 'Warning' (Yellow) and 'Critical' (Light Red) thresholds to control when that is. 

    Go to 'Edit Node' for the device in question and scroll down to the Thresholds section, tick 'Override Orion Global Thresholds', and make changes to the thresholds for Packet Loss. Of note, the drop-down near the end lets you choose whether the issue has to be consistent, rather than a single value (a blip).

    So, with those settings, in order for my device to go into a warning state, SolarWinds will poll the device five times and four of those polls have to show more than 20% packet loss (a quick SWQL check for nodes currently above that sort of level is sketched at the end of this reply).

    Alert Configurations

    Similarly, we can set a sort of delay on the alerting side. In the example below, the alert fires when the node reaches the 'Warning' or 'Critical' threshold.

    Underneath that, however, we can set a delay so that the alert only fires if the issue (in this case, packet loss) persists for more than a certain period of time.

    By combining these two settings, we can avoid a lot of alert spam. 

    Kind regards,

    Marlie Fancourt | SolarWinds Pre-Sales Manager

    Prosperon Networks | SolarWinds Partner since 2006

    If this helps answer your question please mark my answer as confirmed to help other users, thank you!
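
    As a follow-up to the thresholds section above: if you have the OrionSDK SwisPowerShell module installed, a quick SWQL query can show which nodes are currently reporting packet loss above a given level. This is only a minimal sketch; the server name, credentials and the 20% figure are placeholders for your own environment, and PercentLoss is simply the latest polled value from Orion.Nodes.

    Import-Module SwisPowerShell

    # Connect to the Orion server (placeholder details - swap in your own, or use -Trusted)
    $swis = Connect-Swis -Hostname "your-orion-server" -UserName "admin" -Password "password"

    # Nodes whose most recent poll shows more than 20% packet loss
    $query = "SELECT Caption, IPAddress, PercentLoss, Status " +
             "FROM Orion.Nodes WHERE PercentLoss > 20 ORDER BY PercentLoss DESC"

    Get-SwisData $swis $query | Format-Table -AutoSize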

  • Hi

    Thanks for the detailed answer to my question, but alerting is actually not a big problem for me. We are looking for a solution that explains the packet drops. During the drops I can see the packet loss on the agent ports, but that doesn't prove the drops are caused by issues identified in the agent's NetPath. It would be good if we could trace the ping response path as well; it would give the team a view of where exactly the ping failed.

  • Hi there, 

    If the packet loss is only on the agent port, then it would definitely be worth checking that you have the 'Status and Response Time' pollers set to ICMP rather than Agent. You can access this by going to the node and opening 'List Resources'.

    This is the most reliable way to monitor status and response time and won't have as many problems with random drops. 
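
    If you want to check that setting across a lot of nodes at once rather than opening 'List Resources' on each one, something like the SWQL sketch below can list which status and response-time pollers are assigned. Treat it as a rough sketch: it assumes the SwisPowerShell module is installed and that your version exposes the Orion.Pollers entity with PollerType names such as 'N.Status.ICMP.Native' (worth confirming in SWQL Studio first), and the connection details are placeholders.

    Import-Module SwisPowerShell

    $swis = Connect-Swis -Hostname "your-orion-server" -UserName "admin" -Password "password"

    # Status / response-time pollers per node; 'N.Status.ICMP.Native' vs
    # 'N.Status.SNMP.Native' (or an agent variant) shows how status is actually polled.
    $query = "SELECT n.Caption, p.PollerType, p.Enabled " +
             "FROM Orion.Pollers p INNER JOIN Orion.Nodes n ON n.NodeID = p.NetObjectID " +
             "WHERE p.NetObjectType = 'N' AND (p.PollerType LIKE 'N.Status%' OR p.PollerType LIKE 'N.ResponseTime%') " +
             "ORDER BY n.Caption"

    Get-SwisData $swis $query | Format-Table -AutoSize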

  • This is already enabled for availability status.

    Status & response time are monitored through ICMP only. I just want to check, when the issue happens, what the root cause of the packet drops is.

  • While it's a bit rudimentary, I've used some PowerShell like this in the past to catch certain problems during off-hours. Yes, very ad hoc... but taking it further you could (if you wanted) make this part of a SAM PowerShell monitor that adds some logic around whether the tracert output changes; a rough sketch of a SAM-style component follows after the script.

    ################################
    ###### Send pings and log output
    
    $sleep = 500
    $timeout = 1000
    $ipaddr = "ipaddress"
    
    Start-Transcript -Path ("$env:USERPROFILE" + "\" + "ping-host-" + "$ipaddr-" + $(get-date -uformat "%m-%d-%y") + ".log") -Append
    
    While ($true) {
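    	# findstr "Request Reply" keeps lines containing either word, so both
    	# "Reply from ..." and "Request timed out." results are logged with a timestamp.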
    
    	$pingdata = "$((ping -n 1 -w $timeout $ipaddr | findstr "Request Reply"), $(get-date -uformat "%D %T"))"
    	
    	Write-Host($pingData)
    
    	#if (($pingdata -match "Request timed out.") -or ($pingdata -match 'time=[0-9]{4}ms')) { [console]::beep(800,100) }
    
    	Start-Sleep -Milliseconds $sleep
    
    }
    
    Stop-Transcript
    
    #################################
    ###### Trace route and log output
    
    $sleep = 10
    $timeout = 1000
    $ipaddr = "ipaddress"
    
    Start-Transcript -Path ("$env:USERPROFILE" + "\" + "tracert-host-" + "$ipaddr-" + $(get-date -uformat "%m-%d-%y") + ".log") -Append
    
    While ($true) {
    
    	$tracertData = "$(get-date -uformat "%D %T")" + ":: Starting tracert to " + $ipaddr
    
    	Write-Host($tracertData)
    
    	tracert -d -4 -w $timeout $ipaddr
    
    	Start-Sleep -Seconds $sleep
    
    }
    
    Stop-Transcript
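
    To build on the SAM idea above, here is a minimal sketch of what a packet-loss check could look like as a SAM PowerShell script component (path-change logic around tracert would follow the same output pattern). It assumes the usual SAM script conventions - 'Statistic.<name>:' / 'Message.<name>:' lines on output and exit codes 0 = Up, 2 = Warning, 3 = Critical - which is worth verifying against the SAM documentation for your version; the target address, ping count and the 20%/50% cut-offs are placeholders.

    # Hypothetical SAM PowerShell component: ping the target a fixed number of times
    # and report the loss percentage as a statistic.
    $ipaddr  = "ipaddress"   # placeholder target
    $count   = 20            # pings per poll
    $timeout = 1000          # per-ping timeout in ms

    $pinger   = New-Object System.Net.NetworkInformation.Ping
    $received = 0

    for ($i = 0; $i -lt $count; $i++) {
    	try {
    		if ($pinger.Send($ipaddr, $timeout).Status -eq 'Success') { $received++ }
    	} catch { }   # resolution/network errors count as lost pings
    }

    $lossPct = [math]::Round((($count - $received) / $count) * 100, 1)

    Write-Host "Statistic.PacketLoss: $lossPct"
    Write-Host "Message.PacketLoss: $received of $count replies received from $ipaddr"

    # Placeholder severity cut-offs - align these with your node thresholds
    if     ($lossPct -ge 50) { exit 3 }   # Critical
    elseif ($lossPct -ge 20) { exit 2 }   # Warning
    else                     { exit 0 }   # Up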