SNMP Failure when device is operational

FormerMember

Background information: Orion 8.5.1 SP3 with APM SP3 on a Win 2003 Server EE machine running SQL 2005 EE. All service packs and patches are up to date.

375 nodes, 1300 interfaces, 300 volumes.

Polling timing is 1 min for ping, 2 min for interface stats, and 15 min for volume stats.
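As a rough back-of-the-envelope check on the load these figures imply, the sketch below estimates poll operations per second. It assumes one poll per element per interval, which is a simplification and not necessarily how Orion schedules its polling; the names `ELEMENTS` and `polls_per_second` are made up for illustration.

```python
# Element counts and polling intervals (seconds) taken from the post.
# Assumption: each element is polled exactly once per interval.
ELEMENTS = {
    "ping (nodes)": (375, 60),
    "interface stats": (1300, 120),
    "volume stats": (300, 900),
}


def polls_per_second(elements):
    """Return a dict of element type -> average polls per second."""
    return {name: count / interval for name, (count, interval) in elements.items()}


rates = polls_per_second(ELEMENTS)
total_rate = sum(rates.values())

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} polls/s")
print(f"total: {total_rate:.2f} polls/s")
```

Under that assumption the total works out to roughly 17 polls per second on average, which gives a baseline to compare against as more servers are added.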

I have done more research on this, and with some sniffer captures we can see that the devices are responding to the SNMP get requests. The responses are properly formatted and look the same while the device is in the SNMP failure state and after the SNMP stop/start.

The problem seems to be that Orion just stops listening to the devices. The device reports UP because the ping is answered, but no information sent back via SNMP is accepted or recorded in the database. Sometimes no devices are affected like this; at other times one or more are. I have seen as many as 15 devices out at one time.

Of the few hundred requests/responses in the capture, very few took more than 10 milliseconds to be answered, and those that did still took less than 30 milliseconds. Each response has the OID corresponding to the request. I compared the sequence of OID requests and responses from when it was not working with the sequence after the stop/start of SNMP, when Orion saw the device again. The gets and responses are identical from each ping request onward.
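For anyone wanting to repeat this kind of check, here is a minimal sketch of pairing SNMP requests with responses and measuring response times. It assumes the capture has already been decoded into simple records; the `SnmpPacket` type, the function names, and the sample values are all hypothetical and are not taken from the actual capture. Pairing relies on the fact that an SNMP response carries the same request-id as the request it answers.

```python
from dataclasses import dataclass


@dataclass
class SnmpPacket:
    timestamp: float  # seconds since start of capture
    kind: str         # "request" or "response"
    request_id: int   # request-id field of the SNMP PDU
    oid: str


def pair_latencies(packets):
    """Match responses to requests by request-id.

    Returns (latencies, unanswered): a dict of request_id -> seconds,
    and a sorted list of request-ids that never got a response.
    """
    pending = {}
    latencies = {}
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        if pkt.kind == "request":
            pending[pkt.request_id] = pkt.timestamp
        elif pkt.kind == "response" and pkt.request_id in pending:
            latencies[pkt.request_id] = pkt.timestamp - pending.pop(pkt.request_id)
    return latencies, sorted(pending)


def slow_responses(latencies, threshold=0.010):
    """Return only the entries slower than the threshold (default 10 ms)."""
    return {rid: lat for rid, lat in latencies.items() if lat > threshold}


# Made-up sample data for illustration only.
packets = [
    SnmpPacket(0.000, "request", 101, "1.3.6.1.2.1.1.3.0"),
    SnmpPacket(0.004, "response", 101, "1.3.6.1.2.1.1.3.0"),
    SnmpPacket(1.000, "request", 102, "1.3.6.1.2.1.2.2.1.10.1"),
    SnmpPacket(1.025, "response", 102, "1.3.6.1.2.1.2.2.1.10.1"),
    SnmpPacket(2.000, "request", 103, "1.3.6.1.2.1.1.3.0"),
]

latencies, unanswered = pair_latencies(packets)
slow = slow_responses(latencies)
print("slow request-ids:", sorted(slow))
print("unanswered request-ids:", unanswered)
```

In the situation described here, the interesting result would be an empty `unanswered` list on the wire even while Orion shows the node in SNMP failure.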

The device SNMP information is accepted again once the SNMP service on it has been stopped and started.

This puts it back in SolarWinds' area to determine what is causing this problem, because from the packet captures we can see that the servers are processing the SNMP requests and responding to them properly.

Multiple users have reported having this issue and it needs to be addressed, as this is not a satisfactory state of operation. I can only see this issue getting worse as we load more servers onto Orion coupled with APM.

There is no specific time or any pattern that we can see and the problem affects multiple servers at different times.


FormerMember

Forgot to put in one thing: I tried restarting the Orion server, not just the Orion service, and it had no effect on the problem. The ones that were not responding were still not responding, and no new ones were affected.

Based on the above, I had initially handed this issue to our sys admins to find a solution, but once we reviewed the captures and saw the responses coming back to the server, I now have to come to you in hopes of a resolution.

tdanner

It's probably best to handle this issue through support. Have you opened a ticket yet?

branfarm

I've experienced the same issue, but mine are always a result of some kind of network interruption between the Orion poller and the device. I haven't been able to figure anything out, other than to restart Orion. I was waiting for v9 to see if it magically gets fixed.

FormerMember

Case #49679

there ya go case opened

FormerMember

OK, the latest on this is that I put a sniffer on both ends of one of these anomalies, and here is what I get:

ping request

server receives ping

server replies to ping

ping reply

SNMP request

server receives SNMP request

server responds to SNMP request

ICMP reply Destination Unreachable (Port Unreachable)

Again, this is all happening in and around ones that are working, and it affects servers that are geographically diverse.
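For reference, ICMP Destination Unreachable with code Port Unreachable is what a host normally sends back when a UDP datagram arrives on a port that nothing is listening on. Assuming the ICMP message in the sequence above comes from the Orion host (the capture summary does not say which side sent it), that would be consistent with the poller no longer having the UDP port open when the SNMP response arrives. The sketch below shows one way to scan a decoded capture for that pattern; the `Event` type, host names, and event labels are hypothetical.

```python
from typing import NamedTuple


class Event(NamedTuple):
    src: str   # sending host
    dst: str   # receiving host
    kind: str  # e.g. "snmp_response", "icmp_port_unreachable"


def rejected_responses(events, poller):
    """Return indices of SNMP responses sent to the poller that are
    immediately followed by an ICMP port-unreachable from the poller
    back to the responding device."""
    hits = []
    for i in range(len(events) - 1):
        cur, nxt = events[i], events[i + 1]
        if (cur.kind == "snmp_response" and cur.dst == poller
                and nxt.kind == "icmp_port_unreachable"
                and nxt.src == poller and nxt.dst == cur.src):
            hits.append(i)
    return hits


# Made-up events mirroring the sequence described in the post.
capture = [
    Event("orion", "server", "icmp_echo"),
    Event("server", "orion", "icmp_echo_reply"),
    Event("orion", "server", "snmp_request"),
    Event("server", "orion", "snmp_response"),
    Event("orion", "server", "icmp_port_unreachable"),
]

rejected = rejected_responses(capture, poller="orion")
print("rejected SNMP responses at indices:", rejected)
```

Comparing the destination port of the rejected response against the source port of the matching request in the real capture would help confirm or rule out this reading.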