
Cisco 3850 Hardware Sensor Polling issues

Hello,

We have recently begun upgrading our Cisco switches from the 3.6.x train to the 16.12.x train and have run into hardware sensor polling issues on these devices. The event log constantly fills up with messages telling me that health monitoring for the power supply or fan is up/down. I run this same code on the Cisco 9300 line and do not have the issue there. I am running NPM 12.3 until our new server is built and migrated over. Has anyone experienced this issue with their 3850s and SolarWinds on the new IOS-XE train?

I have tried the following with no results:

I have increased the SNMP poll timeout value to 5000ms
I have added/deleted the node from SolarWinds
Case is open with SolarWinds, progressing very slowly
Searched the web for known issues or bugs in SolarWinds or Cisco

It seems to be limited to this platform on this IOS-XE train. I would also be curious whether this code version works fine for anyone on a later version of SolarWinds.

  • If it is the second power supply (PS) that shows the issue, it could be related to Cisco bug CSCue44402

    I had the same issue on old 3750s years ago (IOS 12.2 - 12.5 at least), and have also seen a very similar issue (slightly different message and status display) on the 3850s. Cisco says this applies to XE 3.2.0 and 3.2.1, but some who have updated to 3.3.5 still experience the issue.

    If you are seeing PS B showing No Response for the 'Sys Pwr', this should apply. I have seen at least one post where a user was able to reseat the PS and the message/status went away. There were no details, though, on whether it comes back randomly or over time.

  • Unfortunately it's not that issue. It actually stops polling either all environmental sensors or random ones.

    3850s use sysObjectID 1.3.6.1.4.1.9.1.1745

    When we compare an SNMP walk of a switch on 3.6.8 with one on 16.12.3a, we see these missing on 16.12.3a:

    Cisco    cefcFanTrayOperStatus    1.3.6.1.4.1.9.9.117.1.4.1.1.1
    Cisco    cefcFRUPowerAdminStatus    1.3.6.1.4.1.9.9.117.1.1.2.1.1
    Cisco    cefcFanTrayStatusTable    1.3.6.1.4.1.9.9.117.1.4.1
    Cisco    cefcFRUPowerStatusTable       1.3.6.1.4.1.9.9.117.1.1.2
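For anyone who wants to repeat this comparison on their own switches, here is a minimal sketch of diffing two saved walks. It assumes the walks were saved as text in net-snmp's numeric style (for example from `snmpwalk -On`), one `OID = TYPE: value` entry per line; the function names are made up for illustration and are not part of any SolarWinds or Cisco tool.

```python
import re

# Subtrees named in this thread (CISCO-ENTITY-FRU-CONTROL-MIB tables).
SUBTREES = {
    "cefcFRUPowerStatusTable": "1.3.6.1.4.1.9.9.117.1.1.2",
    "cefcFanTrayStatusTable": "1.3.6.1.4.1.9.9.117.1.4.1",
}

# Assumed line shape: ".1.3.6.1... = TYPE: value" (leading dot optional).
_OID_LINE = re.compile(r"^\.?(\d+(?:\.\d+)+)\s*=")


def parse_walk(text):
    """Return the set of numeric OIDs found in saved walk output."""
    oids = set()
    for line in text.splitlines():
        match = _OID_LINE.match(line.strip())
        if match:
            oids.add(match.group(1))
    return oids


def in_subtree(oid, root):
    """True if oid is root itself or sits below it (whole-arc match)."""
    return oid == root or oid.startswith(root + ".")


def missing_by_subtree(old_oids, new_oids, subtrees=SUBTREES):
    """Group OIDs present in the old walk but absent from the new one."""
    missing = old_oids - new_oids
    report = {}
    for name, root in subtrees.items():
        hits = sorted(o for o in missing if in_subtree(o, root))  # string sort
        if hits:
            report[name] = hits
    return report
```

Read each saved walk file into a string, pass both through `parse_walk`, and hand the two sets to `missing_by_subtree` with the 3.6.8 walk as `old_oids`. An empty result would mean the power and fan tables are still answering on the new code and the problem is more likely timeouts than missing objects.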

    When I dig into the logs on the switch there are SNMP delays, and I have increased values and disabled certain features with no luck. When you look up the error, there are all kinds of different SNMP bugs in Cisco code, and the advice is to create a view and block the OID causing it. I created the view and blocked one OID, then had to block another, and kept going like that until I just removed the filter.
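The view workaround mentioned above usually comes down to a handful of `snmp-server view` lines. Below is a rough sketch that only builds those lines as text so they can be reviewed before pasting. The view name, the community string and the helper name are placeholders I made up, and the OID to exclude has to come from your own log message.

```python
def render_snmp_view(view_name, excluded_oids, community=None):
    """Build IOS config lines for a view that includes everything under
    iso and then excludes the given OID subtrees."""
    lines = [f"snmp-server view {view_name} iso included"]
    for oid in excluded_oids:
        lines.append(f"snmp-server view {view_name} {oid} excluded")
    if community is not None:
        # Bind the view to a read-only community (placeholder string).
        lines.append(f"snmp-server community {community} view {view_name} RO")
    return lines
```

Calling `render_snmp_view("NPM-VIEW", [oid_from_log], community="example-ro")` returns the lines in the order they would be entered. Keep in mind that excluding the fan or power supply tables themselves would also hide them from the poller, so this only makes sense for an unrelated OID that is slowing the walk down.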

    I'm just curious whether something changed in the Cisco 16.12.3a code that is causing this issue, and in how SolarWinds reads these sensors. SolarWinds says it's a Cisco issue, Cisco says it's not their issue...

    Is anyone running 16.12.3a on Cisco 3850s in their environment and monitoring them through NPM? If so, what version of NPM are you running?

  • Seeing the same behaviour with IOS-XE 16.12.3a and Cisco 3850. No issues with the C9300 platform. SNMP walks are exceptionally slow. Increasing the SNMP polling timeout to the maximum helped with some nodes. I thought stack size might be a factor, but it seems to affect small and large stacks alike.

    WF