14 Replies Latest reply: Apr 10, 2012 10:02 AM by cmgurley

Numerous devices interface status as Unknown

dpeterson87

Hello all,

 

I am having an issue where multiple interfaces are reporting as unknown, but I know for a fact that they are up and functioning. Is there some way to stop this from happening? It seems to be random, and I remember a post a while back about a workaround but cannot find it.

 

Any help would be greatly appreciated.

 

Thank you,

 

Dustin

 
  • Re: Numerous devices interface status as Unknown
    mgibson

    I too have had this issue and have opened several support cases; I refer to it as a rolling SNMP blackout. Because my primary poller is monitoring over 8000 elements, support tells me that we cannot continue troubleshooting until we get this number below the 8000 mark. We originally thought we had corrected the issue when I was running v10.2.1 and rolled a couple of .dll files back to their v10.0 versions. That seemed to work until I upgraded to v10.2.2, when it began again. Support now tells me that there is no need to roll the .dll files back to v10.0. I am shocked that this cannot be solved, as I am evidently not the only one having this issue. It seems to affect SNMPv3 devices. If anyone finds a solution to this issue, please post.

  • Re: Numerous devices interface status as Unknown
    cmgurley

    Hey Dustin,

     

    Can you give us a little more info? How many nodes are you polling? If a ton (like most of the responders), then check their advice and up your pollers (or reduce your nodes/ints). However, if you don't have all that many and you're still seeing this behavior, it would be helpful to know the types of devices. For example, Windows Server 2003's SNMP service has a habit of not retrying after failures and leaving its SNMP objects in that unknown state. We've also seen the behavior on our NET voice gateway. Restarting the SNMP services on those seems to fix it (temporarily).

     

    If your issue is resolved, we'd love to hear what the fix was (or have you mark one of the other guys as the correct answer). Thanks!

     

    Chris Gurley, MCSE, CCNA, MBA

    bcTechNet | Awesomeness lives here...

    VMan 4.2.1, Orion Core 2011.2.1, NCM 7.0, NPM 10.2.2, UDT 2.0.0, IVIM 1.2.0

    • Re: Numerous devices interface status as Unknown
      dpeterson87

      Hi Chris,

       

      We only have 1186 elements we are monitoring. All the devices we are monitoring are Cisco devices. There are switches and firewalls that are reporting interfaces as unknown, but I know they are up and they were up previously. There doesn't seem to be a pattern as to why this happens; it just does.
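One way to confirm what a device itself reports, independently of the monitoring server, is to read the standard IF-MIB `ifOperStatus` object directly. The following is a minimal sketch, assuming the Net-SNMP `snmpget` command-line tool is installed and the device answers SNMPv2c; the host, community string, and helper names are hypothetical placeholders and are not part of Orion.

```python
import subprocess

# IF-MIB::ifOperStatus (RFC 2863); the interface's ifIndex is appended to this OID.
IF_OPER_STATUS_OID = "1.3.6.1.2.1.2.2.1.8"

# Enumeration values defined for ifOperStatus in IF-MIB.
IF_OPER_STATUS = {
    1: "up",
    2: "down",
    3: "testing",
    4: "unknown",
    5: "dormant",
    6: "notPresent",
    7: "lowerLayerDown",
}


def build_snmpget(host: str, community: str, if_index: int) -> list:
    """Build a Net-SNMP snmpget command that prints only the numeric value."""
    return [
        "snmpget", "-v2c", "-c", community,
        "-Oqve",  # q: quick print, v: value only, e: numeric enums
        host, f"{IF_OPER_STATUS_OID}.{if_index}",
    ]


def decode_status(raw: str) -> str:
    """Translate snmpget's numeric output into the IF-MIB status name."""
    try:
        return IF_OPER_STATUS.get(int(raw.strip()), "unrecognized")
    except ValueError:
        # Non-numeric output, for example an SNMP error message.
        return "unrecognized"


def device_reported_status(host: str, community: str, if_index: int,
                           timeout: float = 5.0) -> str:
    """Ask the device directly (requires snmpget on PATH and a reachable host)."""
    result = subprocess.run(
        build_snmpget(host, community, if_index),
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return decode_status(result.stdout)
```

If the device answers `up` here while the monitoring server shows unknown, the problem is more likely on the polling side than on the device.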

       

      Because we only have 1186 elements, we only have one poller, and I believe that should be enough. The only way I have been able to fix this is to go to the device with unknown interfaces, change the SNMP version to SNMPv1, and click Submit. Then I go back into the interface and switch it back to SNMPv2, which we are using. Then I have to restart all the SolarWinds NPM services; sometimes the interfaces come back as up and sometimes they stay as unknown.

       

      Any help with fixing this would be appreciated.

       

      Thank you,

       

      -Dustin

      • Re: Numerous devices interface status as Unknown
        cmgurley

        Dustin,

         

        Sounds like your polling capacity is probably fine, but just to make sure, you should check it out:

        • In Orion, click "Settings" in the top right
        • Then, in the "Details" section on the right, click "Polling Engines"
        • The second to the bottom line shows "Polling Rate"

        Mine's at 17%, but that probably varies with the underlying hardware. Go ahead and post that back when you reply.

         

        What version of IOS or CatOS are you running on the switches that are exhibiting the fickle polling behavior? You're probably up to date, but I was searching online and saw that, back in 2006, 11.2, for example, used to have SNMPv2 support until Cisco pulled it due to inconsistent results (link). I know it's a long shot, but I have to ask.

         

        Also, what version of NPM are you running?

         

        Thanks.

        --Chris

        • Re: Numerous devices interface status as Unknown
          dpeterson87

          Chris,

           

          I am looking at SolarWinds, and the second to the bottom line shows "SNMP statistics Polls per second", and that number changes constantly.

          Was that the number you're looking for? If so, it jumps between 1 and 10.

           

          Also we are using version 12 and above for IOS on all our switches.

           

          We are running Version 10.1.3 of NPM.

           

          Thanks again for your help,

           

          -Dustin

          • Re: Numerous devices interface status as Unknown
            dpatzold1979

            I had to upgrade to 10.2.2 from 10.1.3 to fix a polling collection issue. You may want to look into upgrading. In the past I also had interfaces/nodes that would show up as unknown. I had to open a support case and upload the diagnostics, and it ended up being DHCP computers, which we were monitoring via ICMP only, that were getting the system stuck. So you may need to open a support case or remove any devices that may stop things.

          • Re: Numerous devices interface status as Unknown
            cmgurley

            Dustin,

             

            Perhaps it's a new feature with 10.2.2 (see the second-to-bottom row in the image below)...

            polling_engines.png

            See the next post (dpatzold1979). Looks like 10.2.2 might be your answer.

             

            ~Chris

          • Re: Numerous devices interface status as Unknown
            zizi

            The Polling Rate engine metric was introduced in Orion NPM 10.2; there was no such metric in NPM 10.1.3. Orion 10.1.3 counted the SNMP queries used for collecting statistics (interface counters, CPU and memory) and exposed this number as "SNMP statistics Polls per second". 10.2 uses completely different metrics for reporting polling performance. You can take a look at this blog post for more details.

             

            Let me provide you a quick summary about unknown interfaces. There are a few possible "sources" of unknown interfaces:

            • Interface remap issue
            • Broken SNMP communication to devices
            • Frequent SNMP timeouts

             

            A restart of a device can assign different indexes to its interfaces, but Orion is designed to handle this. The problem occurs when both the index and the name (description) change. When Orion manages an interface that is no longer present on the device, it stays in the unknown state, as Orion is unable to remap it. In Orion 10.1.3, polling and remapping are mutually exclusive operations, so no polling is performed during a remap and the interface remains in the unknown state.
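The remap behavior described above can be illustrated with a simplified sketch. This is not Orion's actual algorithm; the `Interface` type and `remap` function are hypothetical, and the matching order (name first, then index) is an assumption made only to show why an interface becomes unmatchable when both identifiers change.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Interface:
    index: int  # ifIndex, may be reassigned after a device restart
    name: str   # interface name / description


def remap(managed: Interface, discovered: Sequence[Interface]) -> Optional[Interface]:
    """Find the device interface that corresponds to a managed interface.

    Returns None when neither the name nor the index can be found, which
    in this sketch corresponds to the interface staying in the unknown state.
    """
    by_name = {i.name: i for i in discovered}
    by_index = {i.index: i for i in discovered}

    # Index changed but name is stable: follow the name to the new index.
    if managed.name in by_name:
        return by_name[managed.name]

    # Name changed but index is stable: fall back to the index.
    # (Assumption: a real implementation would need extra checks here,
    # since a restart can hand the old index to a different interface.)
    if managed.index in by_index:
        return by_index[managed.index]

    # Both changed, or the interface no longer exists on the device.
    return None
```

For example, with a device that now reports `Interface(5, "Gi0/1")`, a managed `Interface(1, "Gi0/1")` is remapped to index 5, while a managed `Interface(9, "Gi0/9")` has nothing to match and returns `None`.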

             

            Sometimes a restart of the Orion services helps fix the issue. Orion 10.1.3 uses a single source UDP port for all SNMP requests (10.2 uses a set of source ports). It appears that rebooting a network device that sits between Orion and the polled devices (firewall, router) may break SNMP communication, because all requests from the UDP port used by Orion start being discarded. Once the Orion services are restarted, they start using a different port and the issue disappears.
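The difference between one source port and a set of source ports can be sketched with plain UDP sockets. This is only an illustration of the idea, not Orion's implementation; the function names are hypothetical. Binding to port 0 asks the operating system for a free ephemeral port, so every socket held open at the same time gets its own source port, and recreating a socket (as a service restart does) generally yields a different one.

```python
import socket
from typing import List


def open_snmp_sockets(count: int) -> List[socket.socket]:
    """Open `count` UDP sockets, each on its own OS-assigned source port."""
    socks = []
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("0.0.0.0", 0))  # port 0: let the OS choose a free port
        socks.append(s)
    return socks


def source_ports(socks: List[socket.socket]) -> List[int]:
    """Return the local (source) port of each socket."""
    return [s.getsockname()[1] for s in socks]


def close_all(socks: List[socket.socket]) -> None:
    """Release the sockets and their ports."""
    for s in socks:
        s.close()
```

With several source ports in use, an intermediate device that starts discarding traffic from one port would affect only the requests sent from that port rather than all polling.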

             

            If there are frequent SNMP timeouts on the network, Orion might not be able to complete polls and will mark interfaces as unknown. In 10.2, timeouts could also cause interface remapping issues, as Orion could be using incomplete results for remapping (this issue is addressed in the upcoming 10.3 release). In NPM 10.2.1 there was a problem with SNMPv3 that caused deadlocks or unnecessary timeouts in the SNMP library (that's why some of you were asked to replace SolarWinds.Net.SNMP.dll with the version from 10.2). This issue was fixed in 10.2.2.
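How repeated timeouts turn into an unknown status can be shown with a small sketch. This is a generic retry pattern under stated assumptions, not Orion's polling code: `poll_status` and `make_flaky_query` are hypothetical names, and the retry count is arbitrary.

```python
from typing import Callable


def poll_status(query: Callable[[], str], retries: int = 2) -> str:
    """Run one poll with up to `retries` extra attempts.

    `query` returns the interface status or raises TimeoutError.
    If every attempt times out, the poll cannot complete and the
    interface is reported as "unknown".
    """
    for _ in range(retries + 1):
        try:
            return query()
        except TimeoutError:
            continue  # try again until the attempts are used up
    return "unknown"


def make_flaky_query(failures: int, status: str = "up") -> Callable[[], str]:
    """Build a fake query that times out `failures` times, then answers."""
    remaining = {"count": failures}

    def query() -> str:
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise TimeoutError("simulated SNMP timeout")
        return status

    return query
```

A query that times out twice still succeeds with `retries=2`, while one that times out three times exhausts all attempts and yields `"unknown"` even though the interface itself is up.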