21 Replies Latest reply on Nov 16, 2009 2:53 PM by bshopp

    unknown interface status

    mikemaher

      I'm having a hell of a time with this. I have multiple nodes whose network interface will, for no apparent reason, go into an unknown state. Sometimes a reboot fixes this, but not always. I am having this issue with about 5-6 nodes on a regular basis (some stay the same, others are random). This includes my company's cluster, regular servers, and VM server boxes. Any ideas on this?

        • Re: unknown interface status
          bleearg13

          Which version are you running?  I recall that I had a similar issue in 8.5, I believe, but it was fixed in a service pack.

            • Re: unknown interface status
              mikemaher

              I am running 9.1 SP5.

                • Re: unknown interface status
                  bleearg13

                  Hmm...I doubt it's the same problem, then.  I assume the devices are all reachable during the time the interfaces become "unknown"?  Does SNMP respond during this time?

                    • Re: unknown interface status
                      mikemaher

                      Yes, they are pingable by their IP, but if I try to list the resources it times out.

                        • Re: unknown interface status
                          bleearg13

                          Are the nodes that are having this problem all behind a certain device?  When they go into "unknown" status, does it happen to all devices at the same time?  Would it be possible to use a free SNMP walk tool and install it on one of the servers to see if you can walk the SNMP tree locally?

                          If it's not an upstream device that's stopping the SNMP communication, then it would likely be the devices themselves.  However, if they are all stopping at the same time, that would lead me to believe that the problem is with something like a firewall stopping the SNMP communication, perhaps some type of application or protocol throttling being done (IDS/IPS, perhaps?).

                            • Re: unknown interface status
                              mikemaher

                              All of the servers are in the same location (except for a few, but I never have problems with those). I don't think the server guys would appreciate me installing something, since the nodes with the issue are our Exchange servers and they are really careful with those. However, my polling server is in the same location. I could run MIB Walk (we have the SW Engineer's Toolset). One other thing: a lot of our interfaces are HP teams. Could this be causing an issue as well, since they're virtual?

                                • Re: unknown interface status
                                  bleearg13

                                  So are all the interfaces that go unknown this type?  I can say that I've occasionally had some strange problems getting virtual interfaces to show up on some boxes.  I have a couple of virtual servers that I can do a "List Resources" on and see the interfaces, but when polling them, they never come out of "Unknown" status.

                                  I'd try using MIB walk or the SNMP MIB Browser to poll the IF-MIB table while the problem is happening.  It doesn't sound like a problem with Orion, given the description, but you never know.
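A minimal sketch of the IF-MIB check suggested above, for anyone who wants to script it from the polling server. It assumes the net-snmp `snmpwalk` command-line tool is installed (the thread itself uses the SolarWinds tools, not net-snmp), SNMP v2c, and a placeholder community string of `public`. The function names are made up for illustration. It walks the `ifOperStatus` column (OID 1.3.6.1.2.1.2.2.1.8) and reports the agent's own view of each interface, which helps separate "the agent says unknown" from "the agent is not answering at all".

```python
import re
import subprocess

# ifOperStatus column of the IF-MIB ifTable (RFC 2863).
IF_OPER_STATUS_OID = "1.3.6.1.2.1.2.2.1.8"

# ifOperStatus enumeration as defined in RFC 2863.
OPER_STATUS = {1: "up", 2: "down", 3: "testing", 4: "unknown",
               5: "dormant", 6: "notPresent", 7: "lowerLayerDown"}

# Accepts both "INTEGER: up(1)" and "INTEGER: 1" style values.
LINE_RE = re.compile(
    r"\.?1\.3\.6\.1\.2\.1\.2\.2\.1\.8\.(\d+) = INTEGER: (?:\w+\()?(\d+)\)?")


def build_walk_command(host, community="public", timeout_s=5, retries=1):
    """Return the snmpwalk argument list (community is a placeholder)."""
    return ["snmpwalk", "-v", "2c", "-c", community,
            "-t", str(timeout_s), "-r", str(retries),
            "-On", host, IF_OPER_STATUS_OID]


def parse_oper_status(output):
    """Map ifIndex -> status name for each ifOperStatus line in the output."""
    statuses = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            if_index, value = int(match.group(1)), int(match.group(2))
            statuses[if_index] = OPER_STATUS.get(value, "other")
    return statuses


def walk_oper_status(host, community="public"):
    """Run snmpwalk; return None when the agent does not answer."""
    try:
        result = subprocess.run(build_walk_command(host, community),
                                capture_output=True, text=True, timeout=60)
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:  # snmpwalk exits non-zero on a timeout
        return None
    return parse_oper_status(result.stdout)
```

Calling `walk_oper_status("10.0.0.1")` (a placeholder address) while the interfaces show "Unknown" in Orion would return `None` if the agent is timing out, or a per-interface status map if it is still answering.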

                                  • Re: unknown interface status
                                    cskowronek

                                    I am experiencing this too.  We are also running NPM 9.1 SP5.  Currently we are seeing this with VM servers as well as Cisco CallManager servers.

                                      • Re: unknown interface status
                                        mikemaher

                                        I was seeing this as well with our VMware servers, but some of those were DNS related and seem to be fine now. So for me it's just physical servers with the virtual teamed NICs.

                                          • Re: unknown interface status
                                            bleearg13

                                            In that case, if you can pull up the Interfaces table with the SNMP MIB Viewer tool while the problem is happening, but you cannot poll or list resources with Orion, then I'd say it's an Orion issue and you'll probably want to open a ticket.  Otherwise, I'd say that it's an issue with the servers.  Good luck and please post any results.
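The triage rule in the post above can be written down as a tiny decision function. This is only a sketch of the reasoning as stated in the thread; the function name, its parameters, and the wording of the results are invented for illustration.

```python
def triage(snmp_tool_answers, orion_polls_ok):
    """Map the two observations from the thread to a likely fault location.

    snmp_tool_answers: an independent SNMP tool can read the Interfaces table.
    orion_polls_ok:    Orion can poll or list resources on the node.
    """
    if orion_polls_ok:
        return "no fault observed"
    if snmp_tool_answers:
        # SNMP works from another tool but not from Orion.
        return "likely Orion: open a support ticket"
    # Nothing can read the table, so look at the monitored server.
    return "likely the server's SNMP agent"
```

The check is only meaningful if both observations are taken while the problem is actually happening.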

                                              • Re: unknown interface status
                                                mikemaher

                                                Well, I did a quick test and SNMP seems to be timing out, so it may be the servers. Would a reinstallation of SNMP possibly fix this? Also, the WMI app monitors for them work fine, so IDK.

                                                  • Re: unknown interface status
                                                    bleearg13

                                                    Nothing in the event logs of the server at all?  I've personally never had to re-install SNMP on a Windows box, but I have had to restart it.  Can you try just restarting the SNMP service on the server and seeing if that fixes it?

                                                      • Re: unknown interface status

                                                        In NPM 9.1 (SP4) I am having this problem on some clustered SQL Servers, and the server guys were able to correlate an SNMP event in the system logs of the monitored node. Restarting the SNMP service resolved the issue.

                                                        I also have an 8.5 installation where I see this somewhat more often.

                                                          • Re: unknown interface status
                                                            mikemaher

                                                            They will be restarting the service next week, since they (myself included) are worried that if we restart the service the node may flip. I will post the results when the service is restarted.

                                                              • Re: unknown interface status
                                                                Breclawm

                                                                Hi all,

                                                                I have been experiencing this very problem for a while with the exact same symptoms, i.e. mostly virtual machines, limited to one specific site, some HP physical servers, etc.  What I would add, to maybe help us out, is that (1) restarting the SNMP service did help me, but it only lasted a day or two and then the problem was back on the same server. (2) Our SNMP settings are set with Group Policy. As such, it sets the registry key HKLM\software\Policies\SNMP\Parameters. I looked in the other key where the SNMP settings are held (HKLM\SYSTEM\CurrentControlSet\Services\SNMP\Parameters) and the settings there were different than what was in the policies key. So I changed the latter to match the policies key, and that seemed to work, but only for a day or two.

                                                                I am as frustrated as everyone else who is having this problem but determined to find the solution. My next steps are to check any common network devices; no easy task since the problem is happening in our datacenter :(. I'll be checking back here often also.  mikemaher, I hope your service restart works. Just check them again in a day or so to be sure.
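The registry comparison described above could be scripted roughly as follows. This is a sketch under assumptions, not Breclawm's actual procedure: it uses Python's standard `winreg` module (Windows only), the two key paths quoted in the post, and a hypothetical `subkey` argument (for example `PermittedManagers`), since the post does not say which settings differed. The function names are made up for illustration.

```python
POLICY_KEY = r"SOFTWARE\Policies\SNMP\Parameters"
SERVICE_KEY = r"SYSTEM\CurrentControlSet\Services\SNMP\Parameters"


def read_key_values(path):
    """Read all values directly under an HKLM key (Windows only)."""
    import winreg  # standard library, but only present on Windows

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        _, value_count, _ = winreg.QueryInfoKey(key)
        for i in range(value_count):
            name, data, _ = winreg.EnumValue(key, i)
            values[name] = data
    return values


def diff_settings(policy, service):
    """Return {name: (policy_value, service_value)} for every mismatch."""
    names = set(policy) | set(service)
    return {name: (policy.get(name), service.get(name))
            for name in sorted(names)
            if policy.get(name) != service.get(name)}


def compare_snmp_keys(subkey):
    """Diff one subkey (an assumption, e.g. 'PermittedManagers') in both places."""
    policy = read_key_values(POLICY_KEY + "\\" + subkey)
    service = read_key_values(SERVICE_KEY + "\\" + subkey)
    return diff_settings(policy, service)
```

An empty result from `compare_snmp_keys(...)` means the two locations agree for that subkey. The two locations may not store every setting in the same layout, so a reported difference is a prompt to look more closely rather than proof of a misconfiguration.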

                                                                  • Re: unknown interface status
                                                                    gbit18

                                                                    Hey guys,

                                                                     

                                                                    Just wondering if anyone ever found the solution to this? I've been having the same issue, and it's becoming a real pain. Both servers and network nodes go into an unknown state. With the servers, I was able to create a script that restarts the SNMP services, and that seems to get them back. But with the network nodes, the only option I have is to remove the node and re-add it to NPM. However, once I remove the node, it loses all its data and starts from scratch. I've opened multiple cases with SolarWinds, and they've forwarded the request to their developers, but they still haven't found a solution. It's been close to a year... I keep upgrading to the newest SPs, always praying that it fixes it (restarting my NPM services takes most devices into an unknown state as well). Unfortunately, at this point management is chewing me up every week.

                                                                     

                                                                    PLZ say there is a fix for this.

                                                                     

                                                                    Some might be wondering why I've waited so long. I just had confidence that SolarWinds would figure this out for me, so now I figured I need to jump onto the forums to get this fixed, or to find alternatives.

                                                                     

                                                                    Thanks
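gbit18's restart script is not shown in the thread, so the following is only a guess at what a minimal version might look like when run locally on an affected Windows server. It assumes the agent's service name is `SNMP` and uses the built-in `net stop` and `net start` commands; the function names and the injectable `run` parameter are made up for illustration.

```python
import subprocess

SERVICE_NAME = "SNMP"  # assumed name of the Windows SNMP agent service


def restart_commands(service=SERVICE_NAME):
    """Return the stop and start commands for the given service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service=SERVICE_NAME, run=subprocess.run):
    """Stop, then start, the service; raise if the start fails."""
    stop_cmd, start_cmd = restart_commands(service)
    # A non-zero exit from the stop may just mean "already stopped".
    run(stop_cmd, check=False)
    # A failed start is a real problem, so let it raise.
    run(start_cmd, check=True)
```

As several posters report, a restart like this tends to help only for a day or two, and the earlier concern about restarting the service on cluster nodes still applies.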