12 Replies Latest reply on Dec 9, 2010 12:01 PM by warbird

    "Interface Bandwidth" sometimes reports incorrect size??

    warbird

      Sometimes, the "Interface Bandwidth" in the Interface Details page will show up with a size of 10.0 Mbps, when it should be reporting 100 or 1000 Mbps.  Anyone know why this happens??  My pollers are well within load limits.

      I am running NPM 9.5.1 on all pollers.  Ticket 146076 just opened regarding this.  This problem occurs intermittently, not always on the same node, and not always on the same interface.  I am uncertain if it always occurs on nodes being monitored by the same polling engine.

      I just purchased and brought a 4th polling engine online (upgrading the primary and 2 secondaries from 9.5.0 to 9.5.1 at the same time).  I have ~18,000 total elements monitored with more being added, mainly spread across the 3 secondary pollers, with the primary poller having next to no load, as I plan to turn up NTA again soon.  I noticed this "Interface Bandwidth" issue both before and after bringing the 4th poller online to reduce load.

      I would appreciate suggestions, answers, help.

        • Re: "Interface Bandwidth" sometimes reports incorrect size??
          warbird

          I should add that when this problem occurs, it skews my graphs and messes up my "Top 10" type of reports, wrongly showing an interface as flooded.  This is a huge problem for us.

          • Re: "Interface Bandwidth" sometimes reports incorrect size??

            This happens to us all the time as well.  It's quite annoying, and we would like it fixed.

              • Re: "Interface Bandwidth" sometimes reports incorrect size??
                warbird

                Quick update:  I am working with support to track down the issue.  I have cleaned up a few issues with the db (possibly unrelated), and they are evaluating a new set of diagnostics.

                sraiwt, how many polling engines are you running?  What is your environment like?  It seems like the problem is a bit more noticeable since I brought on the 4th polling engine.  Not positive, though.

                I will keep you posted here on our progress.

                  • Re: "Interface Bandwidth" sometimes reports incorrect size??

                    We have a rather small environment compared to a lot of the users I see on here.  We're currently using a single box for everything (DB, Poller, Web), and we're running APM and the wireless plugins.  We have approx 11k elements in our environment.

                      • Re: "Interface Bandwidth" sometimes reports incorrect size??
                        warbird

                        In my experience, that is a lot to have on a single box.  Glad it is working for you.  You must have polling intervals turned down?  Our polling engines start choking around 7k elements.  I also have our db on a standalone server, with the actual db backended on an extremely beefy SAN (we had run into disk queue length problems).

                        The issue reported in this thread is still occurring this morning.  So the db cleanup I did yesterday did not work.

                          • Re: "Interface Bandwidth" sometimes reports incorrect size??
                            warbird

                            Actually, I am not positive the issue still remains.  Sorry.  I am still researching it on my own, while support does their stuff.  Regardless, will let you know what we find out.

                              • Re: "Interface Bandwidth" sometimes reports incorrect size??

                                Yes, our polling interval is at 5 min for most devices.  For critical devices and devices of interest, we shorten it to 1 min as needed.  The box this is running on is also quite beefy.  It's a Windows 2k3 server with 32 GB of RAM and dual quad-core 2.8 GHz processors.

                                  • Re: "Interface Bandwidth" sometimes reports incorrect size??
                                    warbird

                                    Nice specs.  You may be the person I spoke with about this earlier, but are you running 64-bit Windows?  Seems you would have to be, to take advantage of all that RAM.

                                    Also, I verified the problem still existed.  We made some changes to some troubling tables in the db.  They will not be fully implemented until db maintenance runs tonight.  Will let you guys know what pans out.

                                      • Re: "Interface Bandwidth" sometimes reports incorrect size??
                                        warbird

                                        It appears this issue was being caused by a couple of tables in the main NetPerfMon db that had issues.  I believe this stemmed from the old wireless module, from an older version of NPM.  The tables in question were the largest tables in my db and had very old data contained within.  We truncated those tables, allowed db maintenance to run on schedule, and the issue appears to be resolved.

                                        If you are having this issue, I recommend you contact support for the quickest solution.

                                          • Re: "Interface Bandwidth" sometimes reports incorrect size??
                                            bmarms

                                            I had the same issue and logged a support case.  The resolution was to set the custom bandwidth for all affected interfaces using an SQL query in Database Manager.

                                            Here is the query I used for 100 Mbps interfaces:

                                            UPDATE Interfaces
                                            SET InBandwidth = 100000000, OutBandwidth = 100000000, CustomBandwidth = 1
                                            WHERE NodeID < 200  -- if you have more than 200 nodes you will have to change this
                                              AND InterfaceName LIKE '%Fast%'
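Before running an UPDATE like the one above against a production database, it may be worth previewing which rows the WHERE clause actually matches. The following is only a rehearsal sketch in Python using the built-in sqlite3 module: the toy Interfaces table, its sample rows, and the helper names `make_toy_db` and `preview_and_update` are all made up for illustration, and SQLite is just a stand-in for the real NPM SQL Server database (whose LIKE matching depends on collation).

```python
import sqlite3


def make_toy_db():
    """Build an in-memory stand-in table; rows are invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Interfaces (NodeID INTEGER, InterfaceName TEXT, "
        "InBandwidth INTEGER, OutBandwidth INTEGER, CustomBandwidth INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Interfaces VALUES (?, ?, ?, ?, ?)",
        [
            (1, "FastEthernet0/1", 10_000_000, 10_000_000, 0),
            (1, "GigabitEthernet0/1", 1_000_000_000, 1_000_000_000, 0),
            (250, "FastEthernet0/2", 10_000_000, 10_000_000, 0),
        ],
    )
    return conn


def preview_and_update(conn, max_node_id, name_pattern, bps):
    """Return the rows the filter matches, then apply the same filter in an UPDATE."""
    where = "NodeID < ? AND InterfaceName LIKE ?"
    params = (max_node_id, name_pattern)
    # Step 1: look at what would be touched.
    matched = conn.execute(
        "SELECT NodeID, InterfaceName, InBandwidth FROM Interfaces "
        "WHERE " + where + " ORDER BY NodeID, InterfaceName",
        params,
    ).fetchall()
    # Step 2: apply the update with the identical WHERE clause.
    cur = conn.execute(
        "UPDATE Interfaces SET InBandwidth = ?, OutBandwidth = ?, "
        "CustomBandwidth = 1 WHERE " + where,
        (bps, bps) + params,
    )
    return matched, cur.rowcount


if __name__ == "__main__":
    conn = make_toy_db()
    matched, updated = preview_and_update(conn, 200, "%Fast%", 100_000_000)
    print(matched, updated)
```

In the toy data, the Gigabit interface and the FastEthernet interface on node 250 are left alone, which shows how the `NodeID < 200` cutoff silently skips higher-numbered nodes.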

                                              • Re: "Interface Bandwidth" sometimes reports incorrect size??
                                                whitejcdc

                                                So now that version 10 is out, does anyone know whether the interface bandwidth discovery issue has been fixed?  I am on 9.5.1 and still dealing with this.  In my case, though, it is occurring on Cisco 3750Gs, not 3500 series switches.  Plus, it only seems to happen when you initially add the switch to NPM with the ports open but not used (i.e. UP/DOWN state) and then later plug something in.  For instance, one of our 3750G switches just had a new 7201 router (Gig port) uplinked to it on an empty but previously unshut port.  Once it was connected, upon the next re-discovery of that switch, the port was set to 10 Mbps, not 1000 Mbps.  So now it is showing incorrect bandwidth and 100+% utilization.  As far as I can tell, many releases have been put out without addressing this.

                                                Plus, since I'm on the subject of the discovery tool... some work needs to be done there too.  In the past I've noticed that adding a new node, or going into System Mgr in the middle of the day, adding a node, and then issuing a "discovery" of it, resets the timer for the global discovery process.  So if you have it set to re-discover every 24 hrs, the process now kicks off 24 hrs after you added/discovered the node, say around 2 pm the next day, which just so happens to coincide with the highest traffic times in your enterprise.

                                                One suggestion would be to add an actual time-of-day setting alongside the repetition interval for the global discovery setup, and to make individual node discovery not affect that schedule.  Just sayin'...
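To make the difference between the two scheduling approaches concrete, here is a small Python sketch. It is purely illustrative and not how NPM is implemented: the function names and the 02:00 run time are assumptions chosen for the example.

```python
from datetime import datetime, time, timedelta


def next_run_interval_only(last_discovery, interval_hours=24):
    """Interval-only schedule: the next run drifts to whenever the last discovery happened."""
    return last_discovery + timedelta(hours=interval_hours)


def next_run_time_of_day(now, run_at=time(2, 0)):
    """Time-of-day schedule: the next run is the next occurrence of run_at, regardless of manual discoveries."""
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    manual_discovery = datetime(2010, 12, 8, 14, 0)  # node added at 2 pm
    print(next_run_interval_only(manual_discovery))  # lands at 2 pm the next day
    print(next_run_time_of_day(manual_discovery))    # stays at the off-peak time
```

With a fixed time of day, a manual discovery in the afternoon no longer pulls the global discovery into peak hours.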

                                                  • Re: "Interface Bandwidth" sometimes reports incorrect size??
                                                    warbird

                                                    In my case, 3 things led to this problem: 2 that I was able to address and 1 that is not addressable.  The first 2 were load issues and db issues.  My polling engines were being taxed beyond their limits, and I had very large tables with very old data in my db that were not rolling up properly (we are currently monitoring ~1300 devices and ~23K elements).  This was evidenced by errors in the db maintenance logs.  SW support helped me address the db issues, and I brought another poller online to address the load issues.

                                                    I am now running v10 SP1 and still occasionally notice a few interfaces reporting incorrect "Interface Bandwidth".  As best I can tell, NPM is accurately reporting what the device told it.  Here is why I think this...

                                                    Certain Cisco switches with ports set to auto/auto will report to NPM as 10M when they are enabled but not active.  If that port goes active, negotiates at 1G, and the client on that port starts pushing a bunch of data, Orion will display what you are describing until the next full statistics poll of that device.
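A quick worked example of why a stale bandwidth value makes an interface look flooded, assuming utilization is simply traffic divided by the recorded bandwidth. The traffic figure below is invented for illustration, and `percent_utilization` is a hypothetical helper, not an NPM function.

```python
def percent_utilization(traffic_bps, bandwidth_bps):
    """Utilization as a percentage of the bandwidth value on record."""
    return 100.0 * traffic_bps / bandwidth_bps


traffic = 300_000_000             # 300 Mbps actually crossing the port (made-up figure)
stale_bandwidth = 10_000_000      # 10 Mbps recorded while the port was enabled but idle
actual_bandwidth = 1_000_000_000  # 1 Gbps negotiated once the link came up

print(percent_utilization(traffic, stale_bandwidth))   # 3000.0
print(percent_utilization(traffic, actual_bandwidth))  # 30.0
```

The same traffic reads as 3000% against the stale 10 Mbps value and 30% against the negotiated 1 Gbps, which is the kind of skew that lands an interface in a "Top 10" report.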

                                                    Is there a way for SW to fix this by moving the "Interface Bandwidth" poll so that it happens at the same time as the Interface Usage poll?  I do not know, but I imagine that would heavily increase the overall polling load for something that doesn't occur often if systems are running as they should.

                                                    As previously stated, I recommend you open a SW support ticket.  Have them take a look at system loads and db logs with you.  Perhaps that will help you.  I cannot speak about manually running the discovery tool, as I don't use it.

                                                    HTH.  As always, YMMV.