    NPN Poller Per Second - Gaps in data


      We are seeing gaps in data, especially on high-bandwidth interfaces (for example, 200M-500M on a 1G interface).  I am trying to figure out if our pollers are tuned right.  I went through this doc, but I am not sure what I need.

      Poller Tuning - Community Tips

      1600 Elements
      449 Node Elements
      1147 Interface Elements

      I am noticing that the SNMP outstanding never goes below 545.


      Polling Engine monitor stats:

      ICMP Statistic polling Index 3034 out of 3034
      SNMP Statistic polling Index (1100-2600) out of 3034

      SNMP Statistics  PPS: 148
      Max SNMP Statistics  PPS: 148


      The Polls per second tuning states the recommended:
      Maximum Node and Interface Status polls = 64
      Maximum Statistics Collections = 105

      We poll our interfaces at 1 minute statistics collections
      We poll our Node Stats (CPU/MEM) at 5 minute statistics collections

      Any ideas? 

        • Re: NPN Poller Per Second - Gaps in data
          Andy McBride

          You are probably over-sampling the statistics data. By increasing the polling frequency ~10X you diminish the scalability of the Orion server ~10X. The adjusted element count (for polling frequency) is ~15K, and pollers don't do well with more than ~9K.

          I suggest carefully examining the need to poll so frequently. If there are critical interfaces or nodes where you believe you need to poll at 1 min, I would set those up for 1 min (assuming there are just a handful of these) and return the polling for everything else to the defaults.
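
As a rough illustration of the adjusted element count mentioned above, here is a minimal sketch that weights each element by how much faster than the default it is polled. The default intervals used here (9 minutes for interface statistics, 10 minutes for node statistics) are assumptions for the example, and the function name is hypothetical. The result is an order-of-magnitude check only and will not necessarily match the ~15K estimate, which may have been computed differently.

```python
# Rough estimate of a poller's "adjusted element count": each element is
# weighted by the ratio of the default polling interval to the actual one.
# ASSUMPTION: the default intervals passed below (9 min for interface
# statistics, 10 min for node statistics) are illustrative; check your own
# polling settings before relying on the numbers.

def adjusted_element_count(elements, default_interval_min, actual_interval_min):
    """Scale an element count by default interval / actual interval."""
    if actual_interval_min <= 0:
        raise ValueError("polling interval must be positive")
    return elements * default_interval_min / actual_interval_min

# Element counts and actual intervals taken from the original post.
interfaces = adjusted_element_count(1147, default_interval_min=9, actual_interval_min=1)
nodes = adjusted_element_count(449, default_interval_min=10, actual_interval_min=5)
total = interfaces + nodes

print(f"interfaces: {interfaces:.0f}")
print(f"nodes:      {nodes:.0f}")
print(f"total:      {total:.0f}")
```

Under these assumptions the interfaces dominate the total, which is why moving most interfaces back to the default interval reduces the load so sharply.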

            • Re: NPN Poller Per Second - Gaps in data

              Our requirement is at least 1-minute interface statistics, and we would really like finer granularity than 1 minute if available. A 5-minute average does not come close to showing the bandwidth peaks in a bursty traffic environment.
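
To make the point about averaging concrete, here is a small sketch with made-up numbers. The per-minute samples are hypothetical, and the helper simply averages non-overlapping windows; it is not how Orion stores or rolls up data, only an illustration of how a longer interval flattens a short burst.

```python
# Hypothetical per-minute utilisation samples (Mbps) for one bursty interface.
one_minute_mbps = [120, 110, 480, 130, 115, 125, 118, 122, 119, 121]

def window_averages(samples, window):
    """Average consecutive non-overlapping windows of `window` samples."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]

# Collapse the ten 1-minute samples into two 5-minute averages.
five_minute_mbps = window_averages(one_minute_mbps, 5)

print("peak at 1 min:", max(one_minute_mbps))
print("peak at 5 min:", max(five_minute_mbps))
```

With these sample values the 480 Mbps burst is visible at 1-minute resolution, while the 5-minute view peaks at 191 Mbps.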


              If we need 1-minute statistics, would an additional poller help?   When these gaps happen on the busy interfaces, other interfaces do not show any gaps at all.  Is it the case that Orion gets too busy and cannot calculate the large bandwidth numbers when it's busy?

                • Re: NPN Poller Per Second - Gaps in data
                  Andy McBride

                  I recommend you run a test on two similar interfaces: poll one at 1 min and the other at 5 min. I have done this to demonstrate to clients that the data is almost identical. You can see shorter peaks with 1 min, but the question there is what is the value of seeing these peaks? Short peaks with a rapid recovery are common and don't impact performance.  It is a trade-off between the cost of rapid polling and the value of the data. For LAN interfaces 1 min polling has no value, so you could set the WAN interfaces to 1 min and keep LAN at 9 min and save a lot of $ in extra pollers and data storage. OK - I'm off my soap box now.... ;)

                  For the high-speed interfaces, make sure you are using 64-bit counters. This setting is in the admin Manage Nodes interface and is set per node. This will keep the counters from experiencing rapid roll-over.
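
As a quick sanity check on the roll-over concern, the sketch below estimates how long an octet counter takes to wrap at a steady rate. The function name is hypothetical, and the rates are the 200M and 500M figures from the original post, treated as constant throughput for simplicity.

```python
def seconds_to_wrap(counter_bits, rate_bps):
    """Seconds for an octet counter of the given width to wrap at a steady bit rate."""
    octets_per_second = rate_bps / 8  # interface counters count octets, not bits
    return 2 ** counter_bits / octets_per_second

for mbps in (200, 500):
    rate = mbps * 1_000_000
    print(f"{mbps} Mbps: 32-bit counter wraps in {seconds_to_wrap(32, rate):.1f} s")
```

At 500 Mbps a 32-bit counter wraps in roughly 69 seconds, which is uncomfortably close to a 1-minute polling interval; a 64-bit counter at the same rate would take thousands of years to wrap.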

                  Your adjusted element count for 1 minute polling is about 8 to 10 times greater than the actual element count, so an additional poller will offload the polling and eliminate the gaps. Here is a doc on gaps and one on Orion Performance that may be helpful.