We need a way to ensure that all devices associated with a specific UnDP are not polled at the same time.
Our network consists of 800 or so devices that we need to poll for certain parameters on an hourly basis. The problem is that up to 96 devices can share a last-mile radio link limited to a maximum of 32 kbps. When we set up UnDP to poll all of the OIDs we need to monitor (approx. 25 values per node), the polls all go out to all the devices at the same time and flood the link. I've tested this with 30 or so devices sharing a 32k link and found that the whole bearer falls over for several minutes while this is occurring, affecting real-time, system-critical traffic.
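A rough back-of-envelope estimate shows why a synchronised poll hurts so much. The device counts, OID count and link speed are the ones quoted above; the 200 bytes per OID (request plus response, including headers) is purely an assumed figure for illustration and has not been measured on our network.

```python
# Back-of-envelope estimate of how long one synchronised poll cycle
# would occupy a shared radio link at full capacity.

LINK_BPS = 32_000      # shared last-mile link, bits per second
OIDS_PER_NODE = 25     # approximate values polled per node
BYTES_PER_OID = 200    # ASSUMPTION: request + response incl. headers


def burst_seconds(devices: int) -> float:
    """Seconds of full link capacity needed to poll every device once."""
    total_bits = devices * OIDS_PER_NODE * BYTES_PER_OID * 8
    return total_bits / LINK_BPS


for devices in (30, 96):
    print(f"{devices} devices: ~{burst_seconds(devices):.1f} s of saturated link")
```

Under that assumption the poll burst alone would need the entire link for tens of seconds, before counting any timeouts or retries.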
My first thought was to stagger each poller's interval by about a minute.
For example:
Poller 1 interval - 60 Minutes
Poller 2 interval - 61 Minutes
Poller 3 interval - 62 Minutes
etc.
This means we are still polling ALL devices at the same time, but each poll only requests one OID, greatly reducing the data requested at once. This was an improvement, but we still had packet loss on the link. The other problem with this method is that the intervals eventually converge and at some point all sync up again before staggering back out.
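To put a number on the convergence: if the pollers all start at the same moment and fire exactly on their intervals (an idealised assumption, NPM's real scheduling may differ), they line up again at the least common multiple of their intervals. A minimal sketch:

```python
from functools import reduce
from itertools import combinations
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def realign_minutes(intervals):
    """Minutes until pollers that started together next fire in the same minute."""
    return reduce(lcm, intervals)


intervals = [60, 61, 62]  # poller intervals in minutes, as in the example above

for pair in combinations(intervals, 2):
    print(pair, "coincide every", realign_minutes(pair) / 60, "hours")

print("all three coincide every", realign_minutes(intervals) / 1440, "days")
```

With these three intervals, pairs of pollers coincide again every 31 to 63 hours or so, and all three coincide roughly every 79 days.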
When I raised this with SolarWinds Support, their suggestion was to remove the interval from the UnDPs, which means that each node polls its associated UnDPs based on its "Next Poll Time" in NPM. Assuming I add these devices one by one, this could resolve our issue. We would just have to be careful not to add these devices through a "Network Discovery", which would cause all the "Next Poll Times" to be the same. This was a promising option until I decided to do a bit of testing. I shut down the NPM server overnight and let the "Next Poll Times" expire. Once I rebooted, I found that all devices were polled as soon as the server came back up, which caused the poll times to synchronise again.
At this point I have no other option but to stagger the polling intervals, perhaps adding extra pollers for each OID and distributing them evenly across the nodes.
My two suggestions:
A: When editing a UnDP, I'd like an option to poll every X minutes, plus or minus Y minutes at random from previous poll.
I.e. if a node is polled at 6am, the next poll will be after an interval of 60 minutes +- 5 minutes (random). This would ideally randomise per device per poller.
B: Have a persistent "Next Poll Time" for a node based on when it was added to NPM to avoid resetting to when the server last booted.
Bonus Request: When nodes are added via Discovery, randomise the "Next Poll Time" in some way.
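To illustrate what I mean by these requests, here is a minimal sketch of the scheduling logic. It is not NPM code and does not use any SolarWinds API; the function names and the node and poller identifiers are hypothetical. The first function is suggestion A (interval plus or minus random jitter). The other two cover suggestion B and the bonus request by deriving a fixed offset within the interval from a hash of the node and poller, so the schedule comes out the same after a reboot or a bulk discovery.

```python
import hashlib
import random
from datetime import datetime, timedelta


def next_poll_jittered(previous, interval_min=60, jitter_min=5, rng=random):
    """Suggestion A: previous poll + X minutes, plus or minus up to Y minutes."""
    offset = rng.uniform(-jitter_min, jitter_min)
    return previous + timedelta(minutes=interval_min + offset)


def stable_offset_seconds(node_id, poller_name, interval_min=60):
    """Fixed per-node, per-poller offset (in seconds) within the interval.

    Derived from a hash, so it does not depend on when the server booted
    or on how the node was added.
    """
    digest = hashlib.sha256(f"{node_id}:{poller_name}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % (interval_min * 60)


def next_poll_stable(now, node_id, poller_name, interval_min=60):
    """Suggestion B / bonus: first slot strictly after `now` on the fixed schedule."""
    period = interval_min * 60
    offset = stable_offset_seconds(node_id, poller_name, interval_min)
    epoch = datetime(1970, 1, 1)
    elapsed = int((now - epoch).total_seconds())
    cycles = (elapsed - offset) // period + 1
    return epoch + timedelta(seconds=cycles * period + offset)


# Hypothetical node and poller names, for illustration only.
now = datetime(2024, 1, 1, 6, 0)
print(next_poll_jittered(now))
print(next_poll_stable(now, "node-17", "rssi"))
```

The jittered version drifts apart over time but can still start synchronised after a reboot; the hash-based version spreads nodes across the hour from the first poll onwards, which is closer to what we need on the shared 32k links.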