As a general rule of thumb, an "unlimited" Orion NPM poller license handles up to 10,000 monitored elements--IF you have provisioned the server on which it lives with the appropriate resources (memory, CPU, etc.).
I've seen a poller auto-adjust its polling rate to accommodate 13,000+ elements quite nicely. I lost some granularity, but I didn't have to buy a second poller.
If you have fewer than 10,000 elements, what's your motivation for a second poller?
One rationale for a second poller might be to show local response time at a remote regional site. That's particularly useful if you have resources local to that region's users, which they'll hit instead of coming back to the home region's resources. However, if all users in a second region must use the first region's data center resources, then having a poller local to the second region may give you a skewed idea of those users' devices' response time. If they have a local poller monitoring them, they may get sub-millisecond latency--according to the poller. But if the users must pass over a distant WAN circuit, their actual latency to resources could be 10 ms or more, and some apps don't work well when latency starts climbing to 20 ms or more.
In my case, I had three pairs of very geographically isolated data centers, and it initially made good sense to have local support and local monitoring at each. As my company grew, management decided to make two data centers primary, and began moving Citrix, database, UCS, and application resources to those two, which were both in Region One. At the same time, management elected to eliminate local network support and server staff at Region Two's dual data centers. Lastly, Region Three's data centers were reduced in criticality, and eventually Region Three became more of an off-site backup & storage facility, with limited application & Citrix functionality.
Since the total element count exceeded 10,000 at each of the three regions, it made sense to have a local poller or two in each region. We used EOC (Enterprise Operations Console) to try to get an overall feel for the state of the network in all regions, but it wasn't a good fit. EOC was too slow to update for our purposes, and we remained focused on our individual local regions, which wasn't the best way to support the organization's needs.
To get my IT support teams working in "team mode," we chose to make each remote region's NPM solution a remote poller that reported to Region One's NPM instance. This worked out well, and now we see all devices that are up or down without a regional bias. It's made us outage-focused instead of region-focused, and that's a big improvement.
If you have fewer than 10,000 elements, I'd stay with one poller, unless you have a great reason for a second one that has nothing to do with element count. If you monitor more than that, I'd budget for a second one.
Add a second poller later, when your growth in monitored elements demands it. Or, if you happen to have budget and can work your SolarWinds reseller for some great pricing incentives, you may find a financial benefit in adding a second poller now. But I'd save that for when you're getting close to exceeding the 10,000-element point.
You can add nodes manually or via Network Sonar discovery later. It's pretty easy to do. If you want to migrate some nodes off the original poller and into the second one, I believe it's as easy as selecting them and choosing the option to poll with a different poller.
Good luck deciding what's right for you!
Thank you for your in-depth explanation.
Let me give you some background on my environment:
I am a sysadmin at a college and we don't have any remote sites.
We are currently transitioning to a new forest that we have built from scratch and we believe eventually we will have more than 10,000 elements.
As I mentioned, we run everything on VMware and hardware resources shouldn't be an issue.
Based on your response, I'll hold off on the additional poller. However, when I do add the 2nd poller, should I do so on the same server as the primary installation, or do you recommend adding it to a separate server?
Also, once I have 2 pollers, will I have to manually assign the pollers to the nodes, or can SolarWinds load-balance the nodes automatically?
The only concern I'd have with putting your second poller on the same platform that hosts your primary NPM instance is resilience. If it's OK to put all your eggs in one basket, I see no issue with it. Given the flexibility of vMotion and restores & snapshots, if you have multiple hosts that your application can vMotion to, and all have sufficient memory and CPU cores to allocate to Orion services, you should be good to go.
Regarding how easy it is to move nodes from one poller to another: in past versions, I've seen it done with a simple script. At worst, it would be no more difficult than opening a node and selecting a different poller for it.
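For anyone who'd rather script the move than click through the web console, here's a minimal sketch using the open-source `orionsdk` Python package against the SWIS API. The server name, credentials, engine IDs, and the "move a fraction off the primary" helper are all placeholders of my own for illustration, not an official SolarWinds procedure -- test against a lab instance first.

```python
def pick_rebalance_set(node_uris, fraction):
    """Pick the first `fraction` of nodes to move off the primary poller.

    (Hypothetical helper -- how you choose which nodes to move is up to you.)
    """
    count = int(len(node_uris) * fraction)
    return node_uris[:count]


def move_nodes(swis, node_uris, target_engine_id):
    """Reassign each node's EngineID; the target poller picks it up."""
    for uri in node_uris:
        swis.update(uri, EngineID=target_engine_id)


def rebalance(server, user, password, source_engine, target_engine,
              fraction=0.2):
    """Move a fraction of nodes from one polling engine to another.

    Connection details below are placeholders for your environment.
    """
    from orionsdk import SwisClient  # pip install orionsdk
    swis = SwisClient(server, user, password)
    rows = swis.query(
        "SELECT Uri FROM Orion.Nodes WHERE EngineID = @eid",
        eid=source_engine)
    uris = [row["Uri"] for row in rows["results"]]
    move_nodes(swis, pick_rebalance_set(uris, fraction), target_engine)
```

Something like `rebalance("orion.example.edu", "admin", "secret", 1, 2)` would then move 20% of the primary engine's nodes to engine 2. Don't forget the point from Erica's notes below about ACLs and SNMP clients needing to allow the new poller before you move anything.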
There's a good training session that talks about NPM optimization and tuning that you might find helpful: Customer Training: Orion Network Performance Monitor (NPM) - Level 3 - A Webcast from SolarWinds
Thwack member erica.gill shared some information back in July 2012 that might also be useful. Erica wrote:
" . . . from a network load point of view, we recommend having each polling engine monitor the nodes closest to it in physical/logical terms to cut down on network traffic.
In practice customers consider a few other criteria. Time is actually the big one. While moving nodes between polling engines is easy, there (may also be) an associated need to update ACLs/firewalls, SNMP clients and flow configurations to ensure devices are accessible from the additional polling engine. (When you are moving) your switches/routers (to a different poller, it) could be as easy as running scripts in bulk against them.
Keeping all your SAM polling on the Primary poller while trying to keep everything scalable isn't going to work in the long run, but it can help in the short/medium term to cut down on the associated work with the initial balancing.
(Some people move nodes to different pollers so their element ratio might be) 40/60, 30/70 or even 80/20 where an initial 20% are moved off the Primary to reduce the load and all new nodes are added to the secondary. This can simplify the process for adding nodes for large teams as well as cut down on the reconfiguration of ACLs and SNMP clients."
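To make the "running scripts in bulk" point above concrete, here's a rough sketch that builds the IOS config lines a new poller typically needs (an SNMP ACL permit plus a read-only community tied to that ACL) and pushes them with the netmiko library. The community string, ACL number, poller IP, hostnames, and credentials are all hypothetical -- verify the commands against your own device configs and security policy before pushing anything.

```python
def poller_access_config(poller_ip, acl_number, community):
    """Build IOS config lines that let an additional poller query SNMP:
    permit the poller's IP in the ACL, then tie the RO community to it.
    """
    return [
        f"access-list {acl_number} permit host {poller_ip}",
        f"snmp-server community {community} RO {acl_number}",
    ]


def push_to_devices(hosts, username, password, config_lines):
    """Push the same config lines to a list of IOS devices in bulk."""
    from netmiko import ConnectHandler  # pip install netmiko
    for host in hosts:
        conn = ConnectHandler(device_type="cisco_ios", host=host,
                              username=username, password=password)
        conn.send_config_set(config_lines)  # enters/exits config mode
        conn.save_config()                  # write mem
        conn.disconnect()
```

A run against two routers might look like `push_to_devices(["rtr1.example.edu", "rtr2.example.edu"], "admin", "secret", poller_access_config("10.2.0.5", 10, "public"))`. Once the devices accept the new poller's IP, moving their nodes in Orion is the easy part.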