I'm trying to get an idea of what the optimal polling interval and statistics collection interval would be based on the number of nodes and interfaces. I was hoping we could get a thread going with everyone's polling configuration along with the server Orion is running on. One of the main problems with our previous NMS was that it was overtaxed, resulting in a lot of false alarms. Here is what I've got.
Network Nodes - 722, Interfaces - 1199, Volumes - 483
I've got the majority of my devices configured to poll every 120 seconds and statistics at 10 minutes. Approximately 50 critical devices are being polled every 60 seconds and statistics every 5 minutes.
Quad-Core Xeon 64-bit running Server 2k3 32-bit w/ 4GB RAM.
Update: running SL2000. The SQL 2005 DB is running on a dedicated Quad-Core Xeon 2 GHz w/ 4GB RAM, with RAID 1 (OS) and RAID 10 (DB).
841 Nodes, 1238 Interfaces, 505 Volumes
Node Status: 60 Seconds (95% of my nodes)
Remaining Nodes: 300 Seconds (5%)
Interfaces Status: 45 seconds
Volumes Status: 300 Seconds
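For comparing setups in this thread, it can help to reduce a polling configuration to a single number: status polls per second. Below is a minimal sketch using the figures from the post above. The helper `polls_per_second` is a made-up name for illustration, not part of Orion, and the 95%/5% node split is rounded to whole nodes. It only counts one status poll per element per interval, so it ignores statistics collection and whatever extra work the poller does per element.

```python
# Rough status-poll rate for the configuration posted above.
# Assumption: each element costs exactly one poll per interval.

def polls_per_second(groups):
    """Sum count / interval over (element_count, interval_seconds) pairs."""
    return sum(count / interval for count, interval in groups)

nodes = 841
fast_nodes = round(nodes * 0.95)   # nodes polled every 60 seconds
slow_nodes = nodes - fast_nodes    # remaining nodes polled every 300 seconds

groups = [
    (fast_nodes, 60),    # node status, 95% of nodes
    (slow_nodes, 300),   # node status, remaining 5%
    (1238, 45),          # interface status
    (505, 300),          # volume status
]

rate = polls_per_second(groups)
print(f"{rate:.1f} status polls per second")
```

Plugging in your own counts and intervals gives a quick way to see how much load a shorter interval adds before you change it on the server.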
Here's what we have currently, but we are awaiting final approval for a project to upgrade all components and expand the instance to accommodate approximately 8,000 additional elements.
NPM Host Server = Dual 3.2 GHz Xeon, 4 GB RAM, Win 2003 Server
3 Additional Polling Engines = Single 2.4 GHz Opteron, 1 GB RAM, Win 2003 Server (per VM guest)
SQL Server = Dual Quad-Core 2.33 GHz Xeon, 16 GB RAM, Win 2003 Server (64-bit), SQL Server 2005 (64-bit)
Network Elements = 13,856
Nodes = 2,410
Interfaces = 11,144
Volumes = 302
Polling Intervals:
Nodes = 2 Min
Interfaces = 2 Min
Volumes = 5 Min
Rediscovery = 60 Min
Statistics Collection Intervals:
Nodes = 5 Min
Interfaces = 5 Min
Volumes = 60 Min
Database Retention:
Detail = 30 Days
Hourly = 180 Days
Daily = 1,095 Days
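As a rough sizing aid, the figures above can be turned into an estimate of how many detail-level statistics rows sit in the database at steady state. This is only a sketch under stated assumptions: it takes the second group of intervals above as statistics collection intervals and "Detail = 30 Days" as the detail retention window, and it assumes one stored row per element per statistics sample, which may not match how Orion actually lays out its tables. `detail_rows` is a hypothetical helper, and the hourly and daily tiers are left out.

```python
# Rough count of detail-level statistics rows held at steady state.
# Assumption: one row per element per statistics sample.

MINUTES_PER_DAY = 24 * 60

def detail_rows(elements, stats_interval_min, detail_days):
    """Rows kept = elements * samples per day * days of detail retention."""
    samples_per_day = MINUTES_PER_DAY // stats_interval_min
    return elements * samples_per_day * detail_days

estimate = {
    "nodes": detail_rows(2410, 5, 30),        # 5-minute statistics
    "interfaces": detail_rows(11144, 5, 30),  # 5-minute statistics
    "volumes": detail_rows(302, 60, 30),      # 60-minute statistics
}
total = sum(estimate.values())

for kind, rows in estimate.items():
    print(f"{kind}: {rows:,} rows")
print(f"total: {total:,} rows")
```

Interfaces dominate the total here, so the interface statistics interval and the detail retention period are the two settings with the most effect on database size in this setup.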
2070 Nodes - 3252 Interfaces - 41 Volumes
Quad-core Xeon 64-bit server (2.66GHz), Win2K3 server (32-bit) w/ 4GB RAM. Separate Enterprise SQL Server. Separate Web Server.
Polling - 120 sec., statistics - 10 min. Nothing bumped up in polling.
Average CPU load, about 30%. 67% physical memory, 36% VM...
Rarely have any problems...
I've got a main SLX server running the NetFlow module. That box is a quad 2.8 GHz Xeon with 8 GB of RAM. It runs between 15% and 25% on the CPU and uses about 1.5 GB of RAM. I also have a secondary polling engine, which is a quad 3.0 GHz Xeon with 4 GB of RAM. This server runs between 5% and 15% CPU and uses almost 2 GB of RAM. I also have an additional web server. Our polling and statistics collection settings are pretty standard: 120 seconds for polling and 10 minutes for statistics.
366 NetFlow interfaces
and a partridge in a pear tree (sorry, I couldn't resist, 'tis the season!)