What Is Your Polling Configuration?

I'm trying to get an idea of what the optimal polling interval and statistics collection interval would be based on the number of nodes and interfaces. I was hoping we could get a thread going with everyone's polling configuration along with the server Orion is running on. One of the main problems with our previous NMS was that it was overtaxed, resulting in a lot of false alarms. Here is what I've got.

 Network Nodes - 722, Interfaces - 1199, Volumes - 483

I've got the majority of my devices configured to poll every 120 seconds and statistics at 10 minutes. Approximately 50 critical devices are being polled every 60 seconds and statistics every 5 minutes.

Quad-Core Xeon 64-bit running Server 2k3 32-bit w/ 4GB RAM.
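
For comparing setups in this thread, a rough way to turn node counts and intervals into a poll rate is sketched below. It only does the arithmetic for the node numbers above (722 nodes, taking "approximately 50" critical devices as exactly 50). It ignores interfaces and volumes and says nothing about how Orion actually schedules its polls. The function name `polls_per_minute` is made up for illustration.

```python
def polls_per_minute(groups):
    """Sum polls issued per minute over (element_count, interval_seconds) pairs."""
    return sum(count * 60.0 / interval for count, interval in groups)

# Assumption: exactly 50 of the 722 nodes are on the faster schedule.
critical = 50
standard = 722 - critical

# Status polling: 120 s for most nodes, 60 s for critical ones.
status = polls_per_minute([(standard, 120), (critical, 60)])

# Statistics collection: 10 min for most nodes, 5 min for critical ones.
stats = polls_per_minute([(standard, 600), (critical, 300)])

print(f"status polls/min: {status:.1f}")
print(f"statistics polls/min: {stats:.1f}")
```

Plugging in your own counts and intervals gives a single number per environment, which may be easier to compare across servers than the raw settings.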

  • I have two environments. The first is one SLX server with 2 Xeon 3.2 GHz CPUs, 4 GB RAM, and six 15K 72.8 GB HDDs in RAID 5, polling status every 5 min. SQL is running locally. The average CPU load is in the 30s; memory is high due to SQL but stays at about 85% used.
    5968 Nodes
    8 Interfaces
    16 Volumes
    The second is four SLX servers that poll stats every 5 to 10 minutes, depending on the node, and status every 5 min.
    6533 Nodes
    14399 Interfaces
    4496 Volumes
    The three additional pollers are 2-way Xeons at 2.93 GHz with 2 GB RAM. These boxes average between 20% and 50% CPU utilization. The main poller is a 4-way Xeon at 2.93 GHz with 4 GB RAM (it also runs NetFlow for over 2000 interfaces). SQL is running on the corporate cluster and the DB is on the SAN. This box averages 55% CPU utilization and about 2 GB of memory used.
  • 2070 Nodes - 3252 Interfaces - 41 Volumes

    Quad-core Xeon 64-bit server (2.66 GHz), Win2K3 server (32-bit) w/ 4 GB RAM. Separate Enterprise SQL Server. Separate Web Server.

     Polling - 120 sec., statistics - 10 min.    Nothing bumped up in polling.

     Average CPU load, about 30%.   67% physical memory, 36% VM...


    Rarely have any problems...

  • Wow...and I thought we had a lot of nodes. Thanks for the info, guys. Anyone else?

  • I've got a main SLX server running the NetFlow module. That box is a quad 2.8 GHz Xeon with 8 GB of RAM. It runs between 15% and 25% on the CPU and uses about 1.5 GB of RAM. I also have a secondary polling engine, which is a quad 3.0 GHz Xeon with 4 GB of RAM. This server runs between 5% and 15% CPU and uses almost 2 GB of RAM. I also have an additional web server. Our polling and statistics collection settings are pretty standard: 120 seconds for polling and 10 minutes for statistics.

    We have:

    9363 Elements

    2941 Nodes

    6418 Interfaces

    4 Volumes

    366 NetFlow interfaces

    and a partridge in a pear tree (sorry, I couldn't resist, 'tis the season!)

  • Here's what we have currently, but we are awaiting final approval for a project to upgrade all components and expand the instance to accommodate approximately an additional 8,000 elements.

    NPM Host Server = Dual 3.2 GHz Xeon, 4 GB RAM, Win 2003 Server
    3 Additional Polling Engines = Single 2.4 GHz Opteron, 1 GB RAM, Win 2003 Server (per VM guest)
    SQL Server = Dual Quad-Core 2.33 GHz Xeon, 16 GB RAM, Win 2003 Server (64-bit), SQL Server 2005 (64-bit)

    Network Elements = 13,856
    Nodes = 2,410
    Interfaces = 11,144
    Volumes = 302

    Polling Intervals:
    Nodes = 2 Min
    Interfaces = 2 Min
    Volumes = 5 Min
    Rediscovery = 60 Min

    Statistics Collection Intervals:
    Nodes = 5 Min
    Interfaces = 5 Min
    Volumes = 60 Min

    Data Retention:
    Detail = 30 Days
    Hourly = 180 Days
    Daily = 1,095 Days

  • Update...running SL2000. SQL 2005 DB is running on a dedicated Quad-Core Xeon 2 GHz w/ 4 GB RAM. RAID 1 (OS) & RAID 10 (DB).

    841 Nodes, 1238 Interfaces, 505 Volumes

    Node Status: 60 Seconds (95% of my nodes)

    Remaining Nodes: 300 Seconds (5%)

    Interfaces Status: 45 seconds

    Volumes Status: 300 Seconds

  • See my Sig file below for Orion specs.