
Any large customers willing to share how Orion has scaled in their environments and what machine requirements you need to run it properly?

FormerMember

We are looking at NPM/APM and I understand the pollers, but one of our concerns is that at 20,000 monitors we are already bumping into SQL performance problems, and with over 4,000 nodes we are wondering how folks handle change management and what tools they use to manage an environment of this size. Assume a scale of about 40,000 total monitors with polling intervals between 1 and 10 minutes (I know the devil is in the details here). Would anyone with a similar-sized environment be willing to have a discussion with me?
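
To put rough numbers on that scale, here is a minimal Python sketch of the average poll rate implied by 40,000 monitors at intervals between 1 and 10 minutes. It assumes polls are spread evenly over the interval and that each monitor is polled exactly once per interval. The function name `polls_per_second` is just for illustration, and the sketch says nothing about how Orion actually schedules polls or batches its SQL writes.

```python
def polls_per_second(monitors: int, interval_minutes: float) -> float:
    """Average polls per second, assuming each monitor is polled once
    per interval and polls are spread evenly (an assumption)."""
    return monitors / (interval_minutes * 60)


# 40,000 monitors at the two extremes mentioned above, plus a midpoint.
for interval in (1, 5, 10):
    rate = polls_per_second(40_000, interval)
    print(f"{interval:>2} min interval: {rate:7.1f} polls/sec")
```

The gap between the 1-minute and 10-minute cases is a factor of ten, so the mix of intervals probably matters as much as the raw monitor count when sizing the SQL box.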

  • If you want, you can bounce some questions off me. I am one of the admins for our NPM & APM setup in our data centers.

    Here is a link to our site http://www.cosentry.com/solutions/network-monitoring-management.aspx

    Our setup is not as large as yours. We are polling around 5,000 items every 2 minutes, and in a shared environment we have overcome a lot.

  • We don't monitor as many nodes as you, but we monitor 40,000 elements over 800 nodes.  I'd be happy to share our experiences with SW in our environment if you like.

  • FormerMember
    0 FormerMember in reply to mromeo

    There are a few areas that concern us:

     

    1 - SQL performance: it seems to beat the heck out of SQL. We have had to bring up a dedicated box with 14 drives in RAID 10 just to get 1/3 of this amount online.

    2 - The mgmt interface seems lacking when working with so many rules and elements - it becomes hard to find things and there doesn't seem to be a way to save filters so you can quickly find things each time you go to do a task.  Are we missing something here?

    3 - How many physical boxes, and with what specs, does your environment need just to support APM/NPM and the SQL database behind it?

     

    Thanks tons for your help!

  • I have seen a box they sell at Sigma Solutions that has SQL running partially on a RAM disk. It's a Sun box running Windows.

  • Would you be able to get on a call to discuss some design and deployment questions about SolarWinds?

  • We don't do that directly but you might inquire with one of our partners, Sigma Solutions Online.

  • I have dealt with Corona Technical Services for my training and consulting needs. I found them to be knowledgeable and very professional. You might want to check them out at www.coronaservices.net.

  • Yes - Corona is quite good too.

  • We are polling 125,000 interfaces across 2,000 devices. We have one main server and 3 more polling servers. We run the database on a fifth server, a corporate database server running SQL 2008 Enterprise. The database is the only one on that server and is around 20 GB in size. We only store data up to nine months back and events up to 90 days. The main servers are dual core and have 4 GB of memory. We have about 20 people a day accessing the system.

     

    The biggest problem we have is getting the services to stop on all the servers when we need to reboot or upgrade. Also, network-wide reports won't run because they hit the web server timeout value before they complete. This even occurs when running reports directly from the polling server using the console program.

     

    Also, we do NOT run discovery. The discovery tool does not have enough filtering to keep it from adding duplicates to the database.
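
For anyone comparing against the last setup above (125,000 interfaces across 2,000 devices, one main server plus 3 polling servers), here is a rough Python sketch of the per-server load. It assumes the main server also polls and that the load is split evenly across all 4 servers; the post does not confirm either, and the function name `per_server_load` is only illustrative.

```python
def per_server_load(elements: int, devices: int, servers: int) -> dict:
    """Average load per server, assuming an even split (an assumption,
    not something stated in the thread)."""
    return {
        "elements_per_server": elements / servers,
        "devices_per_server": devices / servers,
        "elements_per_device": elements / devices,
    }


# Figures from the post: 125,000 interfaces, 2,000 devices,
# 1 main server + 3 polling servers = 4 servers (assumed all polling).
load = per_server_load(elements=125_000, devices=2_000, servers=4)
for name, value in load.items():
    print(f"{name}: {value:,.1f}")
```

If the main server does not poll, divide by 3 instead of 4 and the per-server figures go up accordingly.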