Planning for a new SolarWinds NPM deployment

Hello,

I would like a recommendation on a hardware plan for a new deployment of Orion NPM. I will be monitoring about 1,000 nodes via SNMP and would like to know the best hardware configuration for that purpose.

Please let me know if you need more information.

Thanks,

Adam

  • abrown@weldedtube.com, I am moving you over to the NPM thread in hopes of getting your recommendation.

    Cheers!

  • Just a comment: stay away from virtual servers. The number of nodes you monitor matters less than the number of interfaces, because Orion counts every monitored node and every monitored interface as a resource. In my company's case we have 2,000 nodes, BUT we have over 150,000 interfaces. What happens in this kind of environment is that the server CPU and the server Ethernet interface get tied up with all the polling. In a virtual environment, which is of course a shared environment, the Orion virtual servers start using up all the shared resources. Go with dedicated 64-bit servers with 32 GB of memory or more and 8 CPUs or more for each of the polling servers.

    The database server is another animal. We're running a Dell R620 with 132 GB of memory and 24 CPUs, SQL Server Standard edition, and Windows Server 2008 R2 Enterprise edition. The Enterprise edition of Windows is important because the OS can't use more than 32 GB of memory with anything less. We have 45 GB of memory assigned to the SQL instance, and Orion is the only instance on the server. Again, with our volume of data transfer, we're using all the SQL resources.
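To put rough numbers on the point about interfaces, here is a minimal sketch of the counting described above. The function names are hypothetical, and the 300-second polling interval is an assumption chosen for illustration, not a value given in this thread or an Orion default.

```python
def total_elements(nodes: int, interfaces: int) -> int:
    """Every monitored node and every monitored interface counts as one element."""
    return nodes + interfaces


def polls_per_second(elements: int, interval_seconds: int) -> float:
    """Average polls per second if every element is polled once per interval."""
    return elements / interval_seconds


# Figures quoted in the reply above: 2,000 nodes and 150,000 interfaces.
elements = total_elements(nodes=2_000, interfaces=150_000)

# ASSUMPTION: a 300-second interval, used only to show the arithmetic.
rate = polls_per_second(elements, interval_seconds=300)

print(f"{elements} elements, about {rate:.0f} polls per second on average")
```

The node count barely moves the total here; the interface count dominates it, which is why the sizing question should start from interfaces rather than nodes.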

  • I don't believe an additional poller install can use more than ~8 GB of memory on the server: all of the processes are 32-bit and so crash when they grow larger than 2 GB. (I have 10 GB free on 16 GB servers with ~8,000 objects on them.) If you stack additional pollers it might be possible to use more memory, but I don't have that kind of install.

    Also, if your SQL server only has SQL Server Standard edition, it will only use 64 GB of memory; the rest can be used by the operating system to cache disk reads/writes but not directly by the SQL Server process: http://msdn.microsoft.com/en-us/library/ms143685(v=sql.105).aspx

  • I have pollers running on 64-bit servers with 64 GB of memory. While the poller may only use 8 GB of memory, it won't crash if there is more. Also, even though SQL Server Standard edition will only use 64 GB, the OS will use more than 64 GB if it is needed. We installed the original database server with the Standard edition of the OS, which only accesses 32 GB of memory, and hit the wall on memory utilization. So we upgraded the OS to Enterprise so it could access more memory. Memory utilization then dropped from 95% to 9% and performance improved. So there is more to the memory issue on a SQL server than just SQL.

    Having said all this, no one should be deploying a server that is bare minimum; that's a recipe for poor performance and pissed-off users. I recognize that some of the stats I mentioned are way high; however, I don't have to go back to the well to get more hardware for quite some time. In other words, future-proof the install. Recognize that the specs SolarWinds recommends are the minimum specs, not the average specs.
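The layered memory limits discussed in the last few replies can be sketched as a chain of minimums. This is only an illustration: the function and parameter names are hypothetical, the 32 GB and 64 GB caps are the ones quoted in this thread, and the 2,048 GB cap for Windows Server 2008 R2 Enterprise is an assumption taken from Microsoft's published limits rather than from anything posted here.

```python
from typing import Optional, Tuple


def memory_ceilings_gb(
    installed: int,
    os_cap: int,
    sql_cap: int,
    sql_configured: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (memory the OS can address, memory the SQL instance can use), in GB."""
    # The OS cannot address more than its edition allows, whatever is installed.
    os_visible = min(installed, os_cap)
    # SQL is limited by what the OS can see and by its own edition cap.
    sql_usable = min(os_visible, sql_cap)
    # An explicit per-instance memory setting can lower the limit further.
    if sql_configured is not None:
        sql_usable = min(sql_usable, sql_configured)
    return os_visible, sql_usable


# Windows Standard (32 GB cap per the thread) on the 132 GB server described above.
print(memory_ceilings_gb(installed=132, os_cap=32, sql_cap=64, sql_configured=45))

# Windows Enterprise (ASSUMED 2,048 GB cap) on the same server.
print(memory_ceilings_gb(installed=132, os_cap=2048, sql_cap=64, sql_configured=45))
```

With the Standard OS, both the operating system and SQL are squeezed into the same 32 GB, which is consistent with the wall described above. With Enterprise, SQL still stops at its configured 45 GB, but the rest of the installed memory becomes available to the OS.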

  • There are many variables to consider when planning your hardware for an Orion implementation. Our implementation is a mix of physical and virtual servers, with current plans to migrate even more to virtual. I shall explain why. The main engine is physical, with 48 GB of memory and 10 CPUs. SQL is a dedicated physical server with 96 GB of memory and 8 CPUs. We also have 4 additional pollers, all virtual, each with 6 GB of RAM.

    To date we've been extremely happy with performance, mostly because I stay on top of it. You MUST tend to your Orion nightly job and use caution when raising data retention values. Make sure the nightly job is completing in a reasonable time, and push back when management asks you to retain 6 months of detailed stats or collect interface statistics at 1-minute intervals! These can be your biggest performance killers. Currently we monitor 3,600 nodes (24,000 elements). Besides polling intervals and data retention values, performance is also affected by the other modules you have purchased and integrated with Orion. We have SAM, VNQM, UDT and NTA, which all contribute to the hit on performance by increasing DB table sizes. I recommend SQL Enterprise, and of course all servers 64-bit.

    We now want higher availability, so despite the obvious challenges with VMs (contention for resources, and VM management not wanting to provision servers at the size you really want), the tradeoff of not having to worry about your hardware can outweigh a blazing-fast server. Another benefit of VMs that I am starting to appreciate, as the main administrator of 5 polling engines and a SQL server, is the ability to take a snapshot just prior to upgrades. The peace of mind of knowing you can quickly return to the pre-upgrade state should things go wrong is reassuring, and it allows me to focus on Orion and not worry so much about the OS or hardware.

    Our SQL server hardware is now out of warranty, and there is a new demand to double our SQL server memory to 192 GB. BIG decision time: do we replace our physical server with dual physical servers (a SQL cluster) or jump on the virtual bandwagon for our SQL? I can say that I'm now seriously considering a dedicated SQL server on a VM, so that I can sleep at night knowing the infrastructure is someone else's worry and I can focus on Orion admin. There are many self-monitoring abilities within Orion (AppInsight and other templates for Orion polling health); these all help in building the case for right-sized VM gear. The recent change to have NetFlow use flow storage was brilliant and gave us breathing room. Thanks to the SolarWinds development folks who keep delivering.

    Our current performance pain point is mapping. We've had to trim down maps, as these were just too intensive. Sorry about the size of my comment; it's a topic of high interest for me.
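The warning about retention and polling intervals is easier to argue with management when it comes with arithmetic. The sketch below is a simplification: it assumes one detail row per element per sample, the function name is hypothetical, and the 10-minute, 7-day baseline is an illustrative assumption rather than an Orion default. Only the 24,000 elements, the 1-minute interval, and the 6-month retention come from the reply above.

```python
def detail_rows(elements: int, interval_minutes: int, retention_days: int) -> int:
    """Rows kept if each element stores one detail row per polling sample."""
    samples_per_day = (24 * 60) // interval_minutes
    return elements * samples_per_day * retention_days


ELEMENTS = 24_000  # element count quoted in the reply above

# ASSUMPTION: a baseline of 10-minute statistics kept for 7 days.
baseline = detail_rows(ELEMENTS, interval_minutes=10, retention_days=7)

# The request to push back on: 1-minute statistics kept for about 6 months.
requested = detail_rows(ELEMENTS, interval_minutes=1, retention_days=180)

print(f"baseline:  {baseline:,} rows")
print(f"requested: {requested:,} rows ({requested / baseline:.0f}x the baseline)")
```

Under these assumptions the requested settings hold a few hundred times as many detail rows as the baseline, which is the kind of table growth that makes the nightly maintenance job run long.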

  • Hi epenney. I sent you a friend request because I wanted to PM you a question following up on what you said about mapping.

    Basically, I'm new to NPM and am experiencing significant delays using Network Atlas (non-locally). I also just started a question about this, if you could shed some light on it for me.