Hello,
This is my first post, and I'm new to both Solarwinds and SQL. We are in the process of building the SQL server, and I need to know whether Solarwinds supports running the SQL database on a SAN.
Thanks!
I'm running NetFlow, Orion, and NCM on the same DB, as well as some other apps' DBs. It is 4 proc/8 core with 32 GB RAM (only using 8-10), and CPU usage is pretty low. I am connected to a SAN, and we did run into problems a while ago, but they were resolved by simply configuring the SQL DB memory to utilize AWE.
Test this and you will probably see marked improvements.
Sure does. As far as the SQL server is concerned, the disk the DBs sit on is local, so Solarwinds couldn't care less.
Thanks! This is my first install, and we are putting the SQL DB on a dedicated server; the question of local disks versus SAN disks was one issue. Also, we are stuck at this time using VMware Server for the polling platform...
Keep your SQL off your Orion boxes like you have planned. Also, no VM for the SQL server, but it is fine to run the polling engines on VM. You're in the right place to ask these questions. Good luck!
Was told recently this is still a no-no by SW. ??? -Debbi
Thanks for the reply!
What we have found when people run on SANs is that they tend to run with a very low disk write capacity. Orion writes a lot of data to disk, and with that configuration they get poor performance, and there's not much we can do to help.
So it's possible to run on a SAN, provided that the SAN is configured appropriately. We are currently working on some metrics to help users gauge whether their SAN can handle Orion.
I am currently monitoring over 24,000 elements balanced over 4 pollers, all on VMs. The main poller is also running NetFlow with over 2000 interfaces being monitored. The SQL server is on a dedicated box and the DB lives on the SAN. With this configuration I have not run into any performance issues with the SQL server or the amount of IO that is being sent to the disks.
aliendan: Are you using Cirrus? I just took over administration of our Orion implementation. We have been having extreme performance issues, and I am looking at recommending to my boss that we put the SQL db on a SAN. Solarwinds support recommended not doing this because of some sort of DNS issue (too many lookups), which makes no sense to me.
Our setup: Maybe 4,000 elements, 2 pollers that are each running on a dedicated server, Cirrus running on a dedicated server (but using same SQL db server), SQL db running on a dedicated server. On the SQL db server, I noticed that the disks were always pegged, with the avg disk queue length bouncing between 400 - 700. Discovered that this server was running RAID5 which is "strongly" discouraged per the Orion Admin guide.
I have converted the SQL db to RAID10 and seen improvement but not nearly enough. Without Cirrus running, the avg disk queue length is between 100 - 150. Once the NCM service is turned up on the Cirrus box, the queue length goes to 500 - 600 and the Orion web app starts hanging again.
The SQL db server is a Dell 2950 with 6 SATA disks in RAID10 and a PERC 6i RAID controller. The OS runs on the same array as the db.
I could really use some advice on what sort of hardware this SQL db should be running on. From everything I have read, we are throwing plenty of horsepower at it. From what I've seen so far, the disks on the SQL server still appear to be the bottleneck.
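For anyone reading along, here is a rough way to put the queue length numbers above in context. This is only a sketch built on a commonly cited rule of thumb (a sustained average disk queue length above about 2 per spindle suggests the disks are the bottleneck). The threshold and the function names are assumptions for illustration, not anything from Solarwinds or perfmon, and behind a RAID controller or SAN the real spindle count may not match what Windows sees.

```python
# Heuristic only: a sustained average disk queue length above roughly
# 2 per spindle is a commonly cited sign of a disk bottleneck.
QUEUE_PER_SPINDLE_LIMIT = 2.0  # assumed rule-of-thumb threshold


def queue_per_spindle(avg_queue_length, spindles):
    """Normalize a perfmon queue-length reading by the number of disks."""
    if spindles <= 0:
        raise ValueError("spindles must be positive")
    return avg_queue_length / spindles


def looks_disk_bound(avg_queue_length, spindles, limit=QUEUE_PER_SPINDLE_LIMIT):
    """True when the normalized queue length exceeds the heuristic limit."""
    return queue_per_spindle(avg_queue_length, spindles) > limit


if __name__ == "__main__":
    # Low end of the RAID10 ranges quoted above, on the 6-disk array.
    readings = {
        "RAID10, NCM stopped": 100,
        "RAID10, NCM running": 500,
    }
    for label, queue in readings.items():
        ratio = queue_per_spindle(queue, spindles=6)
        bound = looks_disk_bound(queue, spindles=6)
        print(f"{label}: {ratio:.1f} per spindle, disk bound: {bound}")
```

Even the lower reading is far past the heuristic limit, which is consistent with the disks being the bottleneck rather than CPU or memory.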
In response to warbird,
Yes, I am running Cirrus too. Cirrus has about 4400 nodes that are backed up nightly. Our Cirrus DB goes back to the same SQL instance and LUN that production Orion sits on. For my company's use of Cirrus, very little of the app is used during normal business hours. 99% of the work is done at night through backup jobs that spread the backup of all the nodes over about 10 hours. This helps to minimize the amount of load on the server and the disks. What do you have running on Cirrus? Are you doing a full inventory?
Are your disks 15K or 10K? And how much cache does the controller have? You should have most of the cache dedicated to writes.
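To illustrate the idea of spreading the nightly backups described above, here is a minimal sketch that spaces 4400 jobs evenly across a 10-hour window. It is not how Cirrus/NCM schedules jobs internally; the function name and the 8 PM start time are hypothetical, chosen only to show the arithmetic.

```python
from datetime import datetime, timedelta


def staggered_start_times(node_count, window_start, window_hours):
    """Return one evenly spaced start time per node across the window."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    interval = timedelta(hours=window_hours) / node_count
    return [window_start + i * interval for i in range(node_count)]


if __name__ == "__main__":
    # Hypothetical window: 10 hours starting at 8 PM.
    start = datetime(2024, 1, 1, 20, 0, 0)
    times = staggered_start_times(4400, start, 10)
    gap = (times[1] - times[0]).total_seconds()
    print(f"{len(times)} jobs, one every {gap:.1f} seconds")
    print(f"last job starts at {times[-1]:%H:%M:%S}")
```

Spacing the jobs this way keeps the load on the SQL server and disks roughly constant through the night instead of arriving in one burst.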
aliendan:
The disks are 7200 RPM, 750 GB SATA disks. There are 6 disks total in the array and the OS 'lives' on the same array as the SQL db.
The controller is a PERC 6i w/256MB cache. I currently have the 'read policy' set to "Adaptive Read Ahead" and the write policy set to "Write Back". Here's the kicker... running 'perfmon' on the Windows server shows the 'avg write disk queue length' is rarely exceeding 10; however, the 'avg read disk queue length' normally bounces between 100-150. When I turn up Cirrus, the read queue bounces between 300-400, and the web app slows down unacceptably.
I am still uncertain what Cirrus is doing that would have this effect. In the Orion web app, we keep track of the 'last 5 config changes', and it complains it cannot log into Cirrus when I have Cirrus unavailable. Perhaps that module is constantly running through the cirrus db or something. I'll be researching that later.
I am currently researching moving the SQL db's to one of our SANs. It is an EVA 8000 with more than 200 disks. Even though it is used for other applications/db's, I have been told it should be up to the task of running Orion's SQL db. Any advice/thoughts on this?
njoylif:
Thanks for the tip on AWE. I will look into this.
Ishikawa:
Thanks for the tips. As noted above, our avg write disk queue length rarely exceeds 10. The avg read queue length, however, appears to be the problem. I am still struggling to figure out why it is the read queue and not the write queue that backs up.
What has become apparent is that we are definitely exceeding the capabilities of six 7200 RPM SATA drives and/or this raid controller. The memory and CPU on the SQL server are never taxed much. I would appreciate any thoughts you have on the EVA 8000 SAN I mentioned above.
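As a back-of-envelope check on that conclusion, the sketch below estimates the random IOPS a small array can deliver. The per-disk IOPS figures and RAID write penalties are common planning assumptions, not measurements from this server, and the 50/50 read/write mix is purely illustrative.

```python
# Rough, rule-of-thumb estimate of random IOPS for a small disk array.
# The per-disk figures and write penalties below are common planning
# assumptions, not measurements from this thread.
PER_DISK_IOPS = {7200: 75, 10000: 125, 15000: 175}  # assumed typical values
WRITE_PENALTY = {"raid10": 2, "raid5": 4}  # back-end I/Os per host write


def effective_iops(disks, rpm, raid, read_fraction):
    """Estimate host-visible random IOPS for a given read/write mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    raw = disks * PER_DISK_IOPS[rpm]
    write_fraction = 1.0 - read_fraction
    # Each host read costs one back-end I/O; each write costs the RAID penalty.
    return raw / (read_fraction + write_fraction * WRITE_PENALTY[raid])


if __name__ == "__main__":
    for raid in ("raid5", "raid10"):
        est = effective_iops(disks=6, rpm=7200, raid=raid, read_fraction=0.5)
        print(f"6 x 7200 RPM, {raid}: ~{est:.0f} IOPS")
    # Same layout with 15K disks, for comparison.
    est = effective_iops(disks=6, rpm=15000, raid="raid10", read_fraction=0.5)
    print(f"6 x 15K RPM, raid10: ~{est:.0f} IOPS")
```

Under these assumptions, six 7200 RPM disks deliver only a few hundred random IOPS even in RAID10, so a database that is also sharing the array with the OS can saturate it quickly.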
Warbird, have you tried turning off node monitoring in Cirrus (NCM)? We have seen a serious performance impact on our solution when running node monitoring in Cirrus. This functionality is best left to NPM.
/heZ
Hello again. I wanted to follow up on this. I have moved our SQL db to a SAN and things are working extremely well. Granted, it is a very 'beefy' SAN, but still, the performance is amazingly better. The only problem I can see with doing this is the risk of overloading your SAN. In other words, it is hardware dependent and also dependent upon your layout/situation.
I am even running NTA now, which was unheard of before. They are looking at a strange memory error in NTA but once that is resolved, we should be good to go.
Thanks again for asking this question and for all the suggestions/answers!