Below are answers to questions asked during the recent webinar. If you have additional questions, please contact Brian Flynn at brian.flynn@solarwinds.com
Q: You mentioned DisKeeper. What's the thinking about running it on a SAN such as Dell Equallogic? Or for any defrag tool on any SAN...
A: Sorry, I don’t know enough to authoritatively comment on DisKeeper. I’ve never used it myself. I just know of it. I’ve never used a 3rd party tool for this myself. For logical frag, I’ve always used vendor features. For physical fragmentation, I’ve taken the DB offline and forklifted files one by one to a freshly formatted volume. Tedious, but effective.
Q: How can we convert SAN admins from the dark side?
A: Ha ha! Kill ‘em with kindness. A Dale Carnegie class taught me that the way to become interesting to people is to get them talking about what they care about. They could spend the whole time talking, and at the end of it think that you’re a great conversationalist. Likewise, I think people are impressed when you learn about their world. If you went to France, it would behoove you to learn a little French as a sign of respect. They might just be slightly less annoyed that you’ve come to their country expecting them to speak your language, even if they do speak your language. So find out a little bit about what matters to them. Show genuine interest and demonstrate effort to learn what matters to them. Then I think they’ll be more likely to reciprocate
Q: I have trouble convincing my department to add Confio to the mix, as it required yet another database server. We have Solarwinds, though. Does the merger mean that the Confio capabilities will be integrated (as an add-on?) into the Solarwinds installation?
A:We’re not permitted to make promises about specifically what will come in specific versions, but I’ll say that we are working on integration between the products. That said, we’d have to look at your particular setup. You likely could add DPA without buying a new DB server, but I’d need to hear more about your constraints to be more specific.
Q: I have been wondering about table fragmentation for SQL Server. Any reference on the method to approach this?
A: Lots of resources. I’d need to know more about what you want to know but if you want a solution, Ola Hallengren has posted scripts for database maintenance that has been very popular: http://ola.hallengren.com/
Q: SAN vs NAS-- HBAs have usually up to 8gbps-- NAS -- I can have 10gbps connection to storage array-- why it should be the weakest link?
A:Protocol overhead. iSCSI has got a lot of overhead. FC is more efficient. Here’s a good write up on it: www.emulex.com.
Q: Are there any network settings that are preferred for SQL Server to improve network throughput?
A: I haven’t typically heard of this so I can’t claim an authoritative position, but I did a quick search and it seems like there’s nothing specific to SQL Server. It’s probably about the same as any high network traffic. Disable power management options. Use a fixed speed duplex and optimize it with the network switches. If I were tasked to optimize the throughput at the network level, I’d probably construct some load test for the network I’m transmitting across and watch the network utilization. If you can’t get past 50%, you might have a duplexing limitation. You’ll need to be sure that the load test you construct can load the pipe with data fast enough. I wish I had more for you. This is a difficult question and may be worthy of its own presentation.
Q: Is defrag really an issue now given all the tiered storage?
A: Yes, the weight of the reason for it varies. With HDD, having the data contiguous to avoid head and rotation latency is huge. With SSD, defragmenting to eliminate wasted space, thus consolidating data into fewer pages is key.
Q: Which RAID is better for SharePoint DB server?
A: The answer probably varies on your utilization of SharePoint. For workloads with lots of reads and writes, RAID 10, but that may be too cost prohibitive. RAID 5 may be just fine. The answer depends on this: Is what you have good enough? If not, what performance needs to change? Assuming you need redundancy, you’re picking between RAID 5 (or RAID 6) and RAID 10, where RAID 10 will give you better write performance. If you’re fixed on the number of disks you can use, RAID 10 will yield a smaller volume and read performance won’t be as good as the RAID 5, which stripes across more disks because in RAID 10, you forfeit half of your disks to redundancy implementation via a mirror of the other half.
Q: Multiple questions about formatting disks to do IOPS at 64KB as recommended by MS?
A: It’s still a good practice to optimize your LUN to the 64k boundaries.
Q: What tools can be used to measure IO?
A: Quite a few. I think this question came in before I got to it in the presentation.
Q: Can you think of any negative effects of 100% write SAN cache?
A: I’m assuming you mean allocating 100% of your SAN cache to writes. If not, follow up with me and I’ll take another swing, but here goes… If you allocate all of the SAN cache to writes, you may be forfeiting all of the benefit of SAN cache because SQL doesn’t wait on writes to data files. It keeps dirty pages in memory and the lazy writer comes along later to persist that to disk unless a checkpoint command is called to force it. On the other hand, if you’re talking about log files, transactions can become held up because of waiting on writing to the log so SAN cache should be a huge help to transaction time. Overall, if you have no read cache in your SAN… well… you’re just not going to have time to read benefit like it would otherwise… Generally, I like when SAN cache makes my reads faster. I just don’t like the inconsistent performance.
Q: What is the name of the tool to measure IO besides SQLIO?
A: You might mean SQLIOSIM. It’s not really a tool to measure IO as much as a tool to simulate SQL like IO for stability testing.
Q: External fragmentation is considered when next page is before than actual pages. Read Ahead is affected when next page is after actual page, but fragment count is close to total page count? I’m in, all pages are 100% full, next pages are always after actual page but have spaces between pages. The question is, how read ahead works in this scenario?
A: I don’t think you’ll ever see one page of table 1, then one page of table 2, then another page of table 1. You can, however, as I demonstrated, have one extent of table 1, then one extent of table 2, then another extent of table 1. If you have that situation, read ahead can’t help you. Of course don’t take my word for it. Install Process Monitor and see for yourself. If read lengths exceed 64K, you’ve got read ahead. If not, you don’t.
Q: If I have report Call Stored Procedure and do not use any Ad Hoc Query this SP take 0 SEC but the Report take 40 Sec so how can I know the issue or from where this Slow While the SP take 0 Sec
A: I'm not entirely sure what this is asking so I will make some assumptions. If the proc completes in 0 seconds and the report takes 40 seconds, then it's the code rendering the report. Though sometimes, if the result set is huge, you might have a blind spot in your means of measurement. If you tested running the proc on the server and it completes in 0 seconds, but running it from another computer and having the result set transmit over the network, then you may have a network delay. This is a perfect use of Database Performance Analyzer! It will break the query run time down into distinct wait types so you know how much was IO, how much was memory/cpu and how much was network transport. And that should be the end of the conversation on the performance of the proc.
Q: I've got a 24 disk array with 300 gb 15k rpm hard drives in raid 10. The disk I/O is always very low, but my processors (Dual Xeon E5-2665 8 core 2.4 GHz) are always maxed out by the SQL service. Is the only answer to get more processors, or can database fragmentation be a possible cause? I am running Diskeeper on the database server, but I don?t believe Diskeeper handles SQL Server database fragmentation. Also, how do I determine how severe my 293 GB databases file fragmentation is?
A: Difficult to say because sometimes CPU time is another problem masquerading. My first step would be to identify what is causing the CPU. If Task Manager reveals that it's SQL Server, Database Performance Analyzer is a great tool to find the queries causing the CPU load. Find them. Optimize them. You may need to adjust settings that control the thresholds that determine when to parallelize and how much to parallelize. If you see a lot of CXPACKET waits, it's a strong indicator that this is a suspect. You can also use a query hint to limit parallelization. After all of that, if you still have a problem, you may need larger CPUs.
Q: Is the behavior of SSD comparible to Fusion IO Flash Drive?
A: I haven’t seen an SSD drive that’s as good as a Fusion IO Flash Drive, but that’s an answer that may change with time.
Q: Is support for Postgresql on your roadmap?
A: Not at this time. It has been requested. I saw such a request on a Thwack comment recently that was in response to a request for MYSQL support, but no feature request submitted for Postgres. Please post it as a feature request there. Thwack is influencing what we put on our roadmap.
Q: Are there any gains to partitioning a portion of the SAN for only databases and set them to RAID 10?
A: Generally speaking, yes… but! What you do with that RAID 10 array is more important. With RAID 10, you have a high bar to justify the number of disks you need to feed it. I don’t use RAID 10 unless write performance is a problem and with enough server memory and SAN cache dedicated to writes, until you can show that the write performance is holding you up, you may not need it. The harder question is how do you decide? Shooting from the hip here (this needs investigation), I think you might judge that by how often you’re seeing checkpoints which is probably a good indicator that SQL Server needed to free up RAM so it had to persist all of the dirty pages. If that happens, your first resolution might be adding more RAM. Another thing to take into consideration is how many identity fields are you using and are inserts happening on the same table at a very high rate of concurrency? If so, you may see page latch contention and faster writes would help, but you also might stop using monotonously incrementing clustered indexes… I know, I told you they cut down on fragmentation, and they do, but where lots of inserts are pouring into a table, the page latch contention can become a problem because multiple inserts are wanting to write to the same page. If this is your problem you may investigate ways of allocating swatches of clustered keys to the various processes inserting so that concurrent inserts aren’t fighting over a page. Generally speaking, TempDB is a good application of RAID 10 as is any database that sees about as much writes as reads and the IO is high.
Q: On a 16 TB SSD SAN is there any reason to seperate Temp, MDF and Log I/O's
A: Yes, to avoid fragmentation. Or possibly having separate disk queues may help. But, the benefit may not offset some reason you have to put them all on the same volume. Sorry… it depends. 
Q: Does fragmentation really matter if database is primarily OLTP?
A: Sorry… it depends. The simplest answer I can give you is if you’re forfeiting read aheads, then you’re forfeiting performance. How much performance depends on your storage. If your OLTP queries aren’t usually working with more than an extent at a time, it doesn’t matter. If your queries are using data where the data is not going to be contiguous even after defragmenting, then you’re not going to get read ahead anyway.