
Performance and Disk I/O questions 2

Status
Not open for further replies.

Ovatvvon

Programmer
Feb 1, 2001
1,514
US
Hello all...

This is going to be a performance question. I'll start with the question, and then move into the description of our situation.

I'm wondering how many disks I will need to add to a RAID 10 array (given our setup) to solve the problems we're encountering with Disk I/O contention. (I'm not sure how much I/O can be resolved by adding another disk, or two disks, etc, to an array.)


Presently we have a server where one client is running a Proof of Concept / PILOT. There are a bunch of other clients on this box at this time; however, the majority of the I/O is coming from this one PILOT application.
Code:
[b]Server Description[/b]
Windows Server 2003 Standard (x64) w/ SP1
SQL Server 2005 Standard (x64) w/ SP2
CPU: 4-Way, Dual Core  (total: 8 cores) @ 3.16 GHz
Memory: 8 GB 
User Databases (MDF Files) on E:\ drive
E:\ drive is a RAID 10 : 8 disks total @ 15k rpm.
User Databases (LDF Files) on G:\ drive
G:\ drive is a RAID 10 : 4 disks total @ 15k rpm.
TempDB MDF File on I:\ drive
I:\ drive is a RAID 1 : 2 disks total @ 15k rpm.
TempDB LDF File on K:\ drive
K:\ drive is a RAID 1 : 2 disks total @ 15k rpm.
(FYI: C:\ is on a RAID 1 over 2 disks, and that is partitioned with F:\ which contains the SQL Server installation files (15k rpm).  All Database Backups are run to the J:\ drive, which is a RAID 5 spread over 4 disks @ 10,000 rpm.)

Our Data drive (E:\) gets the following I/O without this PILOT running:
Average: 1.3 per second
Peak: 33.8 per second
When the PILOT is running, that raises to:
Average: 36.3 per second
Peak: 380.9 per second
When the PILOT is running and running reports, it goes to:
Average: 43.8 / second
Peak: 484.3 / second

As you can see, the average rises from 1.3 / second to 43.8 / second, and the peak goes from 33.8 / sec to 484.3 / second. Huge jump!

Our Log drive (G:\) without the pilot running:
Average: 4.8 / sec
Peak: 620.5 / sec
When the PILOT is running with Reports being generated:
Average: 26.1 / sec
Peak: just under 800 / sec

Again, a decent jump here as well. The peak was already pretty high, but this data sample was only taken over about 30 minutes, and the application was idle for only 5 of those, so there was not much of a window to collect a large sample. (So I'm not sure how accurate the numbers are... this is just a rough estimate.)

Conclusion
We have seen major response-time issues on this server. (No surprise.) As they move into another environment (i.e. Q.A.), I would like to put them on their own server, and perhaps get more spindles on the arrays. The problem is I don't know how many spindles should be added to compensate for the amount of I/O there is.

Does anyone know how much I/O can be alleviated by adding each additional disk to a RAID 10?
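There's no single answer, but the usual back-of-envelope math can be sketched in a few lines. Every figure here is an illustrative assumption, not a vendor spec: a 15k rpm drive is commonly rated somewhere around 150 random IOPS, and RAID 10 costs roughly 2 physical IOs per logical write. Check the real numbers with your drive vendor.

```python
import math

def spindles_needed(read_iops, write_iops, iops_per_disk=150, write_penalty=2):
    """Rough disk count needed to absorb a workload on RAID 10.

    Each logical read costs 1 physical IO; each logical write costs
    `write_penalty` physical IOs (2 on RAID 10, since both mirror
    halves must be written).
    """
    physical_iops = read_iops + write_penalty * write_iops
    disks = math.ceil(physical_iops / iops_per_disk)
    return disks + (disks % 2)  # RAID 10 pairs disks, so round up to even

# Peak of ~484 transfers/sec on the data drive, assuming a 70/30
# read/write split (another assumption -- measure yours in Perfmon):
print(spindles_needed(484 * 0.7, 484 * 0.3))  # -> 6
```

So under those assumed numbers even the peak load fits comfortably inside the existing 8-disk array, which is worth knowing before buying more spindles.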





-Ovatvvon :-Q
 
I'm not sure how much I/O can be alleviated by adding another disk, but you could alleviate some I/O by adding another data file on another disk. The end result would be a separation of your table I/O from your index I/O. There would also be a concurrency gain from SQL Server's multithreaded architecture: it will use one thread per database file to perform concurrent I/O operations.

- Paul [batman]
- If at first you don't succeed, find out if the loser gets anything.
 
You will need to check with your hard drive vendor as to how much IO each drive accounts for.

You may get better drive performance by changing from RAID 10 to RAID 5. With an 8-drive RAID 10 you only get the performance of two hard drives. Look at the image on this link, but only look at the top two rows, as the image is for a 16-drive RAID 10 array.

If you move to a RAID 5 array you get the performance of 7 of your drives. Look at the image on this link, but extend it out to 8 drives instead of 5.

For the log drive you might want to switch to a RAID 0+1 array. This would give you the IO performance of two drives instead of the one that you currently have.

I know that our main OLTP system here will exceed 3,500 IOs, and we have 15 SAN drives behind it (set up in 3 RAID 5s which are then set up in a meta LUN, which is kind of like striping the RAID 5 arrays). Back when it was in the 2,000 IOs range we only had 5 SAN drives behind it (in a RAID 5) and it was fine from a performance point of view.

Have you accounted for a hot spare in your config?

Is this local storage, SAN storage, etc?

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
Hi Paul,

That is a good idea. I will keep that in mind depending on what happens with everything!


Hi Denny,

While the site's diagram for 0+1 looked accurate, I think its layout for RAID 1+0 (10) may not be - at least from what I can tell.

Comparison:
RAID 10 = stripe of mirrors. Performance (I believe) should be equal to that of all disks, since I think reads can be serviced from either side of each mirror in the array. If I'm correct, performance would draw on all 8 disks for my data drive and 4 disks for my log drive. If that's not the case, then performance would be limited to half the drives, since every second disk is a mirror copy - in which case I'd have 4 disks for my data drive and 2 disks for my log drive.

Either way, while RAID 5 should have very good read performance, it is supposed to be slower on writes than RAID 10, due to the parity that has to be written on top of the normal data writes. RAID 5 does tend to be more convenient and effective for reliability, but runs a bit slower (from everything I have heard and researched) than RAID 0+1 or 10. I could be wrong, but I've also heard that performance is virtually the same for RAID 10 and 0+1, while 10 is supposed to be more reliable (i.e. have a greater chance of surviving multiple drive failures).

Again, I could be wrong in this (I'm not a storage expert, and I'm only going off what I heard and could find on the internet, like at the link above). If I'm wrong in all this, please correct me. (I have no problem with correction - I like to learn. Better to learn than to be wrong!)

If all the above is true, then I think they'll probably stick with RAID 10, but I'd need to know how many more disks we'd need to add to effectively lower the I/O contention we're experiencing. I just don't know how to figure out how many more disks we'd need.
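The RAID 5 vs RAID 10 trade-off above comes down to the standard write-penalty arithmetic, which can be sketched like this. The ~150 IOPS/drive figure and the 70/30 read/write mix are illustrative assumptions; the penalties (2 physical IOs per logical write on RAID 10, 4 on RAID 5 because of the read-modify-write of parity) are the textbook values.

```python
def effective_iops(n_disks, disk_iops, read_frac, write_penalty):
    """Usable host IOPS from an array for a given read/write mix.

    Each logical read costs 1 physical IO; each logical write costs
    `write_penalty` physical IOs, so the array's raw IOPS are divided
    by the average physical cost per logical IO.
    """
    raw = n_disks * disk_iops
    cost_per_io = read_frac + (1 - read_frac) * write_penalty
    return raw / cost_per_io

# 8 disks at an assumed ~150 IOPS each, 70% reads:
print(round(effective_iops(8, 150, 0.7, 2)))  # RAID 10 -> 923
print(round(effective_iops(8, 150, 0.7, 4)))  # RAID 5  -> 632
```

The more write-heavy the mix, the wider that gap gets, which is why write-heavy databases usually favor RAID 10 and read-heavy ones can do fine on RAID 5.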

Oh, also, to answer your other question: this development box is configured with internal drives (and an external expansion chassis for the arrays). The production server(s) may or may not be clustered and on SAN, depending on which route I decide to take - which I'm still trying to figure out. I know the SAN can handle a lot more I/O (although that is RAID 5) due to the increased number of disks (I think ours are in the ballpark of 30 to 40 disks), but it also has a lot more servers on it.

Thoughts? Ideas? Corrections? I'm all ears! Thanks a bunch for your help! :)



-Ovatvvon :-Q
 
From what I understand of RAID 10, you don't get access to the mirrored drives (just as in a RAID 1 array you only get the performance of a single drive). As this is 1+0, you should only see the performance of half of the drives.

While RAID 5 is slower on writes, most queries against a database are reads (usually in the realm of 90%+, depending on the application and database design).

Check with the drive manufacturer for how many IOs each drive can handle.

SAN storage can be configured at any RAID level that you want. Almost every SAN out there supports all the standard RAID configurations (1, 3, 4, 5, 6, 1+0, 0+1, etc.) as well as some much higher-end options such as 3+0, 5+0, 6+0, etc.

SANs also support higher throughput to the disk, thanks to the Gig+ fibre backend that connects the server to the SAN and the all-fibre setup within the SAN.

You can also greatly increase write performance on a SAN by adjusting the write cache settings for the SAN.

Do you know what kind of SAN it is?

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
While the logs are on a separate drive, have you looked at moving non-clustered indexes to a separate filegroup? This would give you a performance gain in that all of the writes for your indexes would be moved off of your data drive.




Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
Hey Paul, we haven't looked into moving the indexes off in this particular situation, but we have done that in the past. That would mean adding additional drives, which we may also do if necessary. Thanks!


Denny,

I'm learning a bit more about SAN; this is good. Thanks for explaining the drive availability for reads in a RAID 10. I actually just learned that they will have a few SANs with RAID 10. I'm pretty sure our new SANs are EVA 8000s. This particular client is writing a lot more than most apps: it is a data-collection application that monitors pretty much anything traveling over our entire infrastructure's network and logs it in the database, where it is retained for up to 5 days (or 60 days, depending on the data type). So write performance is important for this client. I also just heard the SAN can handle something like 7,700 IOs per second. That may just be the better option after all, as long as they monitor the IOs and don't put too many servers on the EVAs this server will be attached to.

I also heard they're testing a "64K offset" (aligning partitions on a 64 KB boundary), which is supposed to have significant performance gains over the standard 63-sector offset. So, we'll try some testing with that as well. Do you have any experience with that?
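The arithmetic behind partition alignment is simple enough to sketch. The old Windows default starts the first partition at sector 63 (63 x 512 bytes), which never lands on a 64 KB boundary; the 64 KB stripe/cluster size here is an assumption for illustration, so substitute your array's actual stripe unit.

```python
SECTOR = 512
DEFAULT_OFFSET = 63 * SECTOR   # 32,256 bytes: the legacy partition start
ALIGNED_OFFSET = 64 * 1024     # 65,536 bytes: a 64K-aligned start
STRIPE = 64 * 1024             # assumed stripe/cluster size

# A 64 KB IO issued against the legacy offset straddles two stripe
# units, turning one logical IO into two physical ones; an aligned
# partition keeps each IO inside a single stripe.
print(DEFAULT_OFFSET % STRIPE)  # 32256 -> misaligned
print(ALIGNED_OFFSET % STRIPE)  # 0 -> aligned
```

That straddling is where the large alignment gains people report come from: in the worst case a misaligned volume does twice the physical IOs for the same workload.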



-Ovatvvon :-Q
 
Yes, aligning the drive when creating the partition can give up to a 40% improvement in disk performance.

If the SAN itself can only handle 7,700 IOs a second, that is actually a very low number. I'd have them double-check it.

Even a write-only database would have to do a whole lot of writing to disk to max out the IOs of a SAN.

I've got a database which logs every network connection from the LAN to the Internet and from the Internet to the LAN. It writes several gigs of data per day from the routers and has minimal IO requirements.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
7,700 was just a number I heard. I don't work in that department, so - shows what I know. I heard back from one of the storage managers, and apparently the EVA 8000s can handle up to 210,000 IOs. Who knows if that is practical or only theoretical, but I'm sure the actual throughput is much, much higher than 7,700 / second. I gather our disk groups contain either 44 or 48 disks each. SAN is looking better now, so long as they monitor overall I/O among all the servers connected - which I'm sure they do.

This particular app has two databases: one logs 2 GB per day, and the other about 22 GB per day - all in all, about 24 GB / day for this one application.

Anyway, this all looks like it will probably work out... I just wish we had Dev and QA servers on the SAN for testing, but we don't. We'll see how it goes.

Thanks for all your insight / advice!


-Ovatvvon :-Q
 
No problem.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
Certainly doesn't look like you're really using the disks much, judging from the numbers you posted. For example, I just looked at one of my old servers and it's puttering along at 1,500 I/Os per second right now, which is a very slow time of day...

Post the LogicalDisk and PhysicalDisk % Disk Read Time, % Disk Write Time, and % Disk Time counters from Perfmon. If the logical or physical disk isn't operating at >90% total utilization, then disk isn't the problem.

You're far below any i/o per second worth worrying about.
 