My old laptop is faster then my PROD :) 4

evilDBA · Oct 4, 2007

Take the script I provide and execute it several times on any database. Take the best result (it returns the elapsed time in ms)

On my old laptop I get about 600ms. On the best PROD server I had 430ms - but it was with a standard RAID drives.

On any SAN EMC system it was a real nightmare. 800, 900, in some cases 1800ms!

Yes, I know why: because SAN EMC and other companies like it have a very good marketing. They manipulate with throughput numbers, but for SQL server, especially for the OLTP systems, reactivity is much more important.

They claim that 'our IO systems are so smart, that you can put LOG and FILES on the same drive'. And usually, there is no choice.

People who buy servers are hosting listen to that marketing blah-blah-blah and we, DBAs, need much later deal with high Disk Queue values.

If you can , provide the execution times of my script here. May be you have a good IO system and I was just so unlucky?

---------------------

set nocount on
GO
create table _Ptest (n int identity, k int not null, v varchar(128))
GO
insert into _Ptest (k,v) select 1, 'this is a test'
GO
declare @t datetime, @n int
set @t=getdate()
set @n=16
while @n>0 begin
insert into _Ptest (k,v) select k+n,v from _Ptest
set @n=@n-1
end
checkpoint
select datediff(ms,@t,getdate())
GO
drop table _Ptest

mrdenny · Oct 5, 2007

They claim that 'our IO systems are so smart, that you can put LOG and FILES on the same drive'

I've never heard anyone from EMC say anything of the sort. It is always keep separate IO separate. (This can change slightly is you are using a Symmetric, but with the Clariion line this holds true).

When I run this against my fairly busy SQL 2005 database server I get a range of times from 733-1766. This is on EMC storage, but we host our servers at a hosting company and don't have any control over how the LUNs are setup or what else is going on with the LUN.

One reason that I can see your laptop being faster than the production server is that the laptop probably doesn't have anything else hitting the hard drive while the query is running, while the server has the production load running against it.

There are several things which will effect how the disks perform.
1. What else is on the LUN?
2. How many disks are in the RAID Group, and what is the RAID level?
3. What other LUNs are in the RAID Group, and what are they doing (For example, Exchange and SQL shouldn't ever be placed in the same RAID group).
4. What's the block size set to?
5. Was the volume aligned corrected when it was created?
6. What is the current disk queue that the OS is reporting?
7. What is the current queue that the SAN is reporting?

High end storage is not a fix all. It has to be laid out correctly, with thought put into where and how the LUNs and RAID Groups are setup. If the person / people who are setting up the storage do not want you the DBA to be involved in the storage setup you need to talk to your manager and get your self involved.

This article may provide you with some insight.

http://searchsqlserver.techtarget.com/tip/0,289483,sid87_gci1262122,00.html

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]

http://www.mrdenny.com

evilDBA · Oct 5, 2007

Thank you, I'll read the article.

It would be interesting to know the best execution time on your PROD, say, during the calm night time. I bet it will be around 600-700ms - still worse then my laptop

Quick question. What do you mean by separation of a LOG and DATA on a SAN disk? Say, if I have L: and M: both assigned to the same SAN, does it make any sense? Usually they are just different LUNs, but the same physical device, the same physical channel, the same driver queue, hence, it is just an illusion of DATA/LOG separation.

mrdenny · Oct 5, 2007

We really dont have much of a calm period. I will try to remember and VPN in and run your code again.

The two drives should be assigned to the same SAN. However they should be created on two different RAID groups. The data volume on a RAID 5 or RAID 0+1 (depending on your throughput needs) and your log volume on a RAID 1 or 0+1 (depending on your throughput needs).

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]

http://www.mrdenny.com

jsteph · Oct 5, 2007

Evildba,
I've heard similar things--that you can treat the SAN as a black box (which it is, in our case, with a shark fin atop)--and never have to worry about carving out separate tablespaces or files for indexes, etc.

But our prod server with a raid array shows fully 4 times faster in several disk i/o benchmark tests than our dev server, which is connected to the SAN via twin gigabit hba cards.

In your test though, the prod server returned consistently in the high 400's, while our dev server did consistently mid 1400's. Prod was at the time chunking along running our ISA db, which is constantly writing at a rate of about seven or eight 3-KB records per second (about a gig per workday).

I think the SAN is good for file storage, backups, etc, but for databases, I prefer a set of local platters to a SAN for the time being.
--Jim

mrdenny · Oct 6, 2007

If you SAN is running that slow, it probably isn't configured correctly. I've had SAN volumes which can function easily with 10,000 IOs per second. SAN volumes with enough spindles can handle IOs much higher.

It's all about having enough disks, of the correct type, in the correct RAID group, shared with the correct other LUNs and services.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]

http://www.mrdenny.com

evilDBA · Oct 7, 2007

>It's all about having enough disks, of the correct type, in the correct RAID group, shared with the correct other LUNs and services.

But I guess no matter how you configure it, SAN will always have longer 'roundtrip' time to disk in comparison to a local disk. Reactivity will be lower - and it is critical to the LDF files.

mrdenny · Oct 8, 2007

How can you claim that the SAN will have a longer round trip? Local storage is copper to the disks which have a maximum potential throughput of say 480 Mb. SAN has an all fiber backbone which is 1 Gig fiber from the HBA to the switch (at least), and at least a 1 Gig fiber link from the switch to the storage. In a typciall setup it will be more like this:

Dual 1 Gig links from the host (server) to the switchs.
Each switch has a 10 Gig backbone within the switch.
Each switch has 4 2 Gig links from the switch to the storage giving each switch 8 Gigs of bandwidth to the storage.
The number of fiber loops from the SP (head) to the DAE (drive shelf) depends on the model.
CX200, CX400 and CX600 have a single 2 Gig Loop from the SP to the DAE. The CX300, CX500 and CX700 have two two gig loops from the SP to the DAE. (Older units will have 1 Gig loops, and at some point the new line will move to 4 Gig loops.)

While the total number of feet to travel may be longer the max potential throughput is much higher. You would have to travel miles over fiber to get latency which comes close to being where SCSI runs normally.

If your storage array is running the way that you say it is, your company should considder bringing in a storage expert from a consulting company and have them take a look at the storage.

Some things which they would look at are:
Your Raid Group may be over commited.
Your fibre may be at capacity.
Your LUN(s) may be requesting more IO than the Raid Group that it exists in can provide.
Your LUNs may not be ballanced between the SPs.
Your volumes may not have been alligned correctly when they were created.
Your LUNs may not be on the correct RAID type.

If these aren't setup correctly then you will experience extra latency and performance issues.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]

http://www.mrdenny.com

evilDBA · Oct 8, 2007

>While the total number of feet to travel may be longer the max potential throughput is much higher.

Ha ha!
All SAN guys are always talking about *throughput*, and never - about the reactivity. For the OLTP systems we need reactivity, not the throughput.

It is like using a 18-wheeler (SAN) because it can carry heavy load and has enourmous hoursepower (throughput). While a race car can in fact, has a lower hoursepower then 18-wheeler, 18-wheeler will always eat the dust.

SAN in OLTP environment is like 18-wheeler in a stop-and-go traffic. And OLTP system is exactly stop-and-go!

What is a difference between SQL server and all other applications? While all applications write to disk asynchronously, while SQL server is writing to LOG, it do it synchronously, it is waiting for a signal 'it is ready' from a device.

Yes, I am aware that devices have their internal caches and can continue internally. But consider this.

1. I am writing a lot of data to SAN
2. I commit a small transaction in SQL server

1 and 2 will be put into the same driver queue (on NT side) to be transfered to SAN. Even SAN can put everything to disk in background, it wont report 'success' to (2) until it get all data from (1). During all that time, SQL server transaction log is locked.

mrdenny · Oct 8, 2007

Something to remember about SANs is that when you write to the disk, you aren't actually writing your data to disk. It's going into the large buffer cache and stored in memory on the SP, and is then flushed to disk. If your write cache on one SP is running full all the time then you may need to look to moving some LUNs to the other SP.

Think of a massive site like MySpace. Running hundreds of databases with half a billion transactions per minute. All of which is running on SAN storage using huge amounts of cache and huge numbers of disks shared across all the disks on the SAN (not using EMC, so the config is slightly different). (I was a DBA there and these figures are accurate.)

When SANs will also help with the stop and go traffic (and I like the analogy) is that in addition to SQL Server trying to do read aheads to cache data before you actually need it, the EMC will also prefetch data from disk into it's memory so that when the SQL Server wants to access it it's in memory and the SAN doesn't have to go to disk to get the data.

If your RAID Group is setup with 3 high load OLTP database servers odds are the RAID Group will not be able to handle the requests. When setting up SANs it's important to think not just in storage capacity, but also in IO capacity.

Let's assume that we have 3 database servers that we have to deal with. We request RAID 0+1 storage for them so the SAN guy should know that we need fast storage. He has a RAID 0+1 RAID Group with 8 disks in it. If each disk can handle 160IOs for Sequential IO, and 120IOs for random IO (don't quote me on those numbers) we should be able to handle 480 IOs per second on the RAID Group. Assuming that we have 146 Gig drives we have ~584 of storage in the group. If we request 100 Gig drives for each our SAN guy goes, "Hey those will all fit in this group". Now the problem becomes when we go to put the load on the RAID Group, we now have to put 700 IOs on our RAID Group which can only handle ~480 IOs. Our storage is now over subscribed on IOs, and the storage looks slow because it can't keep up. But that's not the fault of the storage, it's the fault of the guy who decided to put to much load on to the RAID Group.

When setting up a high volume SAN you don't get to use all the Megs available on the RAID Groups. Sometimes you have to waste space in the RAID Group in order to get more performance for the LUN.

Now, assuming our 3 LUNs have requirements like this.
Server1-250 IOs
Server2-250 IOs
Server3-200 IOs

All three servers should have there LUNs placed in separate 4 disk 0+1 RAID Groups to get the needed IO capacity. While you are waisting some space, you get the IO that you need. Some things that you can do with the left over space would be things like low IO file shares, Quorum drives for clusters, etc. Things that need some space but not a lot of IO requirements.

I'll bet that if you look at the RAID Group which has the poorly performing LUN in it you will see that the RAID Group has other LUNs in it which also need a lot of IO and you are maxed out (this is the most common cause of disk slow downs in a SAN environment).

This can all be seen in the Analyzer that comes with the EMC (you may need to purchase a license to use it). You can also look and see how far the drive heads have to travel to get to the next block of data. If they are traveling to far, you may need to redesign your physical database layout and move some objects to different file groups on different hard drives. I know that when I did this I saw a 20-30% performance increase on the system.

This gives you some benefits at a few levels.
1. You get more disk queues in Windows.
2. The SAN will have an easier time cashing the data from disk into it's memory as the data is broken out to a different disk, so it can see the difference.
3. SQL Should allocate it's space differently because the data is coming from a different file group, so data will be cached differently.

Of course all this is on top of the normal index tuning that should be done every once and a while.

Have you looked at sys.dm_io_pending_io_requests to see when IO is pending? Or the sys.dm_io_virtual_file_stats to see how much waiting for the disk is happening?

BTW, I'm not a SAN guy, but a DBA with some SAN experience.

SQL Server isn't the only software platform which uses logs. Exchange does as well, it's using the Jet engine, and most large companies use SANs for Exchange as well.

Can you elaborate more on what you are seeing and how you are diagnosing the issue when you say:

1. I am writing a lot of data to SAN
2. I commit a small transaction in SQL server

evilDBA said:
All SAN guys are always talking about *throughput*, and never - about the reactivity.

Not all SAN guys talk about throughput. That's what the marketing guys and sales guys talk about. Talk to the the engineers that design and build the units (I highly recommend a trip to EMC World next year). They talk mostly about how to get the storage to respond to the devices as quickly as possible to get on to the next request.

What EMC product do you have, and what do your RAID Groups and LUNs look like? (Maybe we can spot some issues right off the bat.)

If you have a really top notch SAN guy (which I had the benefit of having a while back) and they are willing to work with the DBA you can get some truly amazing performance out of the unit. It will however take some tuning, trial and error, and cost a little more per meg do to the wasted space, but it can be done; in just about any environment.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]

http://www.mrdenny.com

AlexCuse · Oct 8, 2007

Fantastic post Denny!

[small]----signature below----[/small]
I can't compete with you physically, and you're no match for my brains.
You're that smart?
Let me put it this way. Have you ever heard of Plato, Aristotle, Socrates?
Yes.
Morons!

ptheriault · Oct 8, 2007

I second that Alex!

- Paul
- If at first you don't succeed, find out if the loser gets anything.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

My old laptop is faster then my PROD :) 4

evilDBA

Programmer

mrdenny

Programmer

evilDBA

Programmer

mrdenny

Programmer

jsteph

Technical User

mrdenny

Programmer

evilDBA

Programmer

mrdenny

Programmer

evilDBA

Programmer

mrdenny

Programmer

AlexCuse

Programmer

ptheriault

IS-IT--Management

Similar threads

Part and Inventory Search

Sponsor