moonshadow
Programmer
We have a 3rd party application which runs on a SQL Server box with a RAID drive. We attempted to move the database in question to a new SQL Server with all it's disk (apart from pagefile) on our NetApps SAN. One query in particular went from taking 35minutes to 2 hours 30 minutes. I have managed to create a SQL script which mimics this 3rd party query, and is entirely repeatable. I have done extensive testing and found that the query runs in about the same time on the SAN SQL Server as to the other server, if I move the database to it's local D: drive. When the DB is on the SAN, the query produces an explosion of activity for 10 minutes, and then everything on server & SAN goes quiet for the remainder of the duration (50 minutes for my query). Enterprise manager is reporting the query as waiting with a wait type of PAGEIOLATCH_SH. Using performance monitor I notice that the average latch wait time for this 2nd phase is 1000ms, with about 100 latches/sec. As I understand it a PAGEIOLATCH_SH occurs when a page of data isn't in the buffer pool (often due to insufficient RAM), and it must be requested from disk. These delays don't happen when run on local disk.
We've had a SAN Consultant in, and although he has corrected one configuration problem, and increased max throughput, the duration of the query stays the same.
I've watched the MS webcast on performance ( which is very helpful, and indicates we have an I/O subsystem issue. I've run SQLIO, and got some amazing throughput from the SAN (700M/s).
We're running on SQL 2000 Enterprise Edition SP4 on Windows 2003 Standard SP1 with 2GB RAM (clearly not enough, but if we add more, we'll just have the same problem when the database gets bigger).
We can't persuade the supplier to change their query, so I need to get to the bottom of this. I've run a short test on a similar SQL 2005 SAN server, and it ran in 10 minutes. I hope to install SQL 2005 on identical hardware as our SAN SQL Server to do a proper comparison.
Anyone any ideas where else I can look, we're completely stuck.
We've had a SAN Consultant in, and although he has corrected one configuration problem, and increased max throughput, the duration of the query stays the same.
I've watched the MS webcast on performance ( which is very helpful, and indicates we have an I/O subsystem issue. I've run SQLIO, and got some amazing throughput from the SAN (700M/s).
We're running on SQL 2000 Enterprise Edition SP4 on Windows 2003 Standard SP1 with 2GB RAM (clearly not enough, but if we add more, we'll just have the same problem when the database gets bigger).
We can't persuade the supplier to change their query, so I need to get to the bottom of this. I've run a short test on a similar SQL 2005 SAN server, and it ran in 10 minutes. I hope to install SQL 2005 on identical hardware as our SAN SQL Server to do a proper comparison.
Anyone any ideas where else I can look, we're completely stuck.