Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Slow Query

Status
Not open for further replies.

spyrosc

Programmer
Feb 17, 2003
4
CY
The situation is the following.

The server configuration is:
4x 2.0GHz/2M iL3,
2 ultra scsi 3 36GB Raid 1 (Drive C, D)
3 ultra scsi 3 72GB Raid 5 (Drive E)

1.5 GB Ram.

Windows 2000 Server, SQL 2000 Enterprise

The database size is 10GB and is located on Drive E.
The logfile is 5GB and is located on drive C.
The IndexFile is 5GB and is located on drive D.

Tempdb is 2GB and tempdblog =1GB

We have a table(fvalues) with 15 million records.
the structure of the table is:
Field1 Integer,Field2 Integer,Field3 Integer,Field4 Integer,Field5 Float

The 15 million records are divided into 3 periods (5 million records each). Field1 holds the period number.
Field2 is the primary key. (numbers from 1 to 15 million)

We have indexes on Field1(period) , Field2(primary key). Field1 is a clustered index. Cluster index is located on drive E.

Then we run a select statement in the query analyser:

select * from Fvalues where period=2

(It must retrieve 5 million records)

It takes around 1:20-1:50 to execute.


Do you have any sugestions why is so slow? Any sugestions to improve performance?

Thanks,

Spyros Christodoulou
 
Is PERIOD indexed?

I'm actually awaiting one of the smarter people to respond to see if I'm in the right area of guessing. 5 million records on an without an index, I would think slower than mud.

 
toetag,

checking for indexes is the first thing that should be done for slow queries (IMO), however in this case Spyros posted:

We have indexes on Field1(period)

I am interested in seeing how long the query takes after he implements SQLSister's suggestion. Which on large databases having lots of transactions may solve the problem.

For Spyros, I noticed your transaction log is 1/2 the size of your database. Are you doing backups of the transaction log? You should be, or else truncate/shrink it more often.

Also, your tempdb is getting kinda large. One way to shrink it (and the easiest IMO) is to stop the services and restart them.

-SQLBill
 
Returning 5 million rows to Query Analyzer will be slow whether you have indexes or not. Network speed, client speed as well as the SQL Server speed will all affect performance.

Are you returning to a GRID or a TEXT window in QA? Returning the rows to a grid will be many times faster than returning the rows to the text window. If returning the result to a text window, the client will ber the main cause of the slowdown because all the rows must be loaded into memory, physical or virtual. If you want to get the best answer for your question read faq183-874 and thread183-468158.
Terry L. Broadbent - DBA
SQL Server Page:
 
You can also speed things up by making sure your queries don't ask for anything more than required. Select only the columns you actually need for the task, not Select *. Do you really need all 5 million records or would only the first 1000 or so suffice? Who is going to look through 5 million records anyway? Maybe you can further break it down into more useable chunks of data by putting another qualifying field in the where clause. If you're writing this query to get a dataset you plan to do something like an update to, you might want to think about batch processing to avoid locking the records for too long. Terry has a good method around here somewhere for batch processing.
 
There are quite a few more knowledgabe individuals on this forum than me (specifically tlbroadbent and SQLSister), but I remember something written in some Access books about indexes and selectivity. Specifically, indexes on fields with a very limited range of values (i.e. true/false, 1/2/3 - as in your case) to not generate much efficiency. This may not apply to SQL Server.

Can anyone elaborate on this in more detail and/or correct my assertion?

Michael
 
Michael,

I agree. An index on the Period column will probably not be used because it is not very selective. It will probably be more efficient for SQL Server to scan the table than to scan an index on the Period column, bookmark the rows, and then get the data. This is why I recommended looking to another cause of slow performance.

Spyros,

I wonder you would want to return 5 million rows in a single query. No one will be able to review the result set. If it is being selected to create a report or for export then I recoomend using DTS. If you want to get the best answer for your question read faq183-874 and thread183-468158.
Terry L. Broadbent - DBA
SQL Server Page:
 
Thank you all for your answers.

toetag PERIOD is indexed. (It is a clustered index)

SQLSister the statistics are updated.


tlbroadbent, The select statement is executed on the SQL server-Query Analyzer to a grid.(It uses the clustered index) The reason we are using the period(field1) field is the reduce the time needed to execute the query.

If we use :
select * from fvalues where field2<=5000000
in order to select 5 million records it takes 8-9 minutes.Using field1(period) ..where period =1 we managed to reduce the time to 1:20 - 1:50 minutes

SQLBill the database is for testing and finding ways to optimise a database system that we are going to develop.
The actual amount of data we are going to have on the table is 2.5 million records per week times 104 weeks (two years of data -total 260 million records).
Most processing of data will occur for one period (2.5 million records), but there are cases that will need to get more than one periods, perhaps up to 5.(12.5 million records-very long process)
The test so far gave us results for selecting 5 million records out of 15 million. 1:20 minutes - 1:50 minutes. (Also, we are going to try the same selection out of 260 million records to see the difference)

The reason the transaction log and the tempdb are large is to give the query that will select 5 million records out of 15 million enough space for processing. Actually tempdb seems not to be used in the above query.


What are we looking for are some ways to improve the speed of the query.

Thanks again,

Spyros Christodoulou
 
With the index on Period being a clustered index, and given the fact that SQL Server is using that index, I think that SQL Server is optimized and most of the elapsed time is required to return 5 million rows to the client. Have you turned on IO statistics for the query to compare SQL Server CPU time to Elapsed time?

You will probably see the same performance when the table contains 260 million rows and you select rows for one period. SQL will perform about the same way and so will the network and client. The only way to improve performance is to reduce the amount of data transferred between the Server and client. Reducing the number of row and/or columns returned by the query can do this.
If you want to get the best answer for your question read faq183-874 and thread183-468158.
Terry L. Broadbent - DBA
SQL Server Page:
 
Terry,
I found the reason the query is so slow. (Query analyser grid is very very slow)

I tried a database having 52 periods with 5 million records each. (260 million records)

A select * INTO #xxx from _tablename where period=X gave us 0:29 seconds for getting 5 million records, and around 2:40 minutes for getting 25 million records which is excellent.

Since the final system will have queries with 2.5 million records for every period an estimated time of 15 seconds is very fast.

Also i shrinked the transaction log to 500MB and restricted its size to 500MB.



Is there a way to break a large table on various filegroups to speed up the query?

Regards,

Spyros
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top