
How to optimize query


inspi (Technical User)
I have this query and I need a way to optimize it.


SELECT DISTINCT /*First SQL*/
SK.key AS SECURE,
sl.col1 AS Name,
sl.col2 AS Value,
SK.log AS KEYLOG,
SK.time,
(
SELECT max(L.log)
FROM dbo.LogTable AS L
WHERE L.key = SK.key AND L.time < SK.time
) AS PREVIOUS
INTO #temp_prev_both
FROM dbo.LogkeyTbl AS SK, dbo.LogTable AS sl
WHERE
SK.log = sl.log AND
SK.time >= '11-14-2007 12:00:00 AM' AND
SK.time <= '11-14-2007 11:59:59 PM' AND
SK.time =
(
SELECT max(S.time)
FROM dbo.LogkeyTbl AS S, dbo.LogTable AS sl
WHERE
S.log = sl.log AND
S.Key = SK.Key AND
S.time >= '11-14-2007 12:00:00 AM' AND
S.time <= '11-14-2007 11:59:59 PM'
GROUP BY S.Key
)
ORDER BY 1

SELECT * /*Second SQL*/
FROM #temp_prev_both t left outer join LogTable sl ON t.Previous = sl.log and t.Name = sl.Name

The first SQL command dumps all the records into a temp table. There are approximately 400K records.

The second SQL takes 7 minutes to run.

Is there any way to reduce the time of running?
 
There are many things you can do to optimize this.

First, can you explain the last subquery?

[tt]
SK.time =
(
SELECT max(S.time)
FROM dbo.LogkeyTbl AS S, dbo.LogTable AS sl
WHERE
S.log = sl.log AND
S.Key = SK.Key AND
S.time >= '11-14-2007 12:00:00 AM' AND
S.time <= '11-14-2007 11:59:59 PM'
GROUP BY S.Key
) [/tt]

This subquery appears to be using 2 tables, but I think you only need one of them (LogkeyTbl).

Do you get the correct results if you use this instead?

[tt]
SK.time =
(
SELECT max(S.time)
FROM dbo.LogkeyTbl AS S
WHERE
S.Key = SK.Key AND
S.time >= '11-14-2007 12:00:00 AM' AND
S.time <= '11-14-2007 11:59:59 PM'
GROUP BY S.Key
) [/tt]



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Yes, I'm getting the same set of results.
 
That's good. It probably ran ever so slightly faster (but probably not much). Now, for the next step, let's convert the subquery to a derived table.

Code:
SELECT DISTINCT  /*First SQL*/
         SK.[key] AS SECURE,  
         sl.col1 AS Name,  
         sl.col2 AS Value,  
         SK.log AS KEYLOG,  
         SK.time,  
         (
               SELECT max(L.log)
               FROM dbo.LogTable AS L
               WHERE L.[key] = SK.[key] AND L.time < SK.time
         ) AS PREVIOUS        
      INTO #temp_prev_both
      FROM dbo.LogkeyTbl AS SK
           Inner Join dbo.LogTable  AS sl
             On  sk.log = sl.log
             And SK.time >= '11-14-2007 12:00:00 AM'
             And SK.time <= '11-14-2007 11:59:59 PM'    
           Inner join (
                SELECT [Key], max(S.time) As MaxTime
                FROM dbo.LogkeyTbl AS S
                WHERE  
                   S.time >= '11-14-2007 12:00:00 AM' AND  
                   S.time <= '11-14-2007 11:59:59 PM'
                GROUP BY S.[Key]
             ) As AliasName
             On SK.[key] = AliasName.[Key]
             And SK.time = AliasName.MaxTime
      ORDER BY 1

I have a couple more thoughts, but it's best to take this in steps, ya know?

Does this produce the same results? Is it faster? How long does it take to run?


-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Never use SELECT *, especially when you have a join, because the join field(s) will be repeated. Always specify the minimum number of columns you need.

Consider your indexing. If your temp table returns a lot of results, you might want to index it on the join field. And make sure the table you are joining to has good indexes.

Try using a table variable. It may improve performance (it also may not, so you just have to try it).
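For example, a table variable version might look like this (just a sketch; the column types here are guesses, so match them to your real schema):

Code:
-- Table variables can't be the target of SELECT ... INTO,
-- so declare the structure first and use INSERT ... SELECT.
DECLARE @temp_prev_both TABLE (
    SECURE   int,          -- assumed type
    Name     varchar(50),  -- assumed type
    Value    varchar(50),  -- assumed type
    KEYLOG   int,          -- assumed type
    [time]   datetime,
    PREVIOUS int           -- assumed type
)

INSERT INTO @temp_prev_both (SECURE, Name, Value, KEYLOG, [time], PREVIOUS)
SELECT DISTINCT ... /* same SELECT list as the first SQL, without the INTO clause */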

In the first query you could use joins instead of subqueries. They are usually faster.

And please stop using that old join syntax (not for performance reasons, but because it is difficult to see which are the join conditions and which are the where conditions. It is much easier to optimize when what we have to look at is clear).
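For instance, the second SQL could drop the SELECT * and pick up an index on the temp table, something like this (a sketch; the selected columns are guesses based on the first query, so keep only the ones you actually need):

Code:
-- Index the temp table on the join fields before the second query runs
CREATE NONCLUSTERED INDEX IX_temp_prev_both
    ON #temp_prev_both (Previous, Name)

-- Explicit column list instead of SELECT *
SELECT t.SECURE,
       t.Name,
       t.Value,
       t.KEYLOG,
       t.[time],
       sl.col2 AS PreviousValue  -- assumed: whichever LogTable columns you need
FROM   #temp_prev_both AS t
       LEFT OUTER JOIN dbo.LogTable AS sl
         ON  t.Previous = sl.log
         AND t.Name = sl.Name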

"NOTHING is more important in a database than integrity." ESquared
 
There is no significant change in the time it takes to run. It was the same as before... around 20 secs.

But the result set is the same.
 
I completely missed that part. The first SQL runs in 20 seconds, but the second SQL takes 7 minutes?

I don't think you need to use a temp table here, so it's possible that a derived table will be faster. Try this...

Code:
Select *
From   (
       SELECT DISTINCT  /*First SQL*/
         SK.[key] AS SECURE,  
         sl.col1 AS Name,  
         sl.col2 AS Value,  
         SK.log AS KEYLOG,  
         SK.time,  
         (
               SELECT max(L.log)
               FROM dbo.LogTable AS L
               WHERE L.[key] = SK.[key] AND L.time < SK.time
         ) AS PREVIOUS        
       FROM dbo.LogkeyTbl AS SK
           Inner Join dbo.LogTable  AS sl
             On  sk.log = sl.log
             And SK.time >= '11-14-2007 12:00:00 AM'
             And SK.time <= '11-14-2007 11:59:59 PM'    
           Inner join (
                SELECT [Key], max(S.time) As MaxTime
                FROM dbo.LogkeyTbl AS S
                WHERE  
                   S.time >= '11-14-2007 12:00:00 AM' AND  
                   S.time <= '11-14-2007 11:59:59 PM'
                GROUP BY S.[Key]
             ) As AliasName
             On SK.[key] = AliasName.[Key]
             And SK.time = AliasName.MaxTime
      )  As t
      left outer join LogTable sl 
         ON t.Previous = sl.log and t.Name = sl.Name
ORDER BY Secure



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
You have a problem with this line:
<= '11-14-2007 11:59:59 PM'

Run this:

select convert(datetime,'11-14-2007 11:59:59 PM' )

The output is this:

2007-11-14 23:59:59.000

If this is a datetime column and not a smalldatetime, then you will miss rows that were inserted between

2007-11-14 23:59:59.000

and 2007-11-15 00:00:00.000

(a datetime is accurate to about 3 milliseconds, so values such as 2007-11-14 23:59:59.997 fall into that gap).
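The usual fix is a half-open range: greater than or equal to the start of the day, and strictly less than the start of the next day. Applied to the predicates in the first SQL:

Code:
-- covers the whole of 11-14-2007 regardless of the column's precision
SK.time >= '11-14-2007' AND
SK.time <  '11-15-2007'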


Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
Do what Denis suggests. It won't improve the speed, but it will improve the accuracy.

I assume you are using Query Analyzer to test the query. What I want you to do is press CTRL-K (CTRL-M for SQL Server Management Studio). It will appear as though nothing has happened. Now, run the query again. After the query is done running, you will see a new tab labeled 'Actual Execution Plan'. Click on this tab and look for table scans. Do you see any?
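If you would rather capture this from a script, SET STATISTICS PROFILE works in both tools; it executes the statement and returns the actual plan as an extra result set:

Code:
SET STATISTICS PROFILE ON
GO
-- run the query here; each statement returns an additional
-- result set describing the operators that actually executed
GO
SET STATISTICS PROFILE OFF
GO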

If you have table scans, this implies that indexes are either not present, or not being used. So, to further improve the execution time, you can add indexes. Let us know what you find.
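As a starting point, candidate indexes based on the joins and the WHERE clause might look like this (the column choices are assumptions; check them against your execution plan before keeping them):

Code:
-- supports the date-range filter and the key/time lookups
CREATE NONCLUSTERED INDEX IX_LogkeyTbl_key_time
    ON dbo.LogkeyTbl ([key], [time])

-- supports the join on log
CREATE NONCLUSTERED INDEX IX_LogTable_log
    ON dbo.LogTable (log)

-- supports the correlated MAX(log) subquery on key/time
CREATE NONCLUSTERED INDEX IX_LogTable_key_time
    ON dbo.LogTable ([key], [time], log)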



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
And don't forget to get rid of that SELECT *. Never ever select more columns than you need, and never use SELECT * when you have a join, because by definition you are selecting more columns than you need.

"NOTHING is more important in a database than integrity." ESquared
 
I found 2 table scans... actually, neither of the tables that I used has indexes on them.
 
Well... there's your problem. You should put indexes on these tables. Actually, since you are getting table scans, this implies that you don't even have primary keys set up on these tables.

Do you know what a primary key is?



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Yes, I know what a primary key is. The table itself is so complicated that there was no way to add a primary key to it.
 
Well, adding an index will certainly speed up this query. Index tuning is tricky (for me at least).

My best advice is to try different indexes to see which gives you the best performance. I recommend you start with the 'Index Tuning Wizard'. Make sure you have this query loaded in a window, and then click 'Query' -> 'Index Tuning Wizard'.

Run through the wizard and try implementing the suggestions that it makes. You need to be a little careful with indexes. Too many indexes on a table will slow down other queries (insert, updates, and deletes). Of course, not having any indexes will slow down select queries. Index tuning is a balancing act. You don't want too many, but you need some.

Hope this helps.



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Indexes are critical to performance. At a minimum you will need indexes on the fields you use to join to other tables or that you often use in where clauses. Indexes are a balancing act between performance on insert/update/delete and performance in selects. They slow down the first and speed up the second. At this point you probably cannot make much more headway in speeding up until you add appropriate indexes.

Unique indexes (including primary keys) are usually critical to data integrity. I do not believe there is such a thing as a table too complicated for a primary key (you can always have an autogenerated key, such as an identity field or a GUID). I believe that the more complex the data, the more critical it is to have something that you can use to uniquely identify a record. Any time I have not had one, I later came to regret it; therefore I never create a table without a primary key.
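For example (a sketch; the column and constraint names are just placeholders):

Code:
-- add an autogenerated surrogate key to a table with no natural key
ALTER TABLE dbo.LogkeyTbl
    ADD LogkeyID int IDENTITY(1,1) NOT NULL

ALTER TABLE dbo.LogkeyTbl
    ADD CONSTRAINT PK_LogkeyTbl PRIMARY KEY (LogkeyID)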

"NOTHING is more important in a database than integrity." ESquared
 
Oh yeah... one more thing.

After you create the indexes, please let us know the query time. I'm curious to see how much difference indexes will make.



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 