Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

The performance of which subquery is more efficient?

Status
Not open for further replies.

adam0101

Programmer
Jun 25, 2002
1,952
US
Assume a task can have many actions. Now consider these three queries:
Code:
---------Query 1-----------
SELECT
  tblTasks.TaskID,
  ( SELECT Min(ActionDate)
    FROM   tblActions 
    WHERE  tblActions.TaskID=tblTasks.TaskID
  ) AS FirstActionDate,
  ( SELECT Max(ActionDate)
    FROM   tblActions 
    WHERE  tblActions.TaskID=tblTasks.TaskID
  ) AS LastActionDate
FROM
  tblTasks

--------Query 2-----------
SELECT
  tblTasks.TaskID,
  FirstAction.dt AS FirstActionDate,
  LastAction.dt AS LastActionDate
FROM
  tblTasks
LEFT OUTER JOIN
  ( SELECT   Min(ActionDate) As dt,
             tblActions.TaskID
    FROM     tblActions
    GROUP BY tblActions.TaskID
  ) AS       FirstAction
ON tblTasks.TaskID=FirstAction.TaskID
LEFT OUTER JOIN
  ( SELECT   Max(ActionDate) As dt,
             tblActions.TaskID
    FROM     tblActions 
    GROUP BY tblActions.TaskID
  ) AS       LastAction
ON tblTasks.TaskID=LastAction.TaskID

-----------Query 3-------------
SELECT
  tblTasks.TaskID,
  Min(FirstAction.ActionDate) AS FirstActionDate,
  Max(LastAction.ActionDate) AS LastActionDate
FROM
  tblTasks
LEFT OUTER JOIN
  tblActions FirstAction
ON tblTasks.TaskID=FirstAction.TaskID
LEFT OUTER JOIN
  tblActions LastAction
ON tblTasks.TaskID=LastAction.TaskID
GROUP BY tblTasks.TaskID

Now if I wrote these correctly, all three queries should return the same result set. So my question is:
1. Which query is most efficient (fastest) and
2. Under what condition would it be better to use the other two queries? Adam
 
I would guess that query 2 would be fastest but there are too many unknowns to make an accurate guess. What database are you using? What are the indexes? How many records exist in the table?

I would use the following query instead of any of your proposed queries.

SELECT
tblTasks.TaskID,
Min(tblActions.ActionDate) AS FirstActionDate,
Max(tblActions.ActionDate) AS LastActionDate
FROM tblTasks
LEFT OUTER JOIN tblActions
ON tblTasks.TaskID=tblActions.TaskID Terry L. Broadbent - DBA
Computing Links:
faq183-874 contains "Suggestions for Getting Quick and Appropriate Answers" to your questions.
 
Adam, this is not really directed at you although I think you should read it. It is directed at this type of question.


Just to put my two cents in since I am feeling cranky today.

This kind of question comes up all the time and the real answer is.

TEST THEM.

The variations of what is going on are just too numerous to even list although Terry made a good start.

There is really no substitute for benchmarking and using what ever sql optimization tools are available on your particular database.

Second, the other questions you should ask yourself before even worrying about it is:

Is this a slow query? Your definition of slow may vary.

Does it get run often or once a year or sometime in between?

As in most things in life optimizing usually follows the 90/10 rule 10% of the code, queries etc. are taking 90% of the time.

One of the most common Jr. programmer mistakes is spending large amounts of time optimizing some piece of code that is either rarely if ever run or already is fast enough.

(Again your definition of fast enough will vary.)

While some people may say 'but all code should be optimized' in real life you don't have time.

Also some optimized code is what I call 'clever' code, impossible to figute out at first glance and even harder to maintain, when this is done to save a second or so, it is almost never worth time time it took or the time it will cost you in the future.

So to sum up. Figure out what code, queries etc. are actually causing problems. Test them. Use the tools available. AND When you have done those things feel free to postany questions you may have at that point.
 
If I may add to this, expect to have to tune them in your production environment. The size of the table will affect the best choices. Table contention will affect the right choices.


(you did this part very well but it bears repeating) First write correct code, then worry about fast code if it is slow.

If you are writing SQL for a product that runs at many sites, expect each site to find a different part of your code slow. I tried to remain child-like, all I acheived was childish.
 
Thanks for your comments. I was hoping there were some rules you follow to choose which technique to use. I would rather build it right the first time than blindly build a query that needs to be reworked because it's too slow. I work on a huge database on SQL Server. Each table has probably 100,000+ records. Every tweak can make a big difference. Are there techniques that should be avoided? For example, I heard that subqueries placed in the SELECT clause are bad because they get executed for each row. Does this mean this technique should never be used or only for a certain situation? I guess my real question is: What are the pros and cons of different types of subqueries? Adam
 
search google for "sargable"

not sure if that's microsoft sql server terminology or not, but it should lead you in the right direction

i've heard people say is worth a visit (i haven't needed it myself, since i don't work on sql server databases)

also, learn how to run EXPLAIN PLAN (is that the right microsoft term?) and take action based on what it tells you about your queries

rudy
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top