Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Should I use JOIN or WHERE to join tables?

T-SQL Hints and Tips

Should I use JOIN or WHERE to join tables?

by  ESquared  Posted    (Edited  )
Consider the two methods...

WHERE
[tt]SELECT Patients.Name, Doctors.Specialty
FROM Patients, Doctors
WHERE Patients.DoctorID = Doctors.DoctorID[/tt]

JOIN
[tt]SELECT Patients.Name, Doctors.Specialty
FROM Patients
INNER JOIN Doctors ON Patients.DoctorID = Doctors.DoctorID[/tt]

They look nearly identical. So which one should you use?

Well to start with, the two methods may be functionally identical in many cases because the database engine's query optimizer is smart enough to see that the tables must be joined using said criteria with either method.

I did some testing on a table in one of my databases that has 1.6 million records and one that has 620k records, and the query optimizer chose the same execution plan for both methods (JOIN and WHERE), even with joining three copies of the same table.

So I searched for help online to find out if there is one I should use. What I found is that JOIN is the preferred method, for several reasons:

- It is the 'correct' way to join tables, at least according to ANSI standards. Newer databases in general support JOIN, older databases in general do not.

- In some databases you can't do LEFT or RIGHT joins with WHERE clauses, so you'd have to rewrite your query. In databases that perform left-join operations using special WHERE syntax, queries sometimes have to be significantly rewritten to function with JOIN syntax. See [link http://www.databasejournal.com/features/mssql/article.php/1438001]ANSI Joins by Neil Boyle[/link].

Using the example queries from above, what if you wanted to show patients in your list, even if they did not have a doctor assigned to them (DoctorID = Null)? You must use a LEFT JOIN:

[tt]SELECT Patients.Name, Doctors.Specialty FROM Patients LEFT JOIN Doctors ON Patients.DoctorID = Doctors.DoctorID[/tt]

Patients with no assigned doctor will show Null in the Specialty field, while those with an assignment will show the Specialty as usual. The example query above only shows patients who have a doctor assigned.

- Using JOIN separates the conditions for the query selection from the conditions for the table join. This simplifies the WHERE clause and allows you to more quickly scan it for the conditions in a query.

- Queries to find unmatched rows or 'orphans' using NOT IN or NOT EXISTS with a WHERE clause are less efficient than using JOIN. To find all doctors who are not assigned to a patient:

[tt]SELECT Doctors.Name FROM Doctors LEFT JOIN Patients ON Doctors.DoctorID = Patients.DoctorID WHERE Patients.DoctorID IS NULL[/tt]

In databases that support JOIN, I don't think there is a way to do such an unmatched query using a WHERE clause.

- Placing a condition in the JOIN clause instead of the WHERE clause can be of strategic importance. Read this link if you read no others in this FAQ! [link http://www.sqlteam.com/item.asp?ItemID=11122]Additional Criteria in the JOIN Clause by AjarnMark[/link]

I learned something from this page that will help me in many queries.

- My personal opinion is that JOIN may be superior because at some point, with more and more complicated queries, the optimizer could end up with some inefficiences..

-E¦


-----------------
And some more resources:

[link http://www.thedbcommunity.com/news/client_server/589.htm]SQL Syntax at the Paradox Community[/link]

[link http://www.sql-server-performance.com/tuning_joins.asp]Performance Tuning SQL Server Joins at SQL-Server-Performance.Com[/link] (excerpt as follows:)
When you create joins using Transact-SQL, you can choose between two different types of syntax: either ANSI or Microsoft. ANSI refers to the ANSI standard for writing joins, and Microsoft refers to the old Microsoft style of writing joins. For example:

ANSI JOIN Syntax

SELECT fname, lname, department
FROM names INNER JOIN departments ON names.employeeid = departments.employeeid

Former Microsoft JOIN Syntax

SELECT fname, lname, department
FROM names, departments
WHERE names.employeeid = departments.employeeid

If written correctly, either format will produce identical results. But that is a big if. The older Microsoft join syntax lends itself to mistakes because the syntax is a little less obvious. On the other hand, the ANSI syntax is very explicit and there is no chance you can make a mistake.

For example, I ran across a slow-performing query from an ERP program. After reviewing the code, which used the Microsoft JOIN syntax, I noticed that instead of creating a LEFT JOIN, the developer had accidentally created a CROSS JOIN instead. In this particular example, less than 10,000 rows should have resulted from the LEFT JOIN, but because a CROSS JOIN was used, over 11 million rows were returned instead. Then the developer used a SELECT DISTINCT to get rid of all the unnecessary rows created by the CROSS JOIN. As you can guess, this made for a very lengthy query. Unfortunately, all I could do was notify the vendor's support department about it.

The moral of this story is that you probably should be using the ANSI syntax, not the old Microsoft syntax. Besides reducing the odds of making silly mistakes, this code is more portable between database, and eventually, I imagine Microsoft will eventually stop supporting the old format, making the ANSI syntax the only option.
Register to rate this FAQ  : BAD 1 2 3 4 5 6 7 8 9 10 GOOD
Please Note: 1 is Bad, 10 is Good :-)

Part and Inventory Search

Back
Top