Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

MySQL 5.0: multiple tables "joined" in where clause bad? 1

Status
Not open for further replies.

OsakaWebbie

Programmer
Feb 11, 2003
628
JP
I'm moving a custom DB application from a hoster running MySQL 3.23 to a new hoster running MySQL 5.0. I am fully expecting syntax that needs fixing, but the first error I ran into I don't understand. Here is the SQL:
Code:
SELECT person.PersonID, FullName, Furigana, Email, CellPhone, Phone, TempAddress, postalcode.*, Address, person.Photo FROM person, percat
LEFT JOIN household ON person.HouseholdID=household.HouseholdID
LEFT JOIN postalcode ON household.PostalCode=postalcode.PostalCode
WHERE person.PersonID=percat.PersonID AND CategoryID = 56
ORDER BY Furigana
I get: "SQL Error 1054: Unknown column 'person.HouseholdID' in 'on clause'".

My person table does have a HouseholdID column, as it always did. The only difference between this and similar, working queries is the inclusion of the percat table and a WHERE statement to functionally "join" it to the person table, and the WHERE statement for the column CategoryID (in table percat - I'm asking for a list of people in a given category; percat makes a many-many relationship to a category table). The reason I did the query as multiple tables and a WHERE rather than an INNER JOIN has to do with the flow of the complex code that builds the SQL statement (possible criteria vary dramatically based on user input), but I suppose I could find another way to build the statement that makes the person-percat relationship an INNER JOIN. But this worked in 3.23 - can someone tell me why it doesn't in 5.0?
 
You have to join the Percat table as an ANSI join

SELECT person.PersonID, FullName, Furigana, Email, CellPhone, Phone, TempAddress, postalcode.*, Address, person.Photo FROM person
inner join percat on
(person.PersonID=percat.PersonID AND CategoryID = 56)
LEFT JOIN household ON person.HouseholdID=household.HouseholdID
LEFT JOIN postalcode ON household.PostalCode=postalcode.PostalCode
ORDER BY Furigana

Ian
 
Yup, that's what I thought - I was just curious why the other kind (which is apparently called a "theta join") is no longer allowed, and why MySQL would complain about it by saying that a totally unrelated column doesn't exist.

(I had to look up your terminology, though - I often know how to code but don't know how to "talk about it". (-: The word "ANSI" makes me cautious, as it is often used to mean single byte character handling, which is unacceptable here in Japan.)

It hadn't occurred to me to put the CategoryID=<number> in the join itself. Is that better for processing speed or something? The way I think through query statements, that would seem like a WHERE criteria.
 
I guess its more efficient, I am not a DBA so not sure.

Thats just the way I have always joined tables using the ANSI syntax.

Joining tables in the Where clause is the classic Oracle syntax. Though oracle can now use ANSI join syntax too.

Ian
 
When I looked up the meaning of "ANSI join", it seemed to only refer to the column-to-column part that links the two tables together (what you call "classic Oracle" someone else called a "theta join", for some reason). But what I meant by my last question was not the join, but the conditional for what records to return. For example:
Code:
select a.* from a inner join b on a.col1=b.col1 where b.col2='value'
versus
Code:
select a.* from a inner join b on (a.col1=b.col1 and b.col2='value')
I'm not a DBA, either - anyone have any ideas on which of these is better? If it matters, assume col1 is not indexed in table a, but both col1 and col2 are indexed in table b.
 
I am guessing but for an inner join there is probably no difference.

However, for a left outer it allows you to place a condition of the data from the outer table, without compromising the Left outer join.

Ian
 
However, for a left outer it allows you to place a condition of the data from the outer table, without compromising the Left outer join.
Interesting... that's actually a different capability altogether, if I'm imagining the ramifications correctly. The idea of a left join is that the left table's data is always returned, even if the matching row in the right table isn't there. If I add a condition, does that mean that the left table's data is still returned but less of the fields from the right table are filled? Or does it simply reduce the number of returned rows overall, like the where clause?
 
For what it's worth, your original problem seems to be order of evaluation. In your FROM clause, try doing this instead:
Code:
... FROM [COLOR=red]([/color]person, percat[COLOR=red])[/color]
LEFT JOIN household ON person.HouseholdID=household.HouseholdID
It looks to me like MySQL is giving precedence to the LEFT JOIN rather than the comma join. So instead of doing a left join on the join of person and percat, it's left-joining percat and household, which fails in the ON clause because person isn't one of the join tables.

Incidentally, ANSI just stands for "American National Standards Institute". They publish standards related to everything from medical devices to programming languages. The term "ANSI join" simply means a join that conforms to the syntax specified in the ANSI SQL standard, as opposed to one that uses a vendor-specific syntax. The usage here is actually a little misleading, because your original JOIN syntax is perfectly valid ANSI SQL-92. Your semantics are just off - on my (admittedly cursory) reading of the standard, it looks like the comma join is supposed to work the way you're seeing now.

OsakaWebbie said:
If I add a condition, does that mean that the left table's data is still returned but less of the fields from the right table are filled?
That's basically correct. In the examples you posted, the first one, with the WHERE clause, will only return rows with b.col2='value', so it will return fewer total rows. The second one will filter out the rows of table b that don't have col2='value', so you'll get more total rows, but some of them will have NULLs in the columns for table b.
 
... FROM (person, percat)...
Ah, that fits the error message - MySQL hasn't gone completely loopy! And it's great news regarding how little change I can get away with to make this application work in 5.0 - yesterday I figured out how to change the comma join to an inner join in the original instance that was giving me trouble, but this is a big complex app, so I'm sure there will be others that will come up to bite me as I continue to test... (insert theme music from Jaws here)

In the examples you posted, the first one, with the WHERE clause, will only return rows with b.col2='value', so it will return fewer total rows. The second one will filter out the rows of table b that don't have col2='value', so you'll get more total rows, but some of them will have NULLs in the columns for table b.
I'm a little confused. My pair of little examples used an inner join, not a left join. I would have thought for an inner join the results would be the same (limit the number of rows returned) but there might be a performance difference. If the results are different, I would definitely not use that second example (I haven't so far, but if someone had said it was more efficient, I might have). Or were you simply meaning that if those examples had been left joins (which was essentially what Ian and I were talking about in our most recent posts, but I didn't rewrite the examples) it would behave that way?
 
Sorry, yes, I was talking about using those join conditions in a LEFT join. With an inner join, the result set would be the same.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top