Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

JOINs multiplying upon each other in aggregates 1

Status
Not open for further replies.

OsakaWebbie

Programmer
Feb 11, 2003
628
JP
I have a main table that stores info on people and/or organizations (name, address, that sort of thing). Then I have some other tables for things like categories, events, etc. and then many-to-many relationship tables that link them. I also have a many-to-many table to link the main table to itself to show a relationship between a person and an organization he/she belongs to. Pretty standard stuff.

Users asked me to write a query to give a summary of information for a given "team" (a type of organization), including such logical things as the categories the team is in, the leader's name, the number of members, and the number of male and female members. I thought I knew how to do that - a big messy query with lots of joins, but doable. But I can't get the JOINs and GROUP BY to play together right, and data gets repeated. If I do just one of the things, it's okay, e.g.:
Code:
SELECT p.FullName TeamName, GROUP_CONCAT(cat.Category ORDER BY cat.Category SEPARATOR '\n') Categories
FROM person p
LEFT JOIN percat pc ON p.PersonID=pc.PersonID LEFT JOIN category cat ON pc.CategoryID=cat.CategoryID
WHERE p.PersonID=4787 GROUP by p.PersonID
No problem - in my test case (a team in two categories and containing 9 members), I get the name and the two categories.
And this query:
Code:
SELECT p.FullName TeamName, COUNT(poc.PersonID) Members
FROM person p
LEFT JOIN perorg poc ON p.PersonID=poc.OrgID LEFT JOIN person mc ON poc.PersonID=mc.PersonID
WHERE p.PersonID=4787 GROUP by p.PersonID
dutifully produces the result of 9.
But if I combine those two:
Code:
SELECT p.FullName TeamName, GROUP_CONCAT(cat.Category ORDER BY cat.Category SEPARATOR '\n') Categories,
COUNT(poc.PersonID) Members
FROM person p
LEFT JOIN percat pc ON p.PersonID=pc.PersonID LEFT JOIN category cat ON pc.CategoryID=cat.CategoryID
LEFT JOIN perorg poc ON p.PersonID=poc.OrgID LEFT JOIN person mc ON poc.PersonID=mc.PersonID
WHERE p.PersonID=4787 GROUP by p.PersonID
I get 9 copies of each category name and 18 members.
When I put everything I want together...
Code:
SELECT p.FullName TeamName, GROUP_CONCAT(cat.Category ORDER BY cat.Category SEPARATOR '\n') Categories,
CONCAT('Available: ',MIN(attendance.AttendDate),' - ',MAX(attendance.AttendDate)) Dates,
ml.FullName Leader, COUNT(poc.PersonID) Members, COUNT(pom.PersonID) Male,
COUNT(pof.PersonID) Female
FROM person p
LEFT JOIN percat pc ON p.PersonID=pc.PersonID LEFT JOIN category cat ON pc.CategoryID=cat.CategoryID
LEFT JOIN attendance ON p.PersonID=attendance.PersonID
LEFT JOIN perorg pol ON p.PersonID=pol.OrgID LEFT JOIN person ml ON pol.PersonID=ml.PersonID
LEFT JOIN perorg poc ON p.PersonID=poc.OrgID LEFT JOIN person mc ON poc.PersonID=mc.PersonID
LEFT JOIN perorg pom ON p.PersonID=pom.OrgID LEFT JOIN person mm ON pom.PersonID=mm.PersonID
LEFT JOIN perorg pof ON p.PersonID=pof.OrgID LEFT JOIN person mf ON pof.PersonID=mf.PersonID
WHERE p.PersonID=4787 AND attendance.EventID=7 AND pol.Leader=1 AND mm.Sex='M' AND mf.Sex='F'
GROUP by p.PersonID
...it correctly tells me things like the date range and leader's name, but anything in multiple goes wild - there are 1800 members on this team and they are all both male and female! ;-)

How do I separate these things so that I get the correct answers for each of the sets of joins instead of them compounding on each other? It occurred to me to try making views for the troublesome parts separately, but I haven't tried it (and if possible I'd rather figure out how to do it just in a query so that I can do other queries like it in the future without continually adding views to the database).
 
instead of joining to multiple tables, and then doing aggregates at the top level (outer query), you should be doing the aggregates at the subquery level, and then joining the totals to the main query
Code:
SELECT p.FullName AS TeamName
     , c.Categories
     , o.Members
  FROM person AS p
LEFT OUTER 
  JOIN ( SELECT pc.PersonID
              , GROUP_CONCAT(cat.Category 
                  ORDER BY cat.Category 
                  SEPARATOR '\n') AS Categories
           FROM percat AS pc 
         INNER
           JOIN category AS cat 
             ON cat.CategoryID = pc.CategoryID 
         GROUP
             BY pc.PersonID ) AS c
    ON c.PersonID = p.PersonID
LEFT OUTER
  JOIN ( SELECT poc.OrgID
              , COUNT(poc.PersonID) Members
           FROM perorg AS poc 
         GROUP
             BY poc.OrgID ) AS o
    ON o.OrgID = p.PersonID
 WHERE p.PersonID = 4787

r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
 
That was a work of art! I didn't know one could JOIN to a whole nuther subquery (however, I am kinda new to subqueries, because until about a year ago one of the servers using my application was still running MySQL 3.23!). I learned several new techniques from your example that I can use throughout my code. :) I especially like the fact that I can isolate the JOIN/WHERE/GROUP BY completely withing the select section, because that allows a lot of flexibility when creating a query using dynamic code (like a user selecting which pieces of information he wants).

But I'm having trouble implementing it in one spot - it's probably just a dumb mistake somewhere in it, but I can't see it. For reference, the structure of my perorg table is:
Code:
CREATE TABLE `perorg` (
  `PersonID` int(11) unsigned NOT NULL,
  `OrgID` int(11) unsigned NOT NULL,
  `Leader` tinyint(1) NOT NULL default '0',
  PRIMARY KEY  (`PersonID`,`OrgID`),
  KEY `OrgID` (`OrgID`)
)
Since it is theoretically possible to have multiple leaders for a team, I decided to use the same sort of code as for categories to collect the leader name(s). Shortening my monster query to that part just for reading simplicity, I have:
Code:
SELECT p.FullName AS TeamName,
  l.Leaders
FROM person AS p
LEFT OUTER JOIN
  ( SELECT pol.PersonID, GROUP_CONCAT(pl.FullName 
    ORDER BY pl.Furigana SEPARATOR ',') AS Leaders
    FROM perorg AS pol
    INNER JOIN person AS pl ON pl.PersonID = pol.PersonID
    WHERE pol.Leader=1  
    GROUP BY pol.PersonID )
  AS l ON l.PersonID = p.PersonID
WHERE p.PersonID=4787
The leader's name does not appear. A query of the count of the leaders (like the counting of members) works fine. Where is my mistake?
 
i'm not really sure

i'll bet it has something to do with GROUP_CONCAT returning a blob

try using [blue]LEFT(GROUP_CONCAT(...),255) AS Leaders[/blue]

r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
 
No change - the result is still blank. And the similar code for categories (which I took almost verbatim from you wrote) works fine without LEFT.
 
then it's gotta be either your joins are incorrect in the subquery, or your data doesn't look like what you think it does

r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
 
Ah, I finally spotted the problem - I was using PersonID in a couple places where I needed OrgID, which is the connection back to the top-level record. For curious onlookers, the correct version is:
Code:
SELECT p.FullName AS TeamName,
  l.Leaders
FROM person AS p
LEFT OUTER JOIN
  ( SELECT pol.OrgID, GROUP_CONCAT(pl.FullName 
    ORDER BY pl.Furigana SEPARATOR ',') AS Leaders
    FROM perorg AS pol
    INNER JOIN person AS pl ON pl.PersonID = pol.PersonID
    WHERE pol.Leader=1  
    GROUP BY pol.OrgID )
  AS l ON l.OrgID = p.PersonID
WHERE p.PersonID=4787
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top