Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Simplify many joins - how? 1

Status
Not open for further replies.

woetech

Programmer
Mar 13, 2009
26
US
Hello. I'm generating a report that could possibly have 500 rows and over 31 columns. Most of the columns represent a single date in a date range that can go up to a month. The rows represent each of the items in the DB with data for all of those dates. The way I have the SQL now can get huge and not very fast or optimized, yet I don't know how to simplify it. Here's what I mean:

Code:
SELECT a.CID, a.TID, a.Field3, a.Field4, a.Field5, locc.Field6, tmp<date1ofrange>.Balance AS '<date1ofrange>', tmp<date2ofrange>.Balance AS '<date2ofrange>', tmp<date3ofrange>.Balance AS '<date3ofrange>', ...
FROM Term a
 LEFT JOIN Business locb ON locb.CID = a.CID AND locb.BID = a.LBID
 LEFT JOIN Contact locc ON locc.CID = locb.CID AND locc.CID = locb.CID
 JOIN TStatus b on a.TID = b.TID and b.ftrxtime <= '<endofdaterange>'
 LEFT JOIN (SELECT Sdate, TID, Balance FROM TSummary WHERE CID = 10 AND Sdate = '<date1ofrange>') tmp<date1ofrange> ON a.TID = tmp<date1ofrange>.TID
 LEFT JOIN (SELECT Sdate, TID, Balance FROM TSummary WHERE CID = 10 AND Sdate = '<date2ofrange>') tmp<date2ofrange> ON a.TID = tmp<date2ofrange>.TID
 LEFT JOIN (SELECT Sdate, TID, Balance FROM TSummary WHERE CID = 10 AND Sdate = '<date3ofrange>') tmp<date3ofrange> ON a.TID = tmp<date3ofrange>.TID
 ...
 ...
WHERE a.CID = 10 AND a.TID in (10866,12818,13788,15810,16006,19682,...,...)

As you can see, I can have many more items in the SELECT clause, which is fine, but the LEFT JOINs can also go up to 31 which is just too much to me. Plus the TIDs in the WHERE clause is what can go up to 500 or something (500 numbers in the IN part). That, and if I wanted to add a column total to the very last row returned, I'd have to do a UNION ALL and basically copy the query (with some minor changes) and thus double the entire query's size. There's got to be a better way...

Any suggestions, please? Maybe a stored procedure? Maybe somehow using the ROLLUP clause for the column totals? Thanks!

I'm here to help. :)
 
What version of SQL Server? If you are using SQL2005 (or up), I would encourage you to learn about the PIVOT statement.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
It is SQL 2005, so I guess I'll have to look into that. Thanks.

I'm here to help. :)
 
The classic SQL2000 method (that also works with all versions of SQL) would look something like this:

Code:
SELECT a.CID, a.TID, a.Field3, a.Field4, a.Field5, locc.Field6, 
       [!]Case When TSummary.SDate = '20091201' Then Balance End[/!] AS '<date1ofrange>', 
       Case When TSummary.SDate = '20091202' Then Balance End AS '<date2ofrange>', 
       Case When TSummary.SDate = '20091203' Then Balance End AS '<date3ofrange>', 
FROM Term a
 LEFT JOIN Business locb ON locb.CID = a.CID AND locb.BID = a.LBID
 LEFT JOIN Contact locc ON locc.CID = locb.CID AND locc.CID = locb.CID
 JOIN TStatus b on a.TID = b.TID and b.ftrxtime <= '<endofdaterange>'
[!] Left Join TSummary on A.TID = TSummary.TID And TSummary.CID = 10[/!]
WHERE a.CID = 10 AND a.TID in (10866,12818,13788,15810,16006,19682,...,...)

I have a couple comments about this, too.

This query assumes that there will only be one matching row in the summary table. If there are multiples, there's no guaranteeing which row the balance will come from.

Also, I think you don't need a left join on the summary table. Run it both ways. If both return the same results, use the inner join version because it is likely to perform better.


-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks gmmastros, but unfortunately your method doesn't work. Instead of returning (with my example data) 51 rows, it returns 72966 rows, most are just multiples/copies it looks like and there's no data at all in the date columns. I'll have to continue to check out the PIVOT clause and see what I can do. Thanks though. :)

I'm here to help. :)
 
Try adding a date range condition to the query, like this:

Code:
SELECT a.CID, a.TID, a.Field3, a.Field4, a.Field5, locc.Field6,
       Case When TSummary.SDate = '20091201' Then Balance End AS '<date1ofrange>',
       Case When TSummary.SDate = '20091202' Then Balance End AS '<date2ofrange>',
       Case When TSummary.SDate = '20091203' Then Balance End AS '<date3ofrange>',
FROM Term a
 LEFT JOIN Business locb ON locb.CID = a.CID AND locb.BID = a.LBID
 LEFT JOIN Contact locc ON locc.CID = locb.CID AND locc.CID = locb.CID
 JOIN TStatus b on a.TID = b.TID and b.ftrxtime <= '<endofdaterange>'
Inner Join TSummary on A.TID = TSummary.TID And TSummary.CID = 10 And TSummary.SDate >= '20091201' and TSummary.SDate < '20100101'
WHERE a.CID = 10 AND a.TID in (10866,12818,13788,15810,16006,19682,...,...)

If this doesn't return the data you are looking for, then I encourage you to post some sample data from the TSummary table. I'm certain that with a little perseverance we can get this to work.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
I will try to get some sample data though I can't just give the current as it's sensitive. Until then, here's what the report SHOULD look like:

CID TID Field3 Field4 Field5 Field6 11/01/09 11/02/09 11/03/09 11/04/09 11/05/09 11/06/09
10 10 Data1 some NULL Text1 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
10 11 Data2 NULL NULL Text2 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
10 12 Data3 NULL NULL Text3 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
10 13 Data4 NULL NULL Text4 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
10 14 Data5 NULL NULL Text5 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
10 15 Data6 NULL NULL Text6 2000.0000 2000.0000 2000.0000 2000.0000 2000.0000 2000.0000

And here's what it looks like with your query:

CID TID Field3 Field4 Field5 Field6 11/01/09 11/02/09 11/03/09 11/04/09 11/05/09 11/06/09
10 10 Data1 some NULL Text1 0.0000 NULL NULL NULL NULL NULL
10 10 Data1 some NULL Text1 NULL 0.0000 NULL NULL NULL NULL
10 10 Data1 some NULL Text1 NULL NULL 0.0000 NULL NULL NULL
10 10 Data1 some NULL Text1 NULL NULL NULL 0.0000 NULL NULL
10 10 Data1 some NULL Text1 NULL NULL NULL NULL 0.0000 NULL
10 10 Data1 some NULL Text1 NULL NULL NULL NULL NULL 0.0000
10 11 Data2 NULL NULL Text2 0.0000 NULL NULL NULL NULL NULL
10 11 Data2 NULL NULL Text2 NULL 0.0000 NULL NULL NULL NULL
10 11 Data2 NULL NULL Text2 NULL NULL 0.0000 NULL NULL NULL
10 11 Data2 NULL NULL Text2 NULL NULL NULL 0.0000 NULL NULL
10 11 Data2 NULL NULL Text2 NULL NULL NULL NULL 0.0000 NULL
10 11 Data2 NULL NULL Text2 NULL NULL NULL NULL NULL 0.0000
10 12 Data3 NULL NULL Text3 0.0000 NULL NULL NULL NULL NULL
10 12 Data3 NULL NULL Text3 NULL 0.0000 NULL NULL NULL NULL
10 12 Data3 NULL NULL Text3 NULL NULL 0.0000 NULL NULL NULL
10 12 Data3 NULL NULL Text3 NULL NULL NULL 0.0000 NULL NULL
10 12 Data3 NULL NULL Text3 NULL NULL NULL NULL 0.0000 NULL
10 12 Data3 NULL NULL Text3 NULL NULL NULL NULL NULL 0.0000
10 13 Data4 NULL NULL Text4 0.0000 NULL NULL NULL NULL NULL
10 13 Data4 NULL NULL Text4 NULL 0.0000 NULL NULL NULL NULL
10 13 Data4 NULL NULL Text4 NULL NULL 0.0000 NULL NULL NULL
10 13 Data4 NULL NULL Text4 NULL NULL NULL 0.0000 NULL NULL
10 13 Data4 NULL NULL Text4 NULL NULL NULL NULL 0.0000 NULL
10 13 Data4 NULL NULL Text4 NULL NULL NULL NULL NULL 0.0000
10 14 Data5 NULL NULL Text5 0.0000 NULL NULL NULL NULL NULL
10 14 Data5 NULL NULL Text5 NULL 0.0000 NULL NULL NULL NULL
10 14 Data5 NULL NULL Text5 NULL NULL 0.0000 NULL NULL NULL
10 14 Data5 NULL NULL Text5 NULL NULL NULL 0.0000 NULL NULL
10 14 Data5 NULL NULL Text5 NULL NULL NULL NULL 0.0000 NULL
10 14 Data5 NULL NULL Text5 NULL NULL NULL NULL NULL 0.0000
10 15 Data6 NULL NULL Text6 2000.0000 NULL NULL NULL NULL NULL
10 15 Data6 NULL NULL Text6 NULL 2000.0000 NULL NULL NULL NULL
10 15 Data6 NULL NULL Text6 NULL NULL 2000.0000 NULL NULL NULL
10 15 Data6 NULL NULL Text6 NULL NULL NULL 2000.0000 NULL NULL
10 15 Data6 NULL NULL Text6 NULL NULL NULL NULL 2000.0000 NULL
10 15 Data6 NULL NULL Text6 NULL NULL NULL NULL NULL 2000.0000

Hopefully that helps (as formatting may be off after posting) a bit and thanks so much!

I'm here to help. :)
 
Thus, it seems, whatever the date range is, the amount of days within, is the amount of extra rows per TID that I get. I'm sure it's something simple I'm missing, but any further help would be much appreciated. :)

I'm here to help. :)
 
I can see that we're getting close. Try this:

Code:
SELECT a.CID, a.TID, a.Field3, a.Field4, a.Field5, locc.Field6,
       Min(Case When TSummary.SDate = '20091201' Then Balance End) AS '<date1ofrange>',
       Min(Case When TSummary.SDate = '20091202' Then Balance End) AS '<date2ofrange>',
       Min(Case When TSummary.SDate = '20091203' Then Balance End) AS '<date3ofrange>',
FROM Term a
 LEFT JOIN Business locb ON locb.CID = a.CID AND locb.BID = a.LBID
 LEFT JOIN Contact locc ON locc.CID = locb.CID AND locc.CID = locb.CID
 JOIN TStatus b on a.TID = b.TID and b.ftrxtime <= '<endofdaterange>'
Inner Join TSummary on A.TID = TSummary.TID And TSummary.CID = 10 And TSummary.SDate >= '20091201' and TSummary.SDate < '20100101'
WHERE a.CID = 10 AND a.TID in (10866,12818,13788,15810,16006,19682,...,...)
Group By a.CID, a.TID, a.Field3, a.Field4, a.Field5, locc.Field6


-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
I don't know why I didn't think of that, but that seems to do the trick! Though, I'm wondering if the MIN function will ever skew the results in the SELECT... I don't think it will, but will research it to be sure. Thanks so much!! :D

I'm here to help. :)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top