Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

GROUP BY behavior - how does it summarize? 1

Status
Not open for further replies.

VBAjedi

Programmer
Dec 12, 2002
1,197
KH
I haven't found the answer to this in the mySQL help files or in my other usual sources. . . it essentially is a question of exactly how GROUP BY treats columns with numbers in them.

I have created a query that returns the following cols:

SalesmanID (number)
SalesmanName (text)
MonthlyTarget (number)
JanSales (number)
FebSales (number)
. . .
DecSales (number)

Due to a tables limitation, it returns each month's sales amount for a given SalesmanID as a seperate line/record. I'd like to use GROUP BY SalesmanID to roll all 12 lines up into one line for each SalesmanID.

How do I tell GROUP BY to SUM the numbers in JanSales, FebSales, etc for all entries with a given SalesmanID? Also, how do I tell it NOT to SUM the numbers in other columns (like MonthlyTarget), and what happens if the numbers in these columns are different on some of the relevant entries?

Thanks for any insight/resources you can offer!

VBAjedi [swords]
 
> Due to a tables limitation, it returns each month's sales amount for a given SalesmanID as a seperate line/record

i'd like to see why

> How do I tell GROUP BY to SUM the numbers in JanSales, FebSales, etc for all entries with a given SalesmanID?

[tt]select SalesmanID
, sum(JanSales) as SumJanSales
, sum(FebSales) as SumFebSales
...
, sum(DecSales) as SumDecSales
from thetable
group
by SalesmanID[/tt]

> Also, how do I tell it NOT to SUM the numbers in other columns (like MonthlyTarget), and what happens if the numbers in these columns are different on some of the relevant entries?

just leave them out


rudy
 
Rudy,

Thanks for the reply and suggestions!

We discussed this particularly poorly constructed set of tables in thread436-590374 - where you said my options were using a scripting language (not an option yet), or doing a 12-table join. At this point, I'm almost ready to try the join, but I wanted to try the approach I refer to in this thread first (return values as seperate records, then use "Group By" to roll them up into one returned record for each record in the Sales table).

I just don't understand exactly how "Group By" handles numbers. I have some columns that I want to sum, and some that I just want a single value out of (possibly using MAX?). Leaving them out won't work. . .

I'm having problems with the SUM function too because the values I want to SUM are calculated (not just a field). Again this is better described in my other post, but basically I have to test each record in a secondary table to see if a date value matches a specified year/month, and put a value in that periods column if so. The query just won't run if the following is in it:

SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) = CONCAT(Year(NOW()), 1), D.N_DOLLARVALUE, null)) as January,

Thanks again for any suggestions you can offer!

Luke



VBAjedi [swords]
 
i am, like, totally confused

could you please show the table layouts for the tables you're starting with

mysql, unlike access, does not let you query a query (mysql does not support views)

then show me a sample of the result set you want

rudy
 
Rudy,

Don't worry - I'm a bit confused myself! My first time using SQL, and it had to be this... this... THING!

I've made progress - I seem to have my "Group By" SUM's working. Now I need to figure out how to create a "Total Pmts" column after 12 monthly amount columns. I think I'm having trouble because they are calculated columns themselves, or because I'm using "Group By" in conjunction with "Sum". Ok, this will be a long post (my complete query is at the end). The two relevant tables are:

1) Sales. The relevant information about the Sales table to this particular issue is that my resultset needs to have one (and only one) line for each line in Sales (i.e. it is Left Outer Join'ed to my other tables - seven in all). This tables P.I. is PI_SalesID.
2) Distribution. It is NOT joined to anything - a WHERE clause matches up this table's FI_SALESID with PI_SALESID. It is described as follows:
Code:
Field            Type     Null             
---------------  -------  ------   
PI_DISID         int(11)         Table Key   
FI_SALESID       int(11)  YES    Foreign Index
D_DISDATE        date     YES    Not relevant
N_DOLLARVALUE    int(11)  YES    Installment amt.
N_UNITVALUE      int(11)  YES    Not relevant
D_DOLLARDISDATE  date     YES    Date pmt. due

Each record in the Distribution table represents one installment of a payment plan for one sale (PI_SALESID), so there might be 12, 18 or more records with the same FI_SALESID. One of my challenges was to generate, for each sale, a schedule of planned payments for whatever 12-month (December-November) fiscal period we are currently in (if any of the 12 payments fall within that period), and put this schedule in the 12 rightmost columns for each resultset item.

Here's my total query so far (including some stuff not relevant to my question):

SELECT
IF(LENGTH(I3.PI_IDENTID) > 0, I3.V_IDENTNAME, IF(LENGTH(I2.PI_IDENTID) > 0, I2.V_IDENTNAME, I1.V_IDENTNAME)) as Level1,
IF(LENGTH(I3.PI_IDENTID) > 0, I2.V_IDENTNAME, IF(LENGTH(I2.PI_IDENTID) > 0, I1.V_IDENTNAME, “”)) as Level2,
IF(LENGTH(I3.PI_IDENTID) > 0, I1.V_IDENTNAME, “”) as Level3,
S.V_OWNER as Owner,
F2.V_TEXT AS ContractType,
S.V_RATING AS Stage,
A.V_COMPANY AS AccountName,
S.V_PRODUCT AS Product,
S.V_TERM AS Term,
F.V_TEXT AS ContractNum,
S.V_UNITPERIOD AS NumberOfMonths,
S.D_STARTDATEDOLLAR AS StartDate,
S.D_EXPIRYDATE AS ExpirationDate,
S.N_UNITVALUE AS NumUnits,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()) - 1, 12), D.N_DOLLARVALUE, null)) as December,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 1), D.N_DOLLARVALUE, null)) as January,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 2), D.N_DOLLARVALUE, null)) as February,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 3), D.N_DOLLARVALUE, null)) as March,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 4), D.N_DOLLARVALUE, null)) as April,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 5), D.N_DOLLARVALUE, null)) as May,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 6), D.N_DOLLARVALUE, null)) as June,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 7), D.N_DOLLARVALUE, null)) as July,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 8), D.N_DOLLARVALUE, null)) as August,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 9), D.N_DOLLARVALUE, null)) as September,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 10), D.N_DOLLARVALUE, null)) as October,
SUM(IF(CONCAT(Year(D.D_DOLLARDISDATE), Month(D.D_DOLLARDISDATE)) =
CONCAT(Year(NOW()), 11), D.N_DOLLARVALUE, null)) as November

FROM
SALES AS S
LEFT OUTER JOIN ACCOUNT AS A ON S.FI_ACCOUNTID = A.PI_ACCOUNTID
LEFT OUTER JOIN IDENTIFIER AS I1 ON S.FI_IDENTID = I1.PI_IDENTID
LEFT OUTER JOIN IDENTIFIER AS I2 ON I1.N_IDENTCOMESUNDER = I2.PI_IDENTID
LEFT OUTER JOIN IDENTIFIER I3 ON I2.N_IDENTCOMESUNDER = I3.PI_IDENTID
LEFT OUTER JOIN FIELDVALUES AS F ON F.FI_CFID = 5 AND
F.FI_RECORDID = S.PI_SALESID
LEFT OUTER JOIN FIELDVALUES AS F2 ON F2.FI_CFID = 4 AND
F2.FI_RECORDID = S.PI_SALESID,
DISTRIBUTION AS D
WHERE
S.N_PROBABILITY = 100 AND
D.FI_SALESID = S.PI_SALESID
GROUP BY ContractNum, Level3, AccountName


Again, I'm brand new at this so there's probably all kinds of errors and poor approach issues. . .

Any idea how I can generate a "Total Pmts" column after the November column?

VBAjedi [swords]
 

> "including some stuff not relevant to my question"

oh, but it is...

luke, that query is astonishing, and i've been doing sql for over fifteen years

who showed you how to join? if no one, then congrats

what you have going on there in the first three columns is stupendous, and certainly well past beginner stage

your SUM() expressions for the months, while awkward, should work

your GROUP BY is wrong

GROUP BY must have every column in the SELECT list that's not in an aggregate expression

what i do, and what you've done, is put the non-aggregates first in the SELECT list, followed by the aggregates (your SUM()s are aggregates), so that you can simply copy n paste from the front of the SELECT list into the GROUP BY

then i go through the GROUP BY and remove all "as foo" aliases, leaving just the expression

for example, yours would start out like this --

[tt]GROUP
BY IF(LENGTH(I3.PI_IDENTID) > 0
, I3.V_IDENTNAME
, IF(LENGTH(I2.PI_IDENTID) > 0
, I2.V_IDENTNAME
, I1.V_IDENTNAME))
, IF(LENGTH(I3.PI_IDENTID) > 0
, I2.V_IDENTNAME
, IF(LENGTH(I2.PI_IDENTID) > 0
, I1.V_IDENTNAME
, ''))
, IF(LENGTH(I3.PI_IDENTID) > 0
, I1.V_IDENTNAME
, '')
, S.V_OWNER
, F2.V_TEXT
, S.V_RATING [/tt]


post again if you still have trouble
 
r937,

<sheepish grin> - Well, you certainly know how to heap on the praise, anyway! Two different forums couldn't generate a solution to what I was after in those first three columns, but I don't often accept &quot;impossible&quot; for an answer. So I did it myself. Nice to hear that it's not fundamentally flawed or something.

Ok, I implemented your correction on my Group By approach (although I had to read your post five times before I figured out the importance of doing it that way). It doubled the time my query takes (to five minutes!), but this monster will only be run once a day. . .

UNSOLVED ISSUES:
1) Still need help on how I can generate a &quot;Total Pmts - 12 Months&quot; column after the November column. In other words, how do I Total 12 SUM columns?

2) Once I have my 12 month total column, I want to filter out any returned records that have a total of 0 (i.e. sales with no payments scheduled in this year are not of interest to me). Any thoughts there?

Thanks so much for your help and encouragement. Both mean a lot to a new guy. . .


VBAjedi [swords]
 
1) just add them horizontally

[tt]select sum(...) as jan
, sum(...) as feb
, ...
, sum(...)
+sum(...)
+... as yeartotal[/tt]

2) GROUP BY HAVING

[tt]group by ...
having sum(...)
+sum(...)
+... > 0[/tt]

rudy
 
Rudy,

Took me a few days to get around to applying your suggestions. . . both worked beautifully. Have another star.

Thanks for your input and encouragement!

Luke

VBAjedi [swords]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top