Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Explain GROUP BY 1

Status
Not open for further replies.

uniopp

Technical User
Oct 7, 2001
152
JP
Hi,
Could someone give me a basic explaination of what GROUP BY does? It appears to me to be similar to DISTINCT.
Does it just group together rows with the same value?
So in the following example -

SELECT colname1, colname2, colname3
FROM tablename
WHERE colname1 = something
GROUP BY colname1

does it just group together all the colname1 values that are the same and only return one result for that value???

What happens if I put two colnames in the GROUP BY like -

SELECT colname1, colname2, colname3
FROM tablename
WHERE colname1 = something
GROUP BY colname1, colname2

what does that do??

I have read through a lot of posts and tutorials but kind find a simple explaination.
Thank you.
Simon.



 
it is possible that you are getting confused by the way mysql allows you to have invalid GROUP BY constructions...

... such as your two examples

you are right, GROUP BY is very much like DISTINCT

the difference is that with GROUP BY you are allowed aggregate functions

when the SELECT list contains expressions involving aggregate functions, you may or may not need a GROUP BY clause

for example

select count(*), sum(amount)
from yourtable

will give you the number of rows in the table and the sum of the amount column -- in effect, the entire table is the group

but notice, that SELECT list contains only aggregate expressions

the minute you have a non-aggregate expression in the SELECT list, those columns must be in the GROUP BY

select dept, count(*), sum(amount)
from yourtable
group by dept

will give rows and amount totals by dept

coming back to your examples, you can see that the non-aggregate columns in the SELECT list are not all in the GROUP BY

only mysql lets you get away with this, and they do caution you that results are unpredictable (it's in their docs somewhere, i'm too lazy to get the url right now)

the best way to learn GROUP BY is to run it a few times on your own tables

remember, all non-aggregate columns in the SELECT list must be in the GROUP BY


rudy
 
Hi,
Thank you rudy for you help.
I think I am starting to understand but would appreciate a little more feedback with the following table example.


c1 c2 c3 c4 c5
-----------------
A ! 1 ! x ! 2 ! w
B ! 2 ! y ! 3 ! o
A ! 3 ! x ! 1 ! w
A ! 4 ! x ! 2 ! w
C ! 1 ! z ! 3 ! k
A ! 2 ! x ! 1 ! w
B ! 2 ! y ! 1 ! o
D ! 1 ! z ! 3 ! w
D ! 3 ! z ! 2 ! w


What I want my query to do is get only ONE result for each c1 where c5 is equal to w. ( c2 and c4 have random values and I don't need to use that data in this query). In all cases of row A - c3 will always be x and c5 will always be w (this applies to all rows respectively). I don't need to do any calculations,counting, etc. in the query.

So what I want to return from the above table is -

c1 c3 c5
-----------------
A ! x ! w
D ! z ! w

Can I use either of the following queries or is one better than the other?

select distinct c1, c3, c5
from t
where c5=w
order by c1

OR

select c1, c3, c5
from t
where c5=w
group by c1, c3, c5
order by c1

Also, what do "aggregations on columns " and "aggregate function" mean?

Thank you.
Simon.

 
looks like you understand grouping quite well

both of the queries that you have given will work, both will produce the same results, and, theoretically anyway, both should perform the same

as for your closing question about aggregates, those are simply what you suggested -- "calculations, counting, etc."

aggregates work on a set of values -- a column, if you like

this column of values, i.e. which values are in the set, are determined by the groups

i prefer to use GROUP BY even if the query does not involve aggregates, because to add an aggregate calculation is a trivial change, whereas to add an aggregate calculation to a query written as a DISTINCT means rewriting it


rudy
 
rudi,
Thanks for your lightning fast response.
I think I've got the hang of it.
Simon.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top