Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

DISTINCT OR GROUP BY 2

Status
Not open for further replies.

suoirotciv

Programmer
Dec 3, 2002
205
0
0
PH
which is faster???

Distinct or Group By in SQL Please pardon the grammar.
Not good in english.
 
I'm don't care about your English.
But Distinct and Group by are two keyword for different purpose. So we can't compare them.
Please study more SQL if you want know more.
Good Luck and Have Fun
KieuTV
 
I'm don't care about your English.
But Distinct and Group by are two keyword for different purpose. So we can't compare them.
Please study more SQL if you want know more.
Good Luck and Have Fun
KieuTV
 
As Always, it depends.

The Group By is generally used in conjunction with functions for summarizing data (Average, Sum, etc)

Using Group By, if you have fields that will use a function, of course that will take longer than with-out those fields.


So, I guess, when comparing Distinct and Group By, we are comparing only an Sql statement which returns only fields and no functions (which anyways, only GROUP BY can do).

So the next factor, is the number of fields being returned.

When returning only one or two fields in a large table, then the DISTINCT will/should be much faster under normal circumstances.

However, when returning more than just a few fields from a large table (4,5,n), then the GROUP BY will probably be faster.

The number of fields vs. SELECT method vs. speed mentioned above is of course dependent on several factors, such as number of records and number of fields in the table itself.
So, I am just using an average size as criteria with respect to which method is faster based upon the number of fields returned.

Now, there are other factors, such as criteria and sorting that will affect the speed.

Criteria in a GROUPY BY using HAVING instead of WHERE will slow the results down more so.

Sorting the records is probably the biggest speed killer.
Always best not to use an ORDER BY if it is not really needed, and/or leave it up to the user to sort the records if desired.
Or, if using a client cursor, use the ADO's Sort method after the data is collected. The sort will then be done locally using the cached indexes.




[/b][/i][/u][sub]*******************************************************
General remarks:
If this post contains any suggestions for the use or distribution of code, components or files of any sort, it is still your responsibility to assure that you have the proper license and distribution rights to do so!
 
kieutv - sorry but i'm not that good that's why i'm asking . . . coz i need it right now . . . but if i dont need it right away . . . your right, i need to study more . . .

CCLINT - you got it right for me . . . and thank you for that . . . and pardon me if i dont get my request specific . . . but you really got what's on my mind . . . again, THANK YOU
Please pardon the grammar.
Not good in english.
 
I have to disagree a bit with CClint. Sorting isn't a killer to performance if proper indexs are in place. If no indexs are in place then it would be.

Group By is used for agrogated functions and thus is used after the result set is generated. Distinct clause may or may not cause a 2nd pass over the result set depending on what indexes may have been used in the initial reading of data. The more columns the more likely that a 2nd pass would be need but it should be faster or at least as fast as a group by on the same query.
 
SemperFiDownUnda - Thanks for the idea.

But i want to add another question regarding this . . . WHAT IS THE BEST WAY TO KNOW IF A CERTAIN FIELD/S NEEDS AN INDEX? Please pardon the grammar.
Not good in english.
 
This normally comes from looking at requirements.

Normally you can get your basic indexes from looking at reports, queries and screens. If you sort or search on a particular field often that is a prime canidate for an index.

Putting indexes on fields does slow down adding data so indexing everything most of the time isn't a real option.

After you have your database designed and implace with your application running (or even before if you know what queries will be run by the application) there are tools like MS SQL Server Performance and Tuning wizard. This traces the activity of the database for a given amount of time and analyses the results. It then makes suggestions on what indexes should be added by looking at the performance gain that would be achieved. It even allows you to impliment them with a click of the button.

This uses the same system as the graphical execution window in Query analyser. There you can see what your SQL statement execution plan looks like in, funny enough, graphical fashion. Here you can see what parts of the statement are taking the most time and focus in on the problems fairly easily.

It doesn't replace a good DBA but can make a good dent in what needs to be done. And for many systems can easily bring preformance up to par.....as long as the initial design is sound to begin with.

 
again thank you for the tip . . . Please pardon the grammar.
Not good in english.
 
Pardon my grammer I'm not good at English either .... but then agian I'm confused and an American living in Australia 8)
 

>I have to disagree a bit with CClint. Sorting isn't a killer to performance if proper indexs are in place.

Show me a query with a Sort and I will show one that is faster with-out the sort.

Certain factors will make a difference, such as cursor location and even what DBMS is being used.

But sorting will always make a difference in speed, which increases dramatically the more records there are.

suoirotciv has not mentioned where the data is coming from. When you answer these questions, take into consideration that not everyone is using something like SQL Server, or even ADO.
But, regardless, Sorting will always slow the querying of records down.

> The more columns the more likely that a 2nd pass would be need but it should be faster or at least as fast as a group by on the same query.

Again, not quite. The more columns, the slower DISTINCT will be, as compared to GROUP BY.

Do some testing with using DISTINCT and GROUP BY on, for instance, a table with 30 or 40 fields, 50,000 records, and a query on 5 or 10 fields.
Use the QueryPerfomance count and see how long it takes for the recordset to get it's data at the work station, on an average PC.

I'm sure you will see quite a difference.



[/b][/i][/u][sub]*******************************************************
General remarks:
If this post contains any suggestions for the use or distribution of code, components or files of any sort, it is still your responsibility to assure that you have the proper license and distribution rights to do so!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top