Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Top Ten Number of Occurences with Union too slow 1

Status
Not open for further replies.

larrydavid

Programmer
Jul 22, 2010
174
US
Hello,

Right now, in SS2005, to get the Top 10 number of occurences of codes that span 3 columns, I am using a dervied table with an outer and inner join, the inner join being a union:

Code:
SELECT TOP 10 [TEST_CODES], [CODES_count]  FROM
(
SELECT TOP 10 [TEST_CODES] = CASE WHEN t.CODE1 = '' THEN 'Blank' ELSE t.CODE1 END,    
    [CODES_count] = COUNT(*)     
FROM   TEST t
GROUP BY  t.[CODE1]
union all
SELECT TOP 10 CASE WHEN t.CODE2 = '' THEN 'Blank' ELSE t.CODE2 END,    
    COUNT(*) AS CODE2_count    
FROM   TEST t
GROUP BY  t.[CODE2]
union all
SELECT TOP 10 CASE WHEN t.CODE3= '' THEN 'Blank' ELSE t.CODE3 END,   
    COUNT(*) AS CODE3_count    

FROM   TEST T
GROUP BY  t.[CODE3]
)x 
ORDER BY [CODES_count] DESC

Is there a better way to do this? Right now the query is taking just under 2 minutes to run. I'd like to get it down to under 30 seconds. Any suggestions greatly appreciated.

Thanks,
Larry
 
I didn't think that a TOP query worked without an ORDER BY? It might not be giving the results you expect.

Anyway, you could try grouping and sorting it just once:
Code:
SELECT TOP 10 
  CODE
, CODES_COUNT
FROM
(
  SELECT
    CODE
  , count(*) CODES_COUNT
  FROM
  (
    SELECT
      CASE WHEN t.CODE1 = '' THEN 'Blank' ELSE t.CODE1 END CODE
    FROM   TEST t
    union all
    SELECT 
      CASE WHEN t.CODE2 = '' THEN 'Blank' ELSE t.CODE2 END CODE
    FROM   TEST t
    union all
    SELECT 
      CASE WHEN t.CODE3 = '' THEN 'Blank' ELSE t.CODE3 END CODE
    FROM   TEST t
  )x 
  GROUP BY
    CODE
) y
ORDER BY [CODES_COUNT] DESC

HTH
Simon.
 
Please understand that the query presented by Simon can return different results. This would happen when you have the same code in different columns. The query presented by Simon will count all 3 columns (adding the count from each column) while your query will only show the count from the column where it's in the top 10.

In this case, I think you will likely realize a performance improvement with indexes. Specifically, a separate index on each of those columns.

Also, you may want to re-think the table structure here. Since you are interested in combining the data from 3 columns in to a single one, that probably means that the data in each one represents the same thing. As such, you should probably consider creating another table that allows for storing of the codes.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Good advice as always, George. I did make the (baseless) assumption that Larry is interested in the top 10 by total occurrence, not per column.
 
Simon,

I am inclined to agree with your assumption, but in cases like this it is probably best to mention it so that there are no surprises.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks you guys for your help! Simon, this worked a treat! Awesome, thanks so much everyone for your help! I love this forum! Thanks Tek Tips!
 
Down from 1:53 seconds to 30 seconds and the results are spot on!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top