Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Row Count Challenge

Status
Not open for further replies.

donutman

Programmer
Dec 20, 2002
2,481
0
0
US
The challenge is to find the most efficient way to generate a row counter within groups. I’m going to test it against my database of customers 190,000 distinct customers with 180,000 distinct customer orders (old ones have been warehoused) having 240,000 order details from 2800 products. I’m fairly sure I’ve indexed my tables properly.
The recordset that I want to generate is the top 50 distinct customers who have the most number of distinct items ordered within a single order. I’ve created example data and tables to make this fairly easy to test your SQL against, but let’s change the 50 to just top 2 for simplicity.
Here are a few special considerations:
[ul][li]If the same customer has more than one order only select the order that contains the most number of DISTINCT items ordered.[/li]
[li]The top 2 refers to the Count(Items) per customer order not to the top 2 customers consequently the final listing can have more than two customers with the same Count(Items).[/li]
[li]Use temp tables if you like, but it would be interesting to compare single query solutions against all others.[/li][/ul]
I haven’t done it yet, but my intention is to write a cursor approach for comparison. With the example data, you should get these results:
[tt]
# PDesc OId CName
1 Apple Fritter 10 Chrissie1
2 Caramel Pecal Roll 10 Chrissie1
3 Chocolate Donut 10 Chrissie1
4 Custard Bismark 10 Chrissie1
1 Caramel Pecal Roll 12 ESQuared
2 Chocolate Donut 12 ESQuared
3 Custard Bismark 12 ESQuared
1 Apple Fritter 13 SQLSister
2 Caramel Pecal Roll 13 SQLSister
3 Custard Bismark 13 SQLSister
[/tt]
Code:
[green]/*
DROP TABLE KarlsCustomer
DROP TABLE KarlsOrder
DROP TABLE KarlsOrderDetail
DROP TABLE KarlsProduct
*/
[/green][Blue]CREATE[/Blue] [Blue]TABLE[/Blue] KarlsCustomer
    [Gray]([/Gray]CId [Blue]int[/Blue] [Fuchsia]Identity[/Fuchsia] [Gray]([/Gray]1[Gray],[/Gray]1[Gray])[/Gray] [Blue]PRIMARY[/Blue] [Blue]KEY[/Blue][Gray],[/Gray]
     CName [Blue]varchar[/Blue][Gray]([/Gray]10[Gray])[/Gray][Gray])[/Gray]
[Blue]CREATE[/Blue] [Blue]TABLE[/Blue] KarlsOrder 
    [Gray]([/Gray]OId [Blue]int[/Blue] [Blue]PRIMARY[/Blue] [Blue]KEY[/Blue][Gray],[/Gray]
     CId [Blue]int[/Blue][Gray])[/Gray] 
[Blue]CREATE[/Blue] [Blue]TABLE[/Blue] KarlsOrderDetail
    [Gray]([/Gray]ODId [Blue]int[/Blue] [Blue]PRIMARY[/Blue] [Blue]KEY[/Blue][Gray],[/Gray]
     OId [Blue]int[/Blue][Gray],[/Gray]
     PId [Blue]int[/Blue][Gray])[/Gray]
[Blue]CREATE[/Blue] [Blue]TABLE[/Blue] KarlsProduct
    [Gray]([/Gray]PId [Blue]int[/Blue] [Fuchsia]Identity[/Fuchsia] [Gray]([/Gray]1[Gray],[/Gray]1[Gray])[/Gray] [Blue]PRIMARY[/Blue] [Blue]KEY[/Blue][Gray],[/Gray]
     PDesc [Blue]varchar[/Blue][Gray]([/Gray]20[Gray])[/Gray][Gray])[/Gray]

[Blue]INSERT[/Blue] [Blue]INTO[/Blue] KarlsCustomer
    [Blue]SELECT[/Blue] [red]'SQLSister'[/red] [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] [red]'vongrunt'[/red] [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] [red]'ESQuared'[/red] [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] [red]'Chrissie1'[/red]

[Blue]INSERT[/Blue] [Blue]INTO[/Blue] KarlsProduct
    [Blue]SELECT[/Blue] [red]'Chocolate Donut'[/red] [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] [red]'Apple Fritter'[/red] [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] [red]'Caramel Pecal Roll'[/red] [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] [red]'Custard Bismark'[/red]

[Blue]INSERT[/Blue] [Blue]INTO[/Blue] KarlsOrderDetail
    [Blue]SELECT[/Blue] 101[Gray],[/Gray]10[Gray],[/Gray]1 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 102[Gray],[/Gray]10[Gray],[/Gray]2 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 103[Gray],[/Gray]10[Gray],[/Gray]3 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 104[Gray],[/Gray]10[Gray],[/Gray]4 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 105[Gray],[/Gray]11[Gray],[/Gray]1 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 106[Gray],[/Gray]11[Gray],[/Gray]3 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 107[Gray],[/Gray]11[Gray],[/Gray]4 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 108[Gray],[/Gray]11[Gray],[/Gray]1 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 109[Gray],[/Gray]11[Gray],[/Gray]3 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 110[Gray],[/Gray]12[Gray],[/Gray]2 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 111[Gray],[/Gray]12[Gray],[/Gray]3 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 112[Gray],[/Gray]12[Gray],[/Gray]4 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 113[Gray],[/Gray]13[Gray],[/Gray]1 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 114[Gray],[/Gray]13[Gray],[/Gray]2 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 115[Gray],[/Gray]13[Gray],[/Gray]4 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 116[Gray],[/Gray]14[Gray],[/Gray]1 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 117[Gray],[/Gray]14[Gray],[/Gray]2 

[Blue]INSERT[/Blue] [Blue]INTO[/Blue] KarlsOrder
    [Blue]SELECT[/Blue] 10[Gray],[/Gray] 1 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 11[Gray],[/Gray] 1 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 12[Gray],[/Gray] 2 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 13[Gray],[/Gray] 3 [Blue]UNION[/Blue]
    [Blue]SELECT[/Blue] 14[Gray],[/Gray] 4
-Karl


[red] Cursors, triggers, user-defined functions and dynamic SQL are an axis of evil![/red]
[green]Life's uncertain...eat dessert first...www.deerfieldbakery.com[/green]
 
I have two objections to the solution:
Code:
[Blue]SELECT[/Blue] KarlsOrder.OID [Gray],[/Gray] CID [Gray],[/Gray] MaxOrderSize [Gray]=[/Gray] [Fuchsia]MAX[/Fuchsia][Gray]([/Gray]OrderSize[Gray])[/Gray] [Blue]INTO[/Blue] #ORDER
    [Blue]FROM[/Blue] KarlsOrder 
    [Blue]INNER[/Blue] [Gray]JOIN[/Gray]
     [Gray]([/Gray] [Blue]SELECT[/Blue] OID [Gray],[/Gray] [Fuchsia]COUNT[/Fuchsia][Gray]([/Gray][Blue]DISTINCT[/Blue] PID[Gray])[/Gray] [Blue]AS[/Blue] ORDERSIZE 
        [Blue]FROM[/Blue] KarlsOrderDetail [Blue]GROUP[/Blue] [Blue]BY[/Blue] OID 
     [Gray])[/Gray] DETAILS
    [Blue]ON[/Blue] KarlsOrder.OID [Gray]=[/Gray]     DETAILS.OID
    [Blue]GROUP[/Blue] [Blue]BY[/Blue] KarlsOrder.OID [Gray],[/Gray] KarlsOrder.CID
That code will create an excessively large temporary table when run on the production table. It creates one row for every order. My solution was to find the smallest OrderSize that would be included in the final result, then create a temporary table with a more limited set of candidate orders.
Code:
[Blue]SELECT[/Blue] [Blue]DISTINCT[/Blue] [Blue]TOP[/Blue] 2 
       MaxOrderSize [Gray]=[/Gray] [Fuchsia]MAX[/Fuchsia][Gray]([/Gray]MaxOrderSize[Gray])[/Gray] [Blue]FROM[/Blue] #ORDER
   [Blue]GROUP[/Blue] [Blue]BY[/Blue] CID [Blue]ORDER[/Blue] [Blue]BY[/Blue] MaxOrderSize [Blue]DESC[/Blue]
That code selects the top 2 based upon OrderSize. It should be selecting the top 2 from the CId then including other CId's who have the same OrderSize as the smallest of the top 2. As ESquared pointed out, the original post was not very clear on that.
When I applied the solution to my production tables, the top 2 was top 50, consequently almost every order was included since the #1 OrderSize was 57.
The solution that I previously posted and translated to work on the sample data is:
Code:
[Blue]DECLARE[/Blue] @Min [Blue]AS[/Blue] [Blue]int[/Blue]
[Blue]SELECT[/Blue] 
  [Blue]TOP[/Blue] 2 DT.CId[Gray],[/Gray] [Fuchsia]MAX[/Fuchsia][Gray]([/Gray]DT.ItemCount[Gray])[/Gray] ItemCount [Blue]INTO[/Blue] #Temp [Blue]FROM[/Blue]  
     [Gray]([/Gray][Blue]SELECT[/Blue] [Blue]TOP[/Blue] 100 [Blue]PERCENT[/Blue] CO.CId[Gray],[/Gray] CO.OId[Gray],[/Gray] [Fuchsia]COUNT[/Fuchsia][Gray]([/Gray][Blue]DISTINCT[/Blue] PId[Gray])[/Gray] ItemCount
      [Blue]FROM[/Blue] KarlsOrder CO [Blue]INNER[/Blue] [Gray]JOIN[/Gray] KarlsOrderDetail CGI
         [Blue]ON[/Blue] CO.OId[Gray]=[/Gray]CGI.OId
      [Blue]GROUP[/Blue] [Blue]BY[/Blue] CO.CId[Gray],[/Gray] CO.OId
      [Blue]ORDER[/Blue] [Blue]BY[/Blue] ItemCount [Blue]DESC[/Blue][Gray])[/Gray] DT
   [Blue]GROUP[/Blue] [Blue]BY[/Blue] CId
   [Blue]ORDER[/Blue] [Blue]BY[/Blue] ItemCount [Blue]DESC[/Blue]
[Blue]SET[/Blue] @Min[Gray]=[/Gray][Gray]([/Gray][Blue]SELECT[/Blue] [Fuchsia]MIN[/Fuchsia][Gray]([/Gray]ItemCount[Gray])[/Gray] [Blue]FROM[/Blue] #Temp[Gray])[/Gray]
[Blue]DROP[/Blue] [Blue]TABLE[/Blue] #Temp  
[Blue]SELECT[/Blue] CO.CId[Gray],[/Gray] CO.OId[Gray],[/Gray] [Fuchsia]COUNT[/Fuchsia][Gray]([/Gray][Blue]DISTINCT[/Blue] PId[Gray])[/Gray] ItemCount
   [Blue]INTO[/Blue] #TempCO
      [Blue]FROM[/Blue] KarlsOrder CO [Blue]INNER[/Blue] [Gray]JOIN[/Gray] KarlsOrderDetail CGI
         [Blue]ON[/Blue] CO.OId[Gray]=[/Gray]CGI.OId
      [Blue]GROUP[/Blue] [Blue]BY[/Blue] CO.CId[Gray],[/Gray] CO.OId
      [Blue]HAVING[/Blue] [Fuchsia]COUNT[/Fuchsia][Gray]([/Gray][Blue]DISTINCT[/Blue] PId[Gray])[/Gray][Gray]>[/Gray][Gray]=[/Gray]@Min 
[Blue]SELECT[/Blue] CId[Gray],[/Gray] [Fuchsia]MAX[/Fuchsia][Gray]([/Gray]OId[Gray])[/Gray] OId[Gray],[/Gray] [Fuchsia]MAX[/Fuchsia][Gray]([/Gray]ItemCount[Gray])[/Gray] ItemCount [Blue]INTO[/Blue] #TempCnt [Blue]FROM[/Blue] #TempCO T1  
   [Blue]WHERE[/Blue] ItemCount[Gray]=[/Gray][Gray]([/Gray][Blue]SELECT[/Blue] [Fuchsia]MAX[/Fuchsia][Gray]([/Gray]ItemCount[Gray])[/Gray]
                       [Blue]FROM[/Blue] #TempCO 
                       [Blue]WHERE[/Blue] CId[Gray]=[/Gray]T1.CId[Gray])[/Gray]
   [Blue]GROUP[/Blue] [Blue]BY[/Blue] CId
[Blue]SELECT[/Blue] RowCnt[Gray]=[/Gray][Gray]([/Gray][Blue]SELECT[/Blue] [Fuchsia]COUNT[/Fuchsia][Gray]([/Gray][Blue]DISTINCT[/Blue] PId[Gray])[/Gray] 
                  [Blue]FROM[/Blue] KarlsOrderDetail
                  [Blue]WHERE[/Blue] OId[Gray]=[/Gray]CGI.OId
                     [Gray]AND[/Gray] PId[Gray]<[/Gray][Gray]=[/Gray]CGI.PId[Gray])[/Gray][Gray],[/Gray] PDesc[Gray],[/Gray] CGI.OId[Gray],[/Gray] CName[Gray],[/Gray] ItemCount
   [Blue]FROM[/Blue] #TempCnt T1 [Blue]INNER[/Blue] [Gray]JOIN[/Gray] KarlsOrderDetail CGI
      [Blue]ON[/Blue] T1.OId[Gray]=[/Gray]CGI.OId [Blue]INNER[/Blue] [Gray]JOIN[/Gray] KarlsProduct KP
      [Blue]ON[/Blue] CGI.PId[Gray]=[/Gray]KP.PId [Blue]INNER[/Blue] [Gray]JOIN[/Gray] KarlsCustomer KC
      [Blue]ON[/Blue] KC.CId[Gray]=[/Gray]T1.CId
   [Blue]ORDER[/Blue] [Blue]BY[/Blue] ItemCount [Blue]DESC[/Blue][Gray],[/Gray] CGI.OId[Gray],[/Gray] RowCnt
[Blue]DROP[/Blue] [Blue]TABLE[/Blue] #TempCO
[Blue]DROP[/Blue] [Blue]TABLE[/Blue] #TempCnt
I did have to add an ItemCount to the printed output in order to get it to sort properly. I'm disappointed that I didn't find this to be a problem where a cursor would beat the best alternative solution, but I've learned from it. I hope to soon present a problem where the cursor will triumph.
-Karl

[red] Cursors, triggers, user-defined functions and dynamic SQL are an axis of evil![/red]
[green]Life's uncertain...eat dessert first...www.deerfieldbakery.com[/green]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top