Row Count Reset On New Group

Status
Not open for further replies.

MSGCR · Technical User · Sep 16, 2004 · 16 · US
My feeble SQL skills have failed me on this one. I need a row count on buildings per location that resets on each new location. Can this be done in MS SQL?

Example:
Location  Building  Row_Count
1         A         1
1         B         2
1         C         3
2         A         1
2         B         2
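One way to get this numbering (assuming SQL Server 2005 or later, which added window functions) is ROW_NUMBER() OVER (PARTITION BY Location ORDER BY Building). A minimal runnable sketch, using SQLite through Python as a stand-in since its window functions use the same syntax (table and column names taken from the example above):

```python
import sqlite3  # bundled SQLite must be >= 3.25 for window functions

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Buildings (Location int, Building text)")
conn.executemany("INSERT INTO Buildings VALUES (?, ?)",
                 [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "B")])

# ROW_NUMBER() restarts at 1 for each new Location partition
rows = conn.execute("""
    SELECT Location, Building,
           ROW_NUMBER() OVER (PARTITION BY Location ORDER BY Building)
               AS Row_Count
      FROM Buildings
     ORDER BY Location, Building
""").fetchall()
for r in rows:
    print(r)
```

The same SELECT, minus the Python wrapper, should run as-is on SQL Server 2005+.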
 
Ok, I know I've beaten this to death, and I promise to get off my soapbox, but I've thought of a good explanation for the poor performance of cursors that I think will help people understand (if it's true) why cursors are so bad.
When T-SQL uses a cursor to loop through a table of potentially millions of rows, you are asking it to retrieve each row using low-level code via the FETCH statement (which is good). However, it then has to return to the calling stored procedure using high-level code. Even in a modest loop of, say, 100,000 iterations, that is like writing a stored procedure whose length is 100,000 times the number of lines in the loop body. Surely that will not perform well. I suspect that the SQL compiler isn't like a normal language compiler. I'm betting that it doesn't convert the entire procedure into a low-level language and then submit that object code to an assembler for execution. Instead, it creates object modules for each T-SQL statement and assembles and executes them individually.
A conservative way to test this would be to write a VB program that does nothing more than loop through 1 million rows, and compare its performance with that of a cursor. Ideally the VB program should be run on the server so that network traffic isn't a factor.
I've tried a purely SQL test which I think tends to support my theory.
Code:
DECLARE @Loop AS int
SET @Loop = 200
/************* Test One *************/
SELECT GetDate()
DECLARE @CoBarCode AS int
WHILE @Loop > 0 BEGIN
   DECLARE MyCursor CURSOR
      LOCAL FORWARD_ONLY FOR
      SELECT COBarCode FROM CustomerOrder
         WHERE COBarCode = 1002006 FOR UPDATE
   OPEN MyCursor
   FETCH NEXT FROM MyCursor INTO @CoBarCode
   SET @CoBarCode = 1002006
   CLOSE MyCursor
   DEALLOCATE MyCursor
   SET @Loop = @Loop - 1
END
SELECT GetDate()
/************* Test Two *************/
SET @Loop = 200
WHILE @Loop > 0 BEGIN
   UPDATE CustomerOrder
      SET @CoBarCode = 1002006
      WHERE COBarCode = 1002006
   SET @Loop = @Loop - 1
END
SELECT GetDate()
/************* Test Three *************/
UPDATE CustomerOrder
   SET COBarCode = COBarCode
   WHERE COBarCode IN (SELECT TOP 200 COBarCode FROM CustomerOrder)
SELECT GetDate()

The output of the Selects:
2004-12-22 13:59:38.470
2004-12-22 13:59:38.487
2004-12-22 13:59:38.503
2004-12-22 13:59:38.520

So allocating a cursor, opening it, fetching from it, closing it, and deallocating it 200 times isn't much different from set-based updates. Looks like we have to be careful about general statements of how bad they are!
-Karl
P.S. Looks like I have to add some more keywords to my SQL formatter database. :)

Cursors, triggers, user-defined functions and dynamic SQL are an axis of evil!
Life's uncertain...eat dessert first...www.deerfieldbakery.com
 
Karl, I'm new to all of what you are talking about, but I'm trying to learn more. Wouldn't 200 rows be considered too small to determine the speed of your algorithms? What if it were 50,000,000 rows? From the tests you posted, it looks like the efficiency of your first test is n squared, while the second and third tests are n. The amount of memory used for the n-squared algorithm is relatively small because the row count is small. If the row count were bigger, wouldn't your results be much different? Can someone explain this to me? Am I on the right train of thought, or not?
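On the scaling question: with an index on the key, both the row-at-a-time loop and the set-based update grow roughly linearly with the row count; the cursor's penalty is fixed per-statement overhead, not a worse growth rate, so a larger n widens the gap by a constant factor rather than changing its shape. A rough, self-contained sketch of that comparison (using SQLite through Python rather than SQL Server, so the absolute timings are only illustrative):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val int)")
conn.executemany("INSERT INTO t VALUES (?, 0)",
                 [(i,) for i in range(50_000)])

# Row-at-a-time: one UPDATE statement per row, like a cursor loop.
start = time.perf_counter()
for i in range(50_000):
    conn.execute("UPDATE t SET val = 1 WHERE id = ?", (i,))
loop_secs = time.perf_counter() - start

# Set-based: a single statement touching every row.
start = time.perf_counter()
conn.execute("UPDATE t SET val = 2")
set_secs = time.perf_counter() - start

print(f"loop: {loop_secs:.3f}s  set-based: {set_secs:.3f}s")
```

Doubling the row count should roughly double both timings; what changes is the constant per-row cost, which is where the cursor loses.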
 
I'm trying to find out exactly when cursors are terrible. It's clear that if you use them to filter rows in a large table, you will bring your server to its knees. However, it looks to me as if there are certain types of problems for which cursors would be an acceptable alternative even when a set-based solution is possible. I'm going to present a performance challenge shortly.
-Karl

Cursors, triggers, user-defined functions and dynamic SQL are an axis of evil!
Life's uncertain...eat dessert first...www.deerfieldbakery.com
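The original question is itself a classic example of this trade-off: before ROW_NUMBER() existed (pre-2005 SQL Server), per-group numbering was a problem people often reached for cursors to solve, yet a set-based correlated subquery can do it too, at the cost of re-counting for every row. A minimal sketch, again using SQLite through Python as a stand-in, with the table and column names from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Buildings (Location int, Building text)")
conn.executemany("INSERT INTO Buildings VALUES (?, ?)",
                 [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "B")])

# Number buildings within each location by counting the rows that
# sort at or before the current one -- no cursor, no window function,
# so this also works on SQL Server 2000.
rows = conn.execute("""
    SELECT Location, Building,
           (SELECT COUNT(*) FROM Buildings AS b2
             WHERE b2.Location = b1.Location
               AND b2.Building <= b1.Building) AS Row_Count
      FROM Buildings AS b1
     ORDER BY Location, Building
""").fetchall()
for r in rows:
    print(r)
```

Note the subquery runs once per outer row, so on large groups this set-based form does quadratic work within each location, which is exactly the kind of case where the cursor-versus-set question gets interesting.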
 