Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Decipher Sql Server Script containing row number over partition 1

Status
Not open for further replies.

BxWill

MIS
Mar 30, 2009
367
US
Timely insight needed!

Been tasked with trouble shooting a archived sql script to ensure that it is extracting the correct data
(The person who created the sql script is not available...)

Relatively new to Sql Server sql scripts involving "ROW_NUMBER() OVER PARTITION."

What exactly is the sql script performing?

Based on a cursory review, it appears that it is extracting the 100 distinct account numbers where the Load_Date is the most recent.

Is this correct?

Why is "ROW_NUMBER() OVER PARTITION" used? Can this be accomplished by another method?

Of course, I am running the sql script via Management Studio but not readily able to determine if the Sql script can be simplified at this point.

Thanks in advance.


Code:
SELECT AccountNo, Max, Bonus
FROM (SELECT TOP (100) PERCENT AccountNo, Max, SUM(Bonus) AS Bonus, ROW_NUMBER() OVER (PARTITION BY [AccountNo] ORDER BY AccountNo, Max DESC) AS Row 
FROM (SELECT DISTINCT TOP (100) PERCENT AccountNo, ROUND(Bonus, 0) AS Bonus, MAX(Load_Date) AS Max
FROM dbo.Sales AS TMC
GROUP BY AccountNo, ROUND(Bonus, 0)
ORDER BY AccountNo) AS Tmc1
GROUP BY AccountNo, Max
ORDER BY AccountNo, Max DESC) AS Tmc2
WHERE (Row = 1)
ORDER BY AccountNo
 
This query:

Code:
SELECT  DISTINCT TOP (100) PERCENT 
        AccountNo, 
        ROUND(Bonus, 0) AS Bonus, 
        MAX(Load_Date) AS Max
FROM    dbo.Sales AS TMC
GROUP BY AccountNo, ROUND(Bonus, 0)
ORDER BY AccountNo

The query above is the basis of the total query. Notice that the query is grouped by AccountNo and Bonus. It is (at least theoretically) possible for multiple rows for each AccountNo. Because of this, the row_number function was implemented so that the data could eventually be filtered to just the top 1.

The Row_Number function is nothing magical, but it is certainly useful in many different queries. I encourage you to use this as an opportunity to learn how it is used.

Code:
ROW_NUMBER() OVER (PARTITION BY [AccountNo] ORDER BY AccountNo, Max DESC) AS Row

Row Number, without a partition by, will simply add an row to the output that starts at 1 and increments by one for each row in the output. The ordering of the row numbers is based on the ORDER BY part on the over clause. In your example, it orders first by AccountNo and then by the Max column descending.

The trick here is with the partition by clause. When you use partition by, the row numbering will restart at 1 every time the field(s) in the partition by changes. In this case, the partition by column is AccountNo. This means, that every time the AccountNo changes, the row number function. The row numbering within the partitions is based on the order by clause within the OVER clause.

BTW, it's not necessary to include the AccountNo in the Order By because it is partitioned by AccountNo.

Code:
ROW_NUMBER() OVER (PARTITION BY [AccountNo] ORDER BY [s]AccountNo, [/s]Max DESC) AS Row



-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks for the insight. Will review.

Overall, it is making much more sense.

What is still somewhat confusing is the multiple "FROM" clauses and the multiple "GROUP" clauses.

Initially, I thought that I could simplify by creating a temp table and loading all of the records...

So, it appears that the query is initially extracting the top 100 accounts with the most recent load dates and is not considering the top 100 accounts with the largest bonus.

Is this correct?

What modifications are necessary if I wanted to extract the accounts with the largest bonus?

Maybe instead of "MAX(Load_Date) AS Max", I will change to "Max Round(Bonus,0) as Max."

 
Does this make more sense?
Code:
SELECT AccountNo, Max, Bonus
FROM 
    (SELECT TOP (100) PERCENT AccountNo, Max, SUM(Bonus) AS Bonus, 
                    ROW_NUMBER() OVER (PARTITION BY [AccountNo] ORDER BY AccountNo, Max DESC) AS Row 
    FROM 
        (SELECT DISTINCT TOP (100) PERCENT AccountNo, ROUND(Bonus, 0) AS Bonus, MAX(Load_Date) AS Max
        FROM dbo.Sales AS TMC
        GROUP BY AccountNo, ROUND(Bonus, 0)
        ORDER BY AccountNo
        ) AS Tmc1
    GROUP BY AccountNo, Max
    ORDER BY AccountNo, Max DESC
    ) AS Tmc2
WHERE (Row = 1)
ORDER BY AccountNo
formating the data is very helpful in reading

djj
The Lord is my shepherd (Psalm 23) - I need someone to lead me!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top