Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Calculating a median

T-SQL Hints and Tips

Calculating a median

by  JohnDTampaBay  Posted    (Edited  )
Calculating a median value is a simple concept to the complex human mind. Simply put, the median value is the middle value in a list of ordered values, with the same number of values above it as below it.

In a result set that contains 5 rows, row 3 will contain the median value (Example 1). When the result set contains an even number of rows, the median is calculated by averaging the two middle rows (Example 2).

Example 1 û Median=7 Example 2 û Median=5
Code:
         1                      1
         2                      2
[color blue]
Code:
         7                      3
[/color]
Code:
         8
[color blue]
Code:
                      7
[/color]
Code:
         9                      8
                                9

The list of values must be ordered or the median calculated will be a random value. The difficulty for SQL Server developers lies in the fact that there is no order to SQL Server rows. This makes medians cumbersome to code.

The solution takes a few steps to pull off:
[ul][li]Create a holding table with an IDENTITY column to hold the ordered data[/li]
[li]INSERT the data into the holding table using an ORDER BY clause[/li]
[li]Capture the ROWCOUNT of the INSERT[/li]
[li]Do a little math to determine which row(s) you need to look at[/li]
[li]Use the IDENTITY column of the temp table to reference the appropriate rows and calculate the median[/li][/ul]
This is an example of how to calculate the median price in a table of prices:

[color maroon]
Code:
DECLARE @Median decimal(9,4)
    , @mid int, @odd int, @mid1 int, @mid2 int
    , @NumRows  int

CREATE TABLE #Prices (
    ID int IDENTITY(1,1)
    , Price decimal(9,4)
)

INSERT INTO #Prices (Price)
SELECT Price
FROM PriceTable
Code:
ORDER BY Price
Code:
SET @NumRows  = @@ROWCOUNT    
SET @odd = @NumRows  % 2 --1 if odd number of rows, 0 if even
SET @mid = @NumRows  / 2 --Number of the "middle" record

IF @NumRows  = 1 BEGIN   --Special trick so this works if only one record
    SET @odd = 1
    SET @mid = 1
END

-- Which row(s) hold the median value?
IF @odd = 1 BEGIN
    --Odd number of rows - median is the middle one
    SET @Median = (SELECT Price FROM #Prices WHERE ID = @mid)
END ELSE BEGIN
    --Even number of rows - median is the average of the two middle ones
    SET @mid1 = (SELECT Number FROM #Prices WHERE ID = @mid)
    SET @mid2 = (SELECT Number FROM #Prices WHERE ID = @mid + 1)
    SET @Median = (@mid1 + @mid2) / 2
END
    
SELECT @Median
[/color]

If, for instance, you have a table of prices by age and want to calculate the median price by age, you can wrap the above logic in a cursor to loop through the ages and calculate the median.

[color maroon]
Code:
--Holding table for median rows
CREATE TABLE #Prices (
    ID int IDENTITY(1,1)
    , Price decimal(9,4)
)

DECLARE @Age int, @Median decimal(9,4)
    , @mid int, @odd int, @mid1 int, @mid2 int
    , @NumRows  int

DECLARE oCursor CURSOR FOR
    SELECT DISTINCT Age
    FROM PricesByAge
    ORDER BY Age ASC

--Open said cursor and start stepping thru it
OPEN oCursor

WHILE 0=0 BEGIN
    FETCH NEXT FROM oCursor INTO @Age
    IF @@Fetch_Status<>0 BREAK

    TRUNCATE TABLE #Prices
[color red]
Code:
    --Insert the data for a given age
    INSERT INTO #Prices (Price)
    SELECT Price
    FROM PricesByAge
    WHERE Age = @Age
    ORDER BY Price


    --Now actually compute the median
    SET @NumRows  = @@ROWCOUNT    
    SET @odd = @NumRows  % 2        --1 if odd number of recs, 0 if even
    SET @mid = @NumRows  / 2        --Number of the "middle" record

    IF @NumRows  = 1 BEGIN            --Special trick so this works if only one record
        SET @odd = 1
        SET @mid = 1
    END

    -- Which row(s) hold the median value?
    IF @odd = 1 BEGIN
        --Odd number of rows - median is the middle one
        SET @Median = (SELECT Price FROM #Prices WHERE ID = @mid)
    END ELSE BEGIN
        --Even number of rows - median is the average of the two middle ones
        SET @mid1 = (SELECT Price FROM #Prices WHERE ID = @mid)
        SET @mid2 = (SELECT Price FROM #Prices WHERE ID = @mid + 1)
        SET @Median = (@mid1 + @mid2) / 2
    END
    
    SELECT @Age AS Age, @Median AS MedianPrice
[color maroon]
Code:
END

CLOSE oCursor 
DEALLOCATE oCursor
[/color]
Register to rate this FAQ  : BAD 1 2 3 4 5 6 7 8 9 10 GOOD
Please Note: 1 is Bad, 10 is Good :-)

Part and Inventory Search

Back
Top