Calculating a median value is a simple concept to the complex human mind. Simply put, the median value is the middle value in a list of
ordered values, with the same number of values above it as below it.
In a result set that contains 5 rows, row 3 will contain the median value (Example 1). When the result set contains an even number of rows, the median is calculated by averaging the two middle rows (Example 2).
Example 1 û Median=7 Example 2 û Median=5
[color blue]
[/color]
[color blue]
[/color]
The list of values must be ordered or the median calculated will be a random value. The difficulty for SQL Server developers lies in the fact that there is no order to SQL Server rows. This makes medians cumbersome to code.
The solution takes a few steps to pull off:
[ul][li]Create a holding table with an IDENTITY column to hold the ordered data[/li]
[li]INSERT the data into the holding table using an ORDER BY clause[/li]
[li]Capture the ROWCOUNT of the INSERT[/li]
[li]Do a little math to determine which row(s) you need to look at[/li]
[li]Use the IDENTITY column of the temp table to reference the appropriate rows and calculate the median[/li][/ul]
This is an example of how to calculate the median price in a table of prices:
[color maroon]
Code:
DECLARE @Median decimal(9,4)
, @mid int, @odd int, @mid1 int, @mid2 int
, @NumRows int
CREATE TABLE #Prices (
ID int IDENTITY(1,1)
, Price decimal(9,4)
)
INSERT INTO #Prices (Price)
SELECT Price
FROM PriceTable
Code:
SET @NumRows = @@ROWCOUNT
SET @odd = @NumRows % 2 --1 if odd number of rows, 0 if even
SET @mid = @NumRows / 2 --Number of the "middle" record
IF @NumRows = 1 BEGIN --Special trick so this works if only one record
SET @odd = 1
SET @mid = 1
END
-- Which row(s) hold the median value?
IF @odd = 1 BEGIN
--Odd number of rows - median is the middle one
SET @Median = (SELECT Price FROM #Prices WHERE ID = @mid)
END ELSE BEGIN
--Even number of rows - median is the average of the two middle ones
SET @mid1 = (SELECT Number FROM #Prices WHERE ID = @mid)
SET @mid2 = (SELECT Number FROM #Prices WHERE ID = @mid + 1)
SET @Median = (@mid1 + @mid2) / 2
END
SELECT @Median
[/color]
If, for instance, you have a table of prices by age and want to calculate the median price by age, you can wrap the above logic in a cursor to loop through the ages and calculate the median.
[color maroon]
Code:
--Holding table for median rows
CREATE TABLE #Prices (
ID int IDENTITY(1,1)
, Price decimal(9,4)
)
DECLARE @Age int, @Median decimal(9,4)
, @mid int, @odd int, @mid1 int, @mid2 int
, @NumRows int
DECLARE oCursor CURSOR FOR
SELECT DISTINCT Age
FROM PricesByAge
ORDER BY Age ASC
--Open said cursor and start stepping thru it
OPEN oCursor
WHILE 0=0 BEGIN
FETCH NEXT FROM oCursor INTO @Age
IF @@Fetch_Status<>0 BREAK
TRUNCATE TABLE #Prices
[color red]
Code:
--Insert the data for a given age
INSERT INTO #Prices (Price)
SELECT Price
FROM PricesByAge
WHERE Age = @Age
ORDER BY Price
--Now actually compute the median
SET @NumRows = @@ROWCOUNT
SET @odd = @NumRows % 2 --1 if odd number of recs, 0 if even
SET @mid = @NumRows / 2 --Number of the "middle" record
IF @NumRows = 1 BEGIN --Special trick so this works if only one record
SET @odd = 1
SET @mid = 1
END
-- Which row(s) hold the median value?
IF @odd = 1 BEGIN
--Odd number of rows - median is the middle one
SET @Median = (SELECT Price FROM #Prices WHERE ID = @mid)
END ELSE BEGIN
--Even number of rows - median is the average of the two middle ones
SET @mid1 = (SELECT Price FROM #Prices WHERE ID = @mid)
SET @mid2 = (SELECT Price FROM #Prices WHERE ID = @mid + 1)
SET @Median = (@mid1 + @mid2) / 2
END
SELECT @Age AS Age, @Median AS MedianPrice
[color maroon]
Code:
END
CLOSE oCursor
DEALLOCATE oCursor
[/color]