Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

SQL help - pull avg from TOP 3 for each customer in large Database 1

Status
Not open for further replies.

TrentGreenawalt

Technical User
Jan 18, 2002
159
US
I am trying to get an avg of the "avg fast slow" days of the latest (top3) for each customer ID. Given the data below is a sample table, I actually have 1000's of customers and over 100k records in this table. As you will see there is no entry in a given month if there are no bills paid, so I am not able to use a greater than query. I have played around with TOP 3 and that works for a particular customer, but not for all customers.

Customer ID Year Month Avg fast slow days
1234 2008 5 -5
1234 2008 4 -3
1234 2008 2 2
1234 2008 1 3
1234 2007 11 1
1234 2007 9 -3
2345 2008 5 -2
2345 2008 3 5
2345 2008 1 -2
2345 2007 12 1
9999 2008 5 2
9999 2008 4 1
9999 2008 3 5

Here is the query so far:

This will give the avg of all of the slow fast day columns, but I only need the top 3 most recent:

Select customer_id, avg(avg_fast_Slow_days)
from Table1
Group by customer_id, avg_fast_slow_days
Order by 2 desc



Then this is where I am at with the Top 3:
Select top 3 customer_id, avg(AVG_Fast_Slow_Days), year_invoiced, month_invoiced
from table1
where customer_ID = 9999
Group by customer_Id, avg_fast_slow_days, year_invoiced, Month_invoiced
Order by 3 desc,4 desc

I know this an obvious one line of code fix, but I just can't seem to get it.


I am simply looking to output

Cust ID | 3 months Avg
9999 2.5
1234 1.5
I made up these numbers but you should get the gist.

Thanks for the help,

Trent
 
I think I am getting closer:
Select customer_id,
(select top 3 avg(AVG_Fast_Slow_Days)
from table1 X
Where x.customer_id = Y.customer_id)
from table1 Y
Group by customer_Id
Order by 1 asc

Though the values still aren't the same as our ERP screen:( Does this look right so far?
 
Thanks Denis, that is definitly pointing me in the right direction. I think I can get it the other way as well, but this looks a lot cleaner then having all the subqueries like I was building.

I will post a solution unless someone else gets it first.

Thanks again for the help,
Trent
 
Well I am still stuck on this. Neither method has given me results that I need.

I have managed to get good data (Top 3) if I hard code in the cust_id:

select top 3 customer_id, (AVG_Fast_Slow_Days), year_invoiced, month_invoiced
from Table1 X
Where x.customer_id =100078
AND (year_invoiced = datepart(yyyy, getdate())AND month_invoiced <= (datepart(mm, getdate())-1)
OR Year_invoiced < datepart(yyyy, getdate()))
order by 3 desc, 4 desc

(This outputs the top 3 results for the customer, but still doesn't avg them)

But when I try to get this to output for all customers, it doesn't work. I can't get the avg to work either. This is really aggravating.

To top that off, I just manually checked a value in the system to see if the calculate even equals what I enter in on a calculator and the value is off. Leave it to a propritary ERP system to provide a value, but then not tell me how it is derrived. ARG.

If anyone can get this to run through all customers and provide one line per customer with the top 3 avg'd to one, I am forever in your debt.

Thanks,
Trent
 
What version of SQL Server are you using?

If you're not sure, run this...

Select @@Version


-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
I would approach this using a table variable solution.

The idea is... create a table variable with an identity column that you can use for this purpose. Then, with a little crafty sql, you should be able to get your results.

In the code I show below, I create a table variable (@YourTable) that represents your table. I do this so that I can hard-code some dummy data to test the query.

Then, you'll see that I create a table variable (@Temp) with an identity column. This column can be used to help you get the top 3 per customer. Once you have that, the rest is pretty easy. I wonder if the reason you were getting the wrong data was due to data type issues. If you average an integer column, the result is an integer, and can therefore be misleading.

Anyway, the code is...

Code:
[green]-- Set up some dummy data to test[/green]
Declare @YourTable Table(CustomerID Int, Year_Invoiced Int, Month_Invoiced Int, AVG_Fast_Slow_Days Int)

Insert Into @YourTable Values(1234,2008,   5 ,  -5)
Insert Into @YourTable Values(1234,2008,   4 ,  -3)
Insert Into @YourTable Values(1234,2008,   2 ,   2)
Insert Into @YourTable Values(1234,2008,   1 ,   3)
Insert Into @YourTable Values(1234,2007,   11,   1)
Insert Into @YourTable Values(1234,2007,   9 ,  -3)
Insert Into @YourTable Values(2345,2008,   5 ,  -2)
Insert Into @YourTable Values(2345,2008,   3 ,   5)
Insert Into @YourTable Values(2345,2008,   1 ,  -2)
Insert Into @YourTable Values(2345,2007,  12 ,   1)
Insert Into @YourTable Values(9999,2008,   5 ,   2)
Insert Into @YourTable Values(9999,2008,   4 ,   1)
Insert Into @YourTable Values(9999,2008,   3 ,   5)

[green]-- Create a table variable to help us with top 3[/green]
Declare @Temp Table(RowId Int Identity(1,1), CustomerId Int, Year_Invoiced Int, Month_Invoiced Int, AVG_Fast_Slow_days Int)

[green]-- Insert in to the table variable.  The order by at the end is very important.[/green]
Insert Into @Temp(CustomerId, Year_Invoiced, Month_Invoiced, AVG_Fast_Slow_days)
Select CustomerId, Year_Invoiced, Month_Invoiced, AVG_Fast_Slow_days
From   @YourTable
Order By CustomerId, Year_Invoiced DESC, Month_Invoiced Desc

[green]-- Now, get the top 3 per customer and average the values.[/green]
Select T.CustomerId, Avg(1.0 * AVG_Fast_Slow_Days) As TheAverage
From   @Temp T
       Inner Join (
         Select   CustomerId, Min(RowId) As MinRowId
         From     @Temp
         Group By CustomerId
         ) As A
         On T.CustomerId = A.CustomerId
Where  RowId - MinRowId + 1 <= 3
Group By T.CustomerId

Copy/paste this to Query Analyzer. Push it, prod it, poke it with a stick. Once you understand how this works, adapt it to use your real table. If this works for you, and you would like me to explain any part of it, just let me know.


-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks George, I will take a look at this and let you know how it goes. I was reading another article and it seems this solution may be over my head. I am not a DBA full time, I just wear this hat on occassion. I will see what I can derive out of your example regardless. I am pretty much pot commited now though, so what the heck right:)

Thanks,
Trent
 
>> I am not a DBA full time

Me neither.

I know this works with your sample data (I tested it). Seriously, if there's any part of it that you don't understand, let me know and I will explain it.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
So am I able to do this on an existing table without actually modifying the table?

I did run your query that you posted and it does work like a champ, thanks alot.

I am interested to test this on my live data, but I don't want to alter the tables at all.

Also, once I do get this working on my live data, am I able to use this solution to create a crystal report with this data? A lot of what you wrote I have never seen before. I normally just write the queries to get the data and then create a report from that in crystal. I don't know how they use variables and temp tables and what not.

Thanks so much,
Trent
 
I'm not suggesting that you alter the tables. I am suggesting that you create a table variable. A table variable only exists for the duration of the query. After that, it's gone. This means, you don't need to modify your real tables at all.

I haven't used crystal reports in about 10 years. Having said that, I don't see any reason why you wouldn't be able to get this to work. If this were me, I would create a stored procedure that returns the data and then simply call the stored procedure from crystal reports.

Assuming I got your table name and column names correct...

Code:
Create Procedure GetCustomerWithTopThreeAverage
AS
SET NOCOUNT ON

-- Insert in to the table variable.  The order by at the end is very important.
Insert Into @Temp(CustomerId, Year_Invoiced, Month_Invoiced, AVG_Fast_Slow_days)
Select Customer_Id, Year_Invoiced, Month_Invoiced, AVG_Fast_Slow_days
From   Table1
Order By Customer_Id, Year_Invoiced DESC, Month_Invoiced Desc

-- Now, get the top 3 per customer and average the values.
Select T.CustomerId, Avg(1.0 * AVG_Fast_Slow_Days) As TheAverage
From   @Temp T
       Inner Join (
         Select   CustomerId, Min(RowId) As MinRowId
         From     @Temp
         Group By CustomerId
         ) As A
         On T.CustomerId = A.CustomerId
Where  RowId - MinRowId + 1 <= 3
Group By T.CustomerId

You should be able to copy/paste that code to a query analyzer window and run it. When you do, don't expect any data because this will create a stored procedure. To get the data, open another window and use this...

[tt][blue]Exec GetCustomerWithTopThreeAverage[/blue][/tt]

If this returns the data you are looking for, then all you need to do is modify your crystal report to call the stored procedure.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
You are the man George! Your query works great and pulls the data correctly. Now I have to get back to "Discussing" with my ERP vendor why their numbers don't jive with these. They are beating around the bush with me. Between you and me, (and the rest of the board) I think they screwed up their calculations and are mad that I am calling them out on it. This calculation effects our relationships with our vendors/suppliers and enevitably our customers.

Thanks again George, Have a star..
 
Response from ERP vendor:
Average Fast/Slow Days
Average Fast/Slow Days are computed as follows:

Average Fast/Slow Days = Dollar Days/Dollars Paid
where
Dollar Days = Amount Paid x (Payment Date – Net Due Date)


For example:
Amount Paid = $100
Payment Date = 02/01/99
Net Due Date = 02/10/99

Dollar Days = 100 x (02/01/99 – 02/10/99) = 100 x -9 = -900

Tip! A negative value indicates that the customer pays early.

Amount Paid
Payment Date Net Due Date
Dollar Days

$100
-9
-900

$25
10
250

$50
1
50

Average Fast/Slow Days = Dollar Days/Dollars Paid =
OR
(-900 + 250 + 50) / (100 + 25 + 50) = -600 / 175 = -3.4286

*********************
I think I finally figured out how that number is derived.

So they are taking the average fast slow days that is in the table shown below for the screen shot that I sent them.

Month Total Paid | Avg_Fast_Slow_Days |Sum Payment X Days
5 3052.72 22.2020 67776.5100
4 5002.28 45.3204 226705.2000
3 .00 .0000 .0000

So there screen value for three months avg is (67776.5100 +226705.2000 + 0)/(3052.72 + 5002.28 + 0) = 36.5589 this is what their screen shows


However, I guess I don’t see why averaging the “averages” doesn’t work (22.2020 + 45.3204 + 0) / 3 = 22.5074666 this is the value that I am getting

I assume that their screen value is the accurate accounting value, I am not that educated on accounting terms so this baffles me.

When I take the average of the averages, the results end up the same in any other instance.

Either way, thanks for everyone for your help. I think I now have the information that I need to get this done. Well hopefully. Now I have to figure out how to total each of those fields and divide by the total of the others all while pulling the top 3! Looks like another day on the drawing board.

 
Well this is as close as I can get:
-- Create a table variable to help us with top 3
Declare @Temp Table(RowId Int Identity(1,1), Customer_Id Int, Year_Invoiced Int, Month_Invoiced Int, amount_Paid Int, sum_days_x_payment Int)

-- Insert in to the table variable. The order by at the end is very important.
Insert Into @Temp(Customer_Id, Year_Invoiced, Month_Invoiced, amount_paid, sum_days_x_payment)
Select Customer_Id, Year_Invoiced, Month_Invoiced, amount_paid, sum_days_x_payment
From table1
Where (year_invoiced = datepart(yyyy, getdate())AND month_invoiced <= (datepart(mm, getdate())-1)
OR Year_invoiced < datepart(yyyy, getdate()))
Order By Customer_Id, Year_Invoiced DESC, Month_Invoiced Desc

-- Now, get the top 3 per customer and average the values. This also protects against dividing by zero
Select T.Customer_Id, SUM(1.0000 * sum_days_x_payment)/
CASE
WHEN SUM(1 * amount_paid) = 0
THEN 1.0000
ELSE SUM(1.0000 * amount_paid)
END
As TheAverage
From @Temp T
Inner Join (
Select Customer_Id, Min(RowId) As MinRowId
From @Temp
Group By Customer_Id
) As A
On T.Customer_Id = A.Customer_Id
Where RowId - MinRowId + 1 <= 3
Group By T.Customer_Id

Thanks again George for all your help! I should be able to do what I have to do now with all of your great information.

Sincerely,
Trent
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top