Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

moving average with date range 1

Status
Not open for further replies.

spc2941

Technical User
Sep 21, 2007
5
0
0
US
Hello all,

I'm new to SAS, and I am having some issues with calculating moving average based on dates and groupings within the dataset. Basically, I am trying to calculate the moving average for each observation based on the last 90 days. Each observation has a date. I also need to group them so that the moving average is only based on the group. In other words, if I was grouping it by fruits, apples would have it's only moving average and orages, etc.

I know I will need to sort the dataset first, then use a retain statement. I was actually thinking of doing a macro too. I started this, but can't seem to get any part of it to work. Can someone help me? I know how to do the sorting, but I also know how to get the average, but it's not grouping correctly. It is only giving me the value of the observation.

I then tried to do it with SQL, but it's not working either. this is what I came up with.

proc sql;
create table data.moving_avg as
select a.*, mean(basis) as moving
from xx
where date>date
group by product;
 

this doesn't work either.
data new;
set old;
by id;
retain basis;
if date+90 then
avg = mean(basis);
run;
 
Here is an example of my dataset.

product date basis
apple may20 4
orange march2 3
apple april 3
banana jan31 33
apple feb13 88
apple dec2 12

what i need is a running moving average of the last 90 days of by "product" based on the basis amount. sorry for the multiple post, but after i reread my earlier post, it seemed confusing as to what i am trying to do. thanks
 
spc2941,

I must start by saying thank you for this question. Because it gave me no rest until I could solve it. Let me state what I believe is your requirement just so that you and I are on the same page.

Req;
1) You want to have a moving average of the last 90 days of data by product name. Meaning, that for a single product (ex. apple) you're only interested in the basis-values of that product going back 90 days.
2) I assumed that you have one (1) record per product per day. That means you never have more than one basis value per product per day. (the code doesn't handle more right now)

To solve the history issue (values going back 90 days) I used the lagx() and put days 1-90 in variables and then added them to an ARRAY. Since the variable have to be included in the array I created a small macro that generated the lines of code that I needed.

Code:
*** MACRO THAT CREATES 270 VARIABLES FOR 90 DAYS OF HISTORY ***;
%macro SetVars;
  %do i=1 %to 90;
    prod&i   = lag&i(product);
    date&i   = lag&i(date);
    basis&i  = lag&i(basis); 
  %end;
%mend;

*** FIRST SORT BY PRODUCT AND DATE ***;
proc sort 
  data = yourdata
  out  = datasorted;
  by product date;
run;

data final;
  set datasorted;
  by product date;
  length prod1-prod90 $7
         date1-date90 
         basis1-basis90 
         mysum 8;
  array current{*} mysum;
  array prod{*} $ prod1-prod90;
  array dt{*} date1-date90;
  array bs{*} basis1-basis90;
  %SetVars;

  *** SET THE DATE 90 DAYS AGO TO CHECK ***;
  DateMinus90 = date-90;

  bctr  = 1;
  mysum = 0;

  *** LOOP THROUGH ALL POSSIBLE DAYS ***;
  do i=1 to 90;
    *** WRITE EXCEPTION FOR FIRST CASE IN PRODUCT GROUP AS THERE IS NO LAG YET ***;
    if first.product then
      movingavg = basis;
    /*** CHECK FOR SAME PRODUCT NAME AND WITHIN 90 DAYS ***/
    else
      if trim(upcase(prod{i})) = trim(upcase(product))  and 
         dt{i} > dateminus90 then do;
       /*** IF FOUND THEN ADD TO SUM VAR ***/
       mysum = sum(mysum,basis, bs{i});
       /*** KEEP A COUNT OF HOW MANY DATAPOINTS FOUND ***/
       bctr + 1;
     end;
  end;
  /*** IF NONE FOUND THEN THE CURRENT BASIS VALUE SHOULD BE USED ***/
  if mysum = 0 then
    myavg = basis;
  else 
    myavg = mysum/bctr;
run;
proc print data=FINAL;
  var product date  dateminus90 basis myavg;
  format date  dateminus90 mmddyy10.;
run;

I hope that this helps you. You really should look this over as I did this really without extensive checking. There may be many limitations to this code so beware.

Klaz
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top