Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Average of variable

Status
Not open for further replies.

rzaa

Programmer
May 19, 2008
2
Hi,

I have the following data:

Tick Date1 Date2 Cat. value
A x 12 a1 0.23
A x 15 a3 0.75
A x 19 a1 0.65
A x 22 a6 0.16
A x 25 a3 0.87
A x 30 a4 0.65

Data is sorted by Tick Date1 Date2.

I need to calculate an accumulating average of the variable value. Such that the second observation would be (0.23+0.75)/2. The tricky part is that each Cat. value can only count once in the average. Thus the average in the third period should only include the most recent Cat. value (0.75+0.65)/2 not (0.23+0.75+0.65)/3.

The average for the last obs should only include 4 observations the two most recent in cat. a1 and a3 and the then a6 and a4. (0.65+0.16+0.87+0.65)/4.

I made the simple average where all observations within each group of Tick and Date1 is included:

if first.Date1 then do;
var1 = 1;
Var2=value;
end;
else do;
Var1=Var1+1;
Var2=Var2+value;
end;

The average is then simply Var3=Var2/Var1.

I have struggling to find a way to exclude the oldest values(in terms of date2) from the average.

Any help would really be appreciated.

Thanks!

Rasmus
 
The "retain" statement will probably help you out here.
Code:
  retain last_value;

  running_average = mean(value,last_value);
  last_value = value;

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Hi,

This is the entire code I use to create the running average including all values by a certain date (date1).
As you see from below I use the retain statement.

As far as I can see the running_average suggested would alwals include the two most recent values, I am looking for an average of the most recent values in each category. (there could be 100 different categories for some date1 value that should be included in the average, but each category may only be included once)

Do you have any suggestion to how this could be done?

Rasmus

DATA test;
set test;
by tick date1;
retain var1 var2;
if first.date1 then do;
var1 = 1;
var2=value;
end;
else do;
var1=var1+1;
var2=var2+value;
end;
run;
 
hi
have you tried a proc summary or proc mean?

e.eg.

proc summary data = test nway;
class tick date;
var value;
output out = test1 mean=;
run;

*using retain command;

data test1;
set test;
by tick date1;
if first.date1 then do;
var1=0;
var2=0;
end;
var1 = var1 +1;
var2=value +var2 ;
retain var2 var1;
run;

cheers
hope it helps
 
Hi Rzaa,

I think creating a lookup table of the most recent values may be what you are looking for - this would make it more flexible when you are dealing with large numbers of records. If you have SAS 9 you can use hash tables. HTH

Code:
data have;
input Tick$ Date1 Date2 Cat$ value;
cards; 
Ax 1 2 a1 0.23
Ax 1 5 a3 0.75
Ax 1 9 a1 0.65
Ax 2 2 a6 0.16
Ax 2 5 a3 0.87
Ax 3 0 a4 0.65
;
run;


data want(drop=rc tempval);
/* Initialise and define hash object */
    if _N_ = 1 then do;
        declare hash h();
        h.defineKey('cat');
        h.defineData('cat','value');
        h.defineDone();
    /* Create a hash iterator */ 
        declare hiter iter("h");
    end;

    set test;
    /* Reset accumulator variable to zero */
    tempval=0;
    /* If key exists in hash replace it */
    h.replace();
    
    /* Loop over the hash object to get total */
    /* of values */
    rc=iter.first();
    do while (rc=0);
        tempval+value;
        rc=iter.next();
    end;
    
    /* Calculate running average */
    runningAverage=tempval/h.num_items;
run;


 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top