fill missing variables by cycle

gingerfish · Jan 20, 2011

Hi all,

I have been working on this for some time now and don't know how to solve it.
It is actually only a small part of a much bigger task, but I am stuck and cannot go further.

I have quite a large dataset (ds) with around 40 variables.
Every ID has multiple cycles. In every cycle there are multiple parameters, each with a date and a day (derived from the date).
I have extracted the day for each group of parameters (in my example H* and F*) and written it to separate variables [h_day and f_day].
Now it gets tricky (at least for me):
The next step is to fill the missing values of both variables per cycle.

Attached is a snapshot of some mock-up data. There you can see (in a very minimal way) the initial dataset I am using, what I get using "retain" and how it should look in the end.
The sorting and merging is also not ideal: sorting by "id cycle" or "id cycle parameter" does not give unique keys. However, I can get rid of duplicates later on.

Any help is appreciated as I am just starting to code.

snapshot at:

http://img836.imageshack.us/i/tableexample.jpg/

Thanks in advance,
Gingerfish

MatthiasB · Jan 21, 2011

I don't get the rule for filling the f* and h* columns. Once that is known, and with a reverse sort order (or a counter variable built from _n_ to sort in reverse), it shouldn't be hard to retain the values to fill any missing ones. Can you let us know the rules?
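
The counter idea could look roughly like the sketch below. It is only an illustration: ds is the dataset name from the first post, while ds_numbered, ds_restored and _order are made-up names, and the fill step itself is left out because the rule is not yet known.

```sas
/* Sketch only: ds_numbered, ds_restored and _order are hypothetical names */
data ds_numbered;
   set ds;
   _order = _n_;             /* remember the original row position */
run;

proc sort data=ds_numbered;
   by descending _order;     /* reverse the row order */
run;

/* ... a data step using retain would fill the missing values here ... */

proc sort data=ds_numbered out=ds_restored (drop=_order);
   by _order;                /* put the rows back in their original order */
run;
```

Keeping the counter until the last sort means the original order can be restored even when the other sort keys are not unique.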

jj72uk · Jan 22, 2011

From what I can make out the H_day and F_day columns refer to:

i) The cycle, i.e. A or B
ii) The parameter, i.e. H1 - F1 of cycle A

So, as an example, I guess:

If we use cycle A and Parameter H1

H_day = orginal_day and F_day = F1 of the same cycle.

But we'll need gingerfish to confirm...

gingerfish · Jan 23, 2011

Hi all,

I solved it! :-D

What I did is the following:

Code:

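/* Sort by id, cycle and ascending h_day */
proc sort data=x1;
   by id cycle h_day;
run;

/* Carry the last non-missing f_day forward */
data x2 (drop = _fd);
   set x1;
   retain _fd;
   if not missing(f_day) then _fd = f_day;
   else f_day = _fd;
run;

/* Reverse the h_day order within each cycle for the second fill */
proc sort data = x2;
   by id cycle descending h_day;
run;

/* Carry the last non-missing h_day forward in the reversed order */
data x3 (drop = _hd);
   set x2;
   retain _hd;
   if not missing(h_day) then _hd = h_day;
   else h_day = _hd;
run;

/* Sort the result by id, date and cycle */
proc sort data = x3;
   by id date cycle;
run;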

So I had to do two steps instead of one, because I needed two kinds of sorting to fill the missing values correctly per cycle.
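
One caveat with a plain retain is that the retained value is not cleared between cycles, so if the first row of a cycle has a missing f_day it would be filled from the previous cycle (or the previous ID). A minimal sketch of a variant that resets the value per cycle, assuming BY-group processing on id and cycle fits the data; the first.cycle reset is an addition and not part of the code above:

```sas
/* Sketch only: same datasets and variables as above, plus a per-cycle reset */
proc sort data=x1;
   by id cycle h_day;
run;

data x2 (drop = _fd);
   set x1;
   by id cycle;                    /* makes first.cycle available */
   retain _fd;
   if first.cycle then _fd = .;    /* forget the previous cycle's value */
   if not missing(f_day) then _fd = f_day;
   else f_day = _fd;
run;
```

The same reset would apply to _hd in the second data step, since data sorted by id cycle descending h_day can still be read with "by id cycle". A cycle with no f_day at all then stays missing instead of inheriting a value from its neighbour.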

I am sorry I did not explain it in detail in the first place. I was looking at that problem for hours and didn't see anything anymore ...

@jj72uk: yes, you were right. It is sorted by cycle.

Thanks very much, it really is appreciated!
Gingerfish
