fill missing variables by cycle

Status
Not open for further replies.

gingerfish

Programmer
Jan 20, 2011
CH
Hi all,

I have been working on this for some time now and don't know how to solve it.
It is actually only a small part of a much bigger task, but I am stuck and cannot get any further.

I have quite a large dataset (ds) with around 40 variables.
Each ID has multiple cycles. In every cycle there are multiple parameters, each with a date and a day (derived from the date).
I have extracted the day for each group of parameters (in my example H* and F*) and written it to separate variables [h_day and f_day].
Now it gets tricky (at least for me):
The next step is to fill the missing values of both variables per cycle.

Attached is a snapshot of some mock-up data. It shows, in a very minimal way, the initial dataset I am using, what I get using "retain", and how it should look in the end.
The sorting and merging are also not ideal: sorting by "id cycle" or "id cycle parameter" does not give a unique key. However, I can get rid of duplicates later on.

Any help is appreciated as I am just starting to code.

snapshot at:
Thanks in advance,
Gingerfish
 
I don't get the rule for filling the f* and h* columns. Once that is known, and given a way to process the rows in reverse order (either a descending sort or a counter variable taken from _n_ to sort in reverse), it shouldn't be hard to retain the values and fill any missing ones. Can you let us know the rules?
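
The counter idea could look roughly like the sketch below. It is only an illustration: it assumes the input dataset is called x1 and the column to fill is h_day (the names used later in this thread), and row_no and _last are hypothetical helper names. It fills each missing h_day from the next non-missing value further down and does not reset anything per cycle.

```sas
/* Keep the original row order in a counter (row_no is a hypothetical name). */
data x1_numbered;
   set x1;
   row_no = _n_;
run;

/* Reverse the row order. */
proc sort data=x1_numbered;
   by descending row_no;
run;

/* In reversed order, carry the last non-missing h_day into the missing rows. */
data x1_filled (drop = _last);
   set x1_numbered;
   retain _last;
   if not missing(h_day) then _last = h_day;
   else h_day = _last;
run;

/* Put the rows back into their original order. */
proc sort data=x1_filled;
   by row_no;
run;
```

Whether this matches the intended rule depends on how the snapshot is meant to be filled, which is why the rules still need confirming.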
 
From what I can make out, the H_day and F_day columns refer to:

i) The cycle, i.e. A or B
ii) The parameter, i.e. H1 to F1 of cycle A

So, as an example, I guess:

If we use cycle A and Parameter H1

H_day = orginal_day and F_day = F1 of the same cycle.

But we'll need gingerfish to confirm...
 
Hi all,

I solved it! :-D

What I did is the following:
Code:
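/* Sort so that, within each id and cycle, rows are ordered by h_day.
   Missing values sort first in SAS, so rows with a missing h_day lead each cycle. */
proc sort data=x1;
   by id cycle h_day;
run;

/* Carry the last non-missing f_day forward into the rows where it is missing. */
data x2 (drop = _fd);
   set x1;
   retain _fd;
   if not missing(f_day) then _fd = f_day;
   else f_day = _fd;
run;

/* Reverse the h_day order within each id and cycle. */
proc sort data = x2;
   by id cycle descending h_day;
run;

/* Carry the last non-missing h_day forward in this reversed order. */
data x3 (drop = _hd);
   set x2;
   retain _hd;
   if not missing(h_day) then _hd = h_day;
   else h_day = _hd;
run;

/* Final sort by id, date and cycle. */
proc sort data = x3;
   by id date cycle;
run;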
So I had to do two steps instead of one, because I needed two different sort orders to fill the missing values correctly per cycle.
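
One thing to watch in the two data steps above: the retained value is never reset when a new cycle starts, so a cycle with no non-missing f_day (or h_day) at all would inherit the value from the previous cycle. A minimal sketch of a guarded variant follows, assuming that a fill should never cross a cycle boundary (the thread does not confirm this). It adds a BY statement and clears the retained variable at first.cycle. Only the f_day step is shown; the h_day step would change in the same way.

```sas
proc sort data=x1;
   by id cycle h_day;
run;

data x2 (drop = _fd);
   set x1;
   by id cycle;                    /* enables first.cycle / last.cycle */
   retain _fd;
   if first.cycle then _fd = .;    /* assumption: never fill across cycles */
   if not missing(f_day) then _fd = f_day;
   else f_day = _fd;
run;
```

With this guard, a cycle that has no f_day value keeps f_day missing instead of picking up a value from another cycle.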

I am sorry I did not explain it in detail in the first place. I was looking at that problem for hours and didn't see anything anymore ...

@jj72uk: yes, you were right. It is sorted by cycle.

Thanks very much, it really is appreciated!
gingerfish
 