Subsetting data for subjects with X incidents within Y timeframe

CincinnatiKT

Programmer
Jul 12, 2013
Thanks in advance for the help.
If my question seems vague or confusing, please let me know.

I have patient-level data collected over time, with multiple measurements taken at each visit. I want to find the records that fall within a cluster of at least a set number of visits in a certain time frame, for example at least 2 visits within 2 months.
• Patient V, with 7 visits (Jan 2, Jan 8, Apr 1, July 7, Oct 3, Dec 24, Dec 30), would have valid visits for Jan 2 and Jan 8, and for Dec 24 and Dec 30. The other dates for Patient V would not be in the output file.
• Patient W, with 3 visits in 1 month, is also valid. All 3 visits are in the output file.
• Patient X, with 2 visits in 35 days, would be in the output file.
• Patient Y, with visits on Jan 2, June 15, and June 30, would have only the June dates in the output file.
• Patient A, with one visit on Jan 4 and one on June 4, would not be in the output dataset.
• Patient B, with a single visit on June 3, would not be in the output dataset.
I would prefer to make the time frame and the number of visits variables rather than constants, so I can reuse the program for different time frames and counts.
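As a sketch, the example patients could be entered as test data like this. The year 2013 and the specific dates for Patients W and X are placeholders chosen only to match the descriptions ("3 visits in 1 month", "2 visits in 35 days"); they are not real data. It also assumes one record per patient per visit and that visit_date is a SAS date rather than a datetime.

```sas
/* Hypothetical test data for the example patients.              */
/* Assumptions: year 2013 throughout; dates for W and X invented */
/* to fit the descriptions; one row per patient per visit.       */
data work.readin;
    input PATid $ visit_date :date9.;
    format visit_date date9.;
    datalines;
V 02JAN2013
V 08JAN2013
V 01APR2013
V 07JUL2013
V 03OCT2013
V 24DEC2013
V 30DEC2013
W 05MAR2013
W 12MAR2013
W 28MAR2013
X 01FEB2013
X 08MAR2013
Y 02JAN2013
Y 15JUN2013
Y 30JUN2013
A 04JAN2013
A 04JUN2013
B 03JUN2013
;
run;
```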

After reading in the data, my program is currently as follows:
proc sort data=work.readin;
    by PATid visit_date;
run;

* calculate the time difference between consecutive visits (in days if visit_date is a SAS date, in seconds if it is a datetime);
data work.Readin;
    set work.Readin;
    by PATid;
    * lag is called on every row so its queue stays in step with the data;
    r_visit_date = lag(visit_date);
    if not first.PATid then do;
        time_prevtothis = visit_date - r_visit_date;
    end;
    drop r_visit_date;
run;

* add a visit number to each observation;
data work.visitnumbered;
    set work.Readin;
    by PATid;
    if first.PATid then visit_no = 1;
    else visit_no + 1;
run;

***Just keep the records with the incidents of X in timeframe Y ;

**count the number of visits by PATid;
proc sort data=work.visitnumbered;
    by PATid visit_date;
run;

/* To get the number of observations in each group of PATid, start */
/* a counter on the first observation of each BY-Group. The last */
/* observation in the BY-Group contains the total number of */
/* observations */

data work.visitnumbered1;
    set work.visitnumbered;
    by PATid;
    if first.PATid then visitcount = 0;
    visitcount + 1;
    if last.PATid then visitcounter = visitcount;
run;

proc means data=work.visitnumbered1 noprint nway;
    class PATid;
    var visitcount;
    output out=pat_sum
        max=max_visitcount;
run;

proc sort data=work.pat_sum;
    by PATid;
run;
proc sort data=work.visitnumbered1;
    by PATid visit_date;
run;
data combine;
    merge work.visitnumbered1 (in=inlist) pat_sum;
    by PATid;
    if inlist;
run;
proc sort data=work.combine;
    by PATid visit_date;
run;

data work.count_them;
    set work.combine;
    by PATid;
    * keep only patients with more than one visit, using max_visitcount from the PROC MEANS output;
    if max_visitcount > 1 then output;
run;
proc sort data=work.count_them;
    by PATid descending visit_date;
run;

**indicator for qualified PATids;
data work.runningtot;
    set work.count_them;
    by PATid;
    * this resets the running totals to 0 at the start of each patient;
    if first.PATid then do;
        sumtime = 0;
        cnter = 0;
    end;
    sumtime + time_prevtothis;
    cnter + 1;
run;


STUCK HERE

I seem to be able to exclude the subjects with fewer than the required number of observations, but the time frame window is where I am stuck. I also suspect there is an easier way to do all of this in fewer steps.
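One possible direction for the window step, offered as an untested sketch rather than a finished solution: load each patient's visit dates into a temporary array, flag every visit that belongs to a run of X consecutive visits spanning no more than Y days, then re-read the same rows and output only the flagged ones. The macro name keep_clusters, its parameter names and defaults, the work._sorted dataset, and the 1000-visit array size are all hypothetical. It assumes one record per patient per visit, a non-missing visit_date stored as a SAS date (so differences are in days), and "2 months" approximated as 60 days.

```sas
/* Hypothetical macro: names, defaults and array size are illustrative. */
/* Assumes one row per patient per visit, non-missing SAS date values,  */
/* and no more than 1000 visits for any one patient.                    */
%macro keep_clusters(data=work.readin, out=work.qualified,
                     min_visits=2, window_days=60);

    proc sort data=&data out=work._sorted;
        by PATid visit_date;
    run;

    data &out;
        array _dt[1000]   _temporary_;   /* visit dates for one patient */
        array _keep[1000] _temporary_;   /* 1 = visit is in a cluster   */

        /* first pass: read all rows for one patient into the arrays */
        do _n = 1 by 1 until (last.PATid);
            set work._sorted;
            by PATid;
            _dt[_n]   = visit_date;
            _keep[_n] = 0;
        end;

        /* flag each run of min_visits consecutive visits that fits */
        /* inside window_days                                        */
        do _j = 1 to _n - &min_visits + 1;
            if _dt[_j + &min_visits - 1] - _dt[_j] <= &window_days then
                do _k = _j to _j + &min_visits - 1;
                    _keep[_k] = 1;
                end;
        end;

        /* second pass: re-read the same rows, output the flagged ones */
        do _i = 1 to _n;
            set work._sorted;
            if _keep[_i] then output;
        end;

        drop _n _j _k _i;
    run;

%mend keep_clusters;

%keep_clusters(data=work.readin, out=work.qualified,
               min_visits=2, window_days=60);
```

With min_visits=2 and window_days=60, Patient V from the example should keep only Jan 2, Jan 8, Dec 24 and Dec 30, and Patients A and B should drop out entirely. If a calendar-month window is needed instead of a day count, the date comparison would have to change (for example with INTCK or INTNX), which this sketch does not attempt.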

Thanks again for your help!
 