Subsetting data for subjects with X incidents within Y timeframe

Not open for further replies.


Jul 12, 2013
Thanks in advance for the help.
If my question seems vague or confusing, please let me know.

I have patient-level data collected over time, with multiple measurements taken at each visit. I want to find the records for patients who have at least a set number of visits within a certain time frame, for example, at least 2 visits within 2 months.
• Patient V, with 7 visits (Jan 2, Jan 8, Apr 1, July 7, Oct 3, Dec 24, Dec 30), would have valid visits for Jan 2 and Jan 8 and for Dec 24 and Dec 30. The other dates for Patient V would not be in the output file.
• Patient W, with 3 visits in 1 month, would also be valid. All 3 visits would be in the output file.
• Patient X, with 2 visits in 35 days, would be in the output file.
• Patient Y, with visits on Jan 2, June 15, and June 30, would have entries for the two June dates in the output file.
• Patient A, with one visit on Jan 4 and one on June 4, would not be in the output dataset.
• Patient B, with a single visit on June 3, would not be in the dataset.
I would prefer to set the time frame and the number of visits as variables rather than constants, so I can reuse the program for different time frames and counts.
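To make the rule above concrete, here is a rough sketch of how it might be written with PROC SQL. It is only one possible reading of the requirement: it assumes visit_date is a SAS date value (so the window is counted in days), treats "2 months" as 60 days, and treats the window as inclusive. The macro variable names min_visits and window_days, the dataset names, and the year in the demo rows are all made up for illustration.

```sas
/* Hypothetical parameters: names and values are illustrative only */
%let min_visits  = 2;    /* X: minimum number of visits          */
%let window_days = 60;   /* Y: window length in days, inclusive  */

/* Small demo data built from the examples above; the year is assumed */
data work.visits_demo;
  input PATid $ visit_date :date9.;
  format visit_date date9.;
  datalines;
V 02JAN2012
V 08JAN2012
V 01APR2012
V 07JUL2012
V 03OCT2012
V 24DEC2012
V 30DEC2012
Y 02JAN2012
Y 15JUN2012
Y 30JUN2012
A 04JAN2012
A 04JUN2012
B 03JUN2012
;
run;

proc sql;
  create table work.qualified_visits as
  select distinct v.PATid, v.visit_date
  from work.visits_demo as v
       inner join
       /* window starts: visits with enough visits in the next Y days */
       (select a.PATid, a.visit_date as win_start
          from work.visits_demo as a
               inner join work.visits_demo as b
            on a.PATid = b.PATid
           and b.visit_date between a.visit_date
                                and a.visit_date + &window_days
         group by a.PATid, a.visit_date
        having count(distinct b.visit_date) >= &min_visits) as w
    /* keep every visit that falls inside a qualifying window */
    on v.PATid = w.PATid
   and v.visit_date between w.win_start and w.win_start + &window_days
  order by v.PATid, v.visit_date;
quit;
```

With these demo rows the result should hold Jan 2, Jan 8, Dec 24 and Dec 30 for Patient V and the two June dates for Patient Y, with nothing for A or B. Because the real data has several measurements per visit, work.qualified_visits would still need to be merged back to the full data by PATid and visit_date to pull the measurement rows.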

After reading in the data, my program is currently as follows:
proc sort data=work.readin;
  by PATid visit_date;
run;

* calculate the time difference between consecutive visits;
* (in days if visit_date is a SAS date, in seconds if it is a datetime);
data work.Readin;
  set work.Readin;
  by PATid;
  * call lag on every row so its queue stays in step with the data;
  r_visit_date = lag(visit_date);
  * the first visit of each PATid has no previous visit, so it stays missing;
  if not first.PATid then time_prevtothis = visit_date - r_visit_date;
  drop r_visit_date;
run;

* add a visit number to each observation;
data work.visitnumbered;
  set work.Readin;
  by PATid;
  if first.PATid then visit_no = 1;
  else visit_no + 1;
run;

***Just keep the records with the incidents of X in timeframe Y ;

* count the number of visits by PATid;
proc sort data=work.visitnumbered;
  by PATid visit_date;
run;

/* To get the number of observations in each group of PATid, start */
/* a counter on the first observation of each BY-Group. The last */
/* observation in the BY-Group contains the total number of */
/* observations */

data work.visitnumbered1;
  set work.visitnumbered;
  by PATid;
  if first.PATid then visitcount = 0;
  * sum statement: increments the counter and retains it across rows;
  visitcount + 1;
  if last.PATid then visitcounter = visitcount;
run;

* the maximum of the counter per PATid is the total number of visits;
proc means data=work.visitnumbered1 noprint nway;
  class PATid;
  var visitcount;
  output out=work.pat_sum max=max_transcount;
run;

proc sort data=work.pat_sum; by PATid; run;
proc sort data=work.visitnumbered1; by PATid visit_date; run;

data work.combine;
  merge work.visitnumbered1 (in=inlist) work.pat_sum;
  by PATid;
  if inlist;
run;

proc sort data=work.combine; by PATid visit_date; run;

* keep only the PATids with more than one visit;
data work.count_them;
  set work.combine;
  by PATid;
  if max_transcount > 1 then output;
run;
proc sort data=work.count_them; by PATid descending visit_date; run;

* indicator for qualified PATids;
data work.runningtot;
  set work.count_them;
  by PATid;
  * reset the running total and the counter at the start of each PATid;
  if first.PATid then do;
    sumtime = 0;
    cnter = 0;
  end;
  sumtime + time_prevtothis;
  cnter + 1;
run;


I can exclude the subjects with fewer than the required number of observations, but I am stuck on the time-frame window. I also suspect there is an easier way to do all of this in fewer steps.
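For the count-only part, one shorter possibility might be a single PROC SQL step with a subquery instead of the counter, PROC MEANS, and merge steps. This is just a sketch on made-up rows: the dataset names work.count_demo and work.enough_visits and the macro variable min_visits are hypothetical, and it counts distinct visit dates so that several measurements on the same day count as one visit.

```sas
%let min_visits = 2;   /* hypothetical parameter: required number of visits */

/* Made-up demo rows; the year is assumed */
data work.count_demo;
  input PATid $ visit_date :date9.;
  format visit_date date9.;
  datalines;
A 04JAN2012
A 04JUN2012
B 03JUN2012
;
run;

proc sql;
  create table work.enough_visits as
  select *
  from work.count_demo
  /* keep only PATids that have at least min_visits distinct visit dates */
  where PATid in
        (select PATid
           from work.count_demo
          group by PATid
         having count(distinct visit_date) >= &min_visits)
  order by PATid, visit_date;
quit;
```

This only handles the minimum-count filter, so Patient A's rows would be kept and Patient B's dropped here; the time-frame window would still need its own check.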

Thanks again for your help!