Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Performing Comparisons between Observations 1

Status
Not open for further replies.

smalek

Programmer
Jan 15, 2009
28
CA
Hi
I have the following data set:
Pt# Visit# Time Type date
X 01 1212 Adm 13/04/2009
X 02 2128 Trnsfr 13/04/2009
X 03 2258 Trnsfr 17/04/2009
X 04 2300 Trnsfr 17/04/2009
X 05 1244 Disch 18/04/2009
Y 01 1303 Adm 11/04/2009
Y 02 1758 Trnsfr 11/04/2009
Y 03 712 Trnsfr 17/04/2009
Y 04 714 Trnsfr 17/04/2009
Y 05 1820 Disch 21/04/2009

I would like to compare the observations by Pt# such that if the time between 2 subsequent visits is less than 20min on the same day I would like to delete both occurences in order to end up with the following data set:

Pt# Visit# Time Type date
X 01 1212 Adm 13/04/2009
X 02 2128 Trnsfr 13/04/2009
X 05 1244 Disch 18/04/2009
Y 01 1303 Adm 11/04/2009
Y 02 1758 Trnsfr 11/04/2009
Y 05 1820 Disch 21/04/2009

Any thoughts on how can I approach this problem? Any advice is much appreciated.
Thanks
 
Try the LAG function. First sort your date by PT date and datetime. Then go through and flag all records that are with in 20 minutes of the record before. Re-Sort by pt date and decsending datetime and proceed with the same logic. This time use a flag2 to flag all related records. The idea being that you find all records that relate to their predecessor and any that meet the less than 20 mins flag. You still have to alert the record before that it has a flagged sibling, hence the resort.

Here is how I would do it
Code:
data test;
   length date $10;
   input pt $ visit $ time $ type $ date $;
   time = put(input(time,8.),z4.);
   time = trim(substr(time,1,2))||':'||trim(substr(time,3));
   ntime= input(time, time5.);
   ndate = input(date, ddmmyy10.);
   timestamp=dhms(ndate,hour(ntime),minute(ntime),0);
datalines;
X     01        1212    Adm     13/04/2009
X     02        2128    Trnsfr  13/04/2009
X     03        2258    Trnsfr  17/04/2009
X     04        2300    Trnsfr  17/04/2009
X     05        1244    Disch   18/04/2009
Y     01        1303    Adm     11/04/2009
Y     02        1758    Trnsfr  11/04/2009
Y     03        712     Trnsfr  17/04/2009
Y     04        714     Trnsfr  17/04/2009
Y     05        1820    Disch   21/04/2009
;
run;
proc sort 
   data = test
   out  = test1;
   by pt ndate timestamp;
run;
data test2;
  set test1;
  by pt ndate timestamp;
  lastrectime = lag(timestamp);
  if first.ndate then
     lastrectime=timestamp;

  *** SET YOUR 20 MINS WINDOW ***;
  visits_time_diff = (timestamp - lastrectime)/60;
  if 0 lt visits_time_diff lt 20 then
     del_flag = 1;
run;

proc sort 
   data = test2
   out  = test3;
   by pt ndate descending timestamp;
run;
data test4;
  set test3;
  by pt ndate descending timestamp;
  lastrectime = lag(timestamp);
  if first.ndate then
     lastrectime=timestamp;

  *** SET YOUR 20 MINS WINDOW ***;
  visits_time_diff2 = abs((timestamp - lastrectime)/60);
  if 0 lt visits_time_diff2 lt 20 then
     del_flag2 = 1;
  if del_flag eq 1 then delete;
  if del_flag2 = 1 then delete;
run;
proc sort 
  data = test4
  out  = final;
  by pt ndate timestamp;
  
run;
proc print data= final;
run;

Hope this helps you.
Klaz
 
WOW !!! Instant success. I never thought it would workout so easily. My sticking point was flagging the previous visits which you elegantly captured using the second proc sort.
Thanks Klaz for your solution.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top