Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Eliminate duplicates

Status
Not open for further replies.

qchac

Programmer
Nov 25, 2005
11
RO
Hello.

I have this table :
and I want to eliminate all the duplicate values and to preserve only the record with the most recent date_def or the weekest cli for each duplicate group.
-> For example, in this table, from the first duplicate (rows 1, 2) will remain only row 1.

What is the SAS solution for this ?

Thx in advance.
Raul
 
Hiya qchac - That's a nice easy one. I'm afraid I can't really see your pic very well, it's a bit small, however this is what I do when I'm deduping.
Code:
proc sort data=inpt;
  by key_var1 key_var2 date_ref;
run;

data outpt;
  set inpt;
  by key_var1 key_var2;

  if last.key_var2;
run;

Alternatively, and I prefer this method for the ability to to reporting afterwards and examin the duplicates, change the second step to:-
Code:
data outpt
     dupes;
  set inpt;
  by key_var1 key_var2;

  if last.key_var2 then output outpt;
  if not(first.key_var2 and last.key_var2) 
       then output dupes;
run;
This will put all the duplicates in a separate file (master records and slave records) which you can look at afterwards for data quality control).

If you have a different number of key variables (the variables that you use to determine uniqueness) just adjust the code accordingly, always use the last of the key variables as the one that you test.
For information on LAST. and FIRST., check out the SAS doco site, and that'll fill you in on the various things that you can do with it.
Enjoy.

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top