Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Comparing 2 SAS data sets 3

Status
Not open for further replies.

cosmid

Programmer
Feb 14, 2008
73
US
Is there anyway to compare 2 SAS data sets just to see if there's any difference?
 
Define "difference".
You could look at the actual file size as an indicator.
If you want to check to see if each field on each record is the same it'll take a bit more effort.
Two methods I can think of.
1 - Rename all the variables (other than the unique key) on one of the datasets, then join them together and test each field against it's partner from the other dataset.
Code:
proc sql;
  create table dset_test as
  select keyvar1
        ,keyvar2
        ,field1  as test_Field1
        ,field2  as test_Field2
        ...
  from dest1
  order by keyvar1
          ,keyvar2
  ;
quit;

data test;
  merge dset_test(in=in1)
        dset2(in=in2);
  by keyvar1 keyvar2;

  if in1 and not in2 then flag = "New Record";
  else if in2 and not in1 then flag = "Deleted Record";
  else do;
    if test_field1 ne field1 then flag = "Changed";
    else if test_field2 ne field2 then flag = "Changed";
    ...
  end;
run;
That'll do it. Could be long winded if you have alot of variables.

Another method which might work is to interleave the 2 datasets after putting a fileID on each, then check for duplicates...
I think that there's a few logistical problems with this method though which may require a fair bit of code to circumvent, depending on whether or not you need to know exactly what the differences are.
Basically though, you could interleave the 2 datasets, sort them by the full record (other than the file ID) and dedupe them by this.
Then check for duplicates on the unique key. If you have any duplicates left, then it means that something is different.






Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
The simplest way to do this is to use proc compare. This will generate a report of all the differences between two datasets. You can either compare the two datasets as they are, or use an id variable for the comparrison.

The basic syntax is:

Code:
proc compare base = /* base dataset to compare against */
             compare = /* other dataset */ ;
id /* optional id statement */;
run;

For a more comprehensive overview check this out:
 
Nice one kdt82, I'd forgotten about that procedure.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Thanks guys! Sorry about this late reply. I was working on a different project and got really busy. Thanks again for the help!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top