Define "difference".
You could look at the actual file size as an indicator.
If you want to check to see if each field on each record is the same it'll take a bit more effort.
Two methods I can think of.
1 - Rename all the variables (other than the unique key) on one of the datasets, then join them together and test each field against it's partner from the other dataset.
Code:
proc sql;
create table dset_test as
select keyvar1
,keyvar2
,field1 as test_Field1
,field2 as test_Field2
...
from dest1
order by keyvar1
,keyvar2
;
quit;
data test;
merge dset_test(in=in1)
dset2(in=in2);
by keyvar1 keyvar2;
if in1 and not in2 then flag = "New Record";
else if in2 and not in1 then flag = "Deleted Record";
else do;
if test_field1 ne field1 then flag = "Changed";
else if test_field2 ne field2 then flag = "Changed";
...
end;
run;
That'll do it. Could be long winded if you have alot of variables.
Another method which might work is to interleave the 2 datasets after putting a fileID on each, then check for duplicates...
I think that there's a few logistical problems with this method though which may require a fair bit of code to circumvent, depending on whether or not you need to know exactly what the differences are.
Basically though, you could interleave the 2 datasets, sort them by the full record (other than the file ID) and dedupe them by this.
Then check for duplicates on the unique key. If you have any duplicates left, then it means that something is different.
Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.