Comparison of two datasets to retrieve the records which do not match 1

iren · May 10, 2005

/*********SEE IF THEY MATCH***********/
/*I have a field id in dataset Score and field id in dataset Another.
I want to compare the value of field id from Score with field id from Another
and if some of them DO NOT MATCH i WANT TO KEEP ALL FIELDS from Myset that DO NOT MATCH*/

I understand how to retrieve those ones who match :
-------------------------------------------
proc sql;
create table My_result as
select *
from Score as MySet as M inner join Another as J
on M.id =J.id;
quit;
-----------------------------------------
However I have no idea how to retrieve those ones from MySet who do not match Another.

Could you please help me with this?

Thank You in advance,

Iren

jymm · May 10, 2005

So - do you want ALL the records regardless if there is a match or not --or-- just the ones from one table that do not match the other?

Table1
ID Value
1 Sticks
2 Straw
3 Bricks

Table2
ID PigNumber
1 1
2 2
3 3
4 4

So in the big bad wolf dataset (result) do you want
ID Value PigNumber
1 Sticks 1
2 Straw 2
3 Bricks 3
4 NULL 4

-or- just the last record?

ChrisW75 · May 10, 2005

Code:

data new
  merge score(in=in1)
        another(in=in2);
  by field_id;

  if in1 and not in2;
run;

Use the SAS Merge rather than SQL, it's a bit easier. Look up the syntax and options in the SAS Documentation.

http://v8doc.sas.com/sashtml/

iren · May 11, 2005

Chris,

Thank You! This is exactly what I am looking for.

Since I am still a beginner in SAS as well as in SQL, I am just wondering if there any way to do it in Proc SQL instead?

Thank You,

Iren

ChrisW75 · May 11, 2005

Two ways to do it in SQL.

Code:

select *
from score
where field_id not in(select field_id
                      from another)
;

which can be useful if another is small. If it is very large though, this query can be inefficient.
Alternatively

Code:

select a.*
from score a
  outer join another b
on a.field_id = b.field_id
where b.field_id is null
;

which is more efficient than the first query if both tables are large. You can play around with them, try all 3 one after the other and the log will give you a breakdown of CPU and Real time used for each, decide which is quicker and use that.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Comparison of two datasets to retrieve the records which do not match 1

iren

Technical User

jymm

Programmer

ChrisW75

Programmer

iren

Technical User

ChrisW75

Programmer

Similar threads

Part and Inventory Search

Sponsor