Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Comparison of two datasets to retrieve the records which do not match 1

Status
Not open for further replies.

iren

Technical User
Mar 8, 2005
106
US
/*********SEE IF THEY MATCH***********/
/*I have a field id in dataset Score and field id in dataset Another.
I want to compare the value of field id from Score with field id from Another
and if some of them DO NOT MATCH i WANT TO KEEP ALL FIELDS from Myset that DO NOT MATCH*/

I understand how to retrieve those ones who match :
-------------------------------------------
proc sql;
create table My_result as
select *
from Score as MySet as M inner join Another as J
on M.id =J.id;
quit;
-----------------------------------------
However I have no idea how to retrieve those ones from MySet who do not match Another.

Could you please help me with this?

Thank You in advance,

Iren
 
So - do you want ALL the records regardless if there is a match or not --or-- just the ones from one table that do not match the other?

Table1
ID Value
1 Sticks
2 Straw
3 Bricks

Table2
ID PigNumber
1 1
2 2
3 3
4 4

So in the big bad wolf dataset (result) do you want
ID Value PigNumber
1 Sticks 1
2 Straw 2
3 Bricks 3
4 NULL 4

-or- just the last record?
 
Code:
data new
  merge score(in=in1)
        another(in=in2);
  by field_id;

  if in1 and not in2;
run;

Use the SAS Merge rather than SQL, it's a bit easier. Look up the syntax and options in the SAS Documentation.
 
Chris,

Thank You! This is exactly what I am looking for.

Since I am still a beginner in SAS as well as in SQL, I am just wondering if there any way to do it in Proc SQL instead?

Thank You,

Iren
 
Two ways to do it in SQL.
Code:
select *
from score
where field_id not in(select field_id
                      from another)
;
which can be useful if another is small. If it is very large though, this query can be inefficient.
Alternatively
Code:
select a.*
from score a
  outer join another b
on a.field_id = b.field_id
where b.field_id is null
;
which is more efficient than the first query if both tables are large. You can play around with them, try all 3 one after the other and the log will give you a breakdown of CPU and Real time used for each, decide which is quicker and use that.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top