Select in one table but not another 2

dja1 · May 30, 2002

Is there a more efficient alternative to the following;
SELECT colA FROM tableA WHERE colA NOT IN (SELECT colA FROM tableB) ?
In my case, tableA has 1.5M rows, and tableB has .5M rows, so the query takes quite a long time.

gregsimpson · May 30, 2002

What you have posted is effectively a correlated sub query and I expect it would take quite a long time to run as written.

I believe the query would run much more efficently using the EXCEPT operator.

Using the EXCEPT operator it is possible to find out which elements of a resulting set are not present in another answer set.

Below is an example where we wish to know which student ids in the student table are not present in the exam_taken table.

SELECT ID FROM STUDENT
EXCEPT
SELECT ID FROM EXAM_TAKEN

dja1 · May 30, 2002

Greg
Thanks very much - no wonder you're one of this forum's top experts!

DbaDan · May 30, 2002

Need more info How are Table A and B related? Is colA indexed in both tables? If it is indexed try coorelated subquery:

SELECT colA FROM tableA Z WHERE NOT exists (SELECT colA FROM tableB X where X.colA = Z.colA)

Any other indexed filtering criteria you can add will help.

dja1 · May 30, 2002

Thanks very much. I may have been a bit hasty in my awarding Greg a Star.

DbaDan's solution was the fastest, followed by my solution, and I'm afraid Greg's was quite sluggardly in comparison, but did show an admirable willingness to consult the relevant manual - and indeed, to quote verbatim from it.

"ColA" is the primary key in both tables.

gregsimpson · May 30, 2002

DJA1,

in a vain attempt to keep my star I have put in a lot of work to see exactly what is happening.

Using dynexpln I explained the statements

Results are

dja1 - 1019,468736 timerons
dbadan - 25305 timersons
me - 190312 timerons

I can post the access paths if you want to see them.

However having run the jobs I have found the following results.

$ time db2 < danfile >outfile

real 1m15.64s
user 0m3.72s
sys 0m11.81s
$ time db2 < dja1file > outfile

real 1m21.93s
user 0m3.58s
sys 0m10.43s
$
$ time db2 <gregfile >outfile

real 1m48.64s
user 0m4.17s
sys 0m11.33s

I did notice that running from the command prompt gave different results than running in batch as above. Still to the detriment of EXCEPT.

Running from the command prompt both yours and dbadans started to return records after 3 seconds, mine was after 12 seconds.

All I can say in my defence is a young lady at work used the EXCEPT operator today on her tables DB2 V7.2 on AIX and said it worked wonderfully. I would have to admit it appears to be the worst of the three here.

Have I done enough to recover the situation?

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Select in one table but not another 2

dja1

Programmer

gregsimpson

Programmer

dja1

Programmer

DbaDan

Technical User

dja1

Programmer

gregsimpson

Programmer

Similar threads

Part and Inventory Search

Sponsor