how to optimize this query?

Status
Not open for further replies.

hstijnen

Programmer
Nov 13, 2002
NL
Hi,

I have the following database:
Person(rid, name, ...) containing some 10,000 persons;
Pcategory(rid, cat) containing some 50,000 records with the categories for each person.

Now I want to select the persons that have both of two specific categories, 240 and 781. I use the following query:

select * from Person p
where p.rid in
(select rid from Pcategory where cat in (240,781)
group by rid having count(*) = 2)

This query takes a very long time (13 minutes). How can I optimize this query?

When I create a temporary table Tmprid(rid) and fill this table first:

insert into Tmprid
select rid from Pcategory where cat in (240,781)
group by rid having count(*) = 2

This takes practically no time and yields 12 persons. Next:

select * from Person p
where p.rid in
(select rid from Tmprid)

This takes 2 seconds. Why is the first query so slow? I would expect Interbase to proceed internally the same way I did.

Does anyone have an idea?

Henk
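
For anyone who wants to reproduce the comparison from the question, here is a minimal sketch of both variants on a toy dataset. It uses Python's sqlite3 module as a stand-in for Interbase, so it only shows that the two approaches return the same persons; it says nothing about Interbase timings. The sample rows and names are made up, and count(*) = 2 assumes each (rid, cat) pair occurs at most once in Pcategory.

```python
import sqlite3

# SQLite is only a stand-in for Interbase; timings will not match the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (rid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Pcategory (rid INTEGER, cat INTEGER);
    CREATE TABLE Tmprid (rid INTEGER);
""")

# Made-up sample data: persons 1 and 3 have both categories 240 and 781.
conn.executemany("INSERT INTO Person VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cas"), (4, "Dee")])
conn.executemany("INSERT INTO Pcategory VALUES (?, ?)",
                 [(1, 240), (1, 781), (2, 240),
                  (3, 240), (3, 781), (3, 5), (4, 781)])

# Variant 1: the single statement from the question.
one_step = conn.execute("""
    SELECT p.rid FROM Person p
    WHERE p.rid IN (SELECT rid FROM Pcategory WHERE cat IN (240, 781)
                    GROUP BY rid HAVING COUNT(*) = 2)
    ORDER BY p.rid
""").fetchall()

# Variant 2: fill the temporary table first, then look up the persons.
conn.execute("""
    INSERT INTO Tmprid
    SELECT rid FROM Pcategory WHERE cat IN (240, 781)
    GROUP BY rid HAVING COUNT(*) = 2
""")
two_step = conn.execute("""
    SELECT p.rid FROM Person p
    WHERE p.rid IN (SELECT rid FROM Tmprid)
    ORDER BY p.rid
""").fetchall()

print(one_step)
print(two_step)
```

Both variants select the same rids here, so any difference seen in Interbase comes from how the statements are executed, not from what they ask for.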
 
Henk,

Using the 'in' operator is often slow in SQL and is probably the root of your problem, although I'm not sure why it should be so much slower than using the temporary table.
One solution could be to use a view in place of the temp table and then join to the view (a join is usually better than 'in'), e.g.

select p.*
from person p, pcatview v
where p.rid = v.rid
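
The post does not show how pcatview would be defined, so the sketch below assumes it is simply the grouped subquery from the original question wrapped in a view. It runs on SQLite through Python's sqlite3 module purely for illustration; the sample rows are invented and Interbase may plan the join differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (rid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pcategory (rid INTEGER, cat INTEGER);

    -- Assumed definition of pcatview; the thread does not give one.
    CREATE VIEW pcatview AS
        SELECT rid FROM pcategory WHERE cat IN (240, 781)
        GROUP BY rid HAVING COUNT(*) = 2;
""")

# Made-up sample data: persons 1 and 3 have both categories.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cas"), (4, "Dee")])
conn.executemany("INSERT INTO pcategory VALUES (?, ?)",
                 [(1, 240), (1, 781), (2, 240),
                  (3, 240), (3, 781), (4, 781)])

# Join to the view instead of using 'in'.
rows = conn.execute("""
    SELECT p.rid, p.name
    FROM person p, pcatview v
    WHERE p.rid = v.rid
    ORDER BY p.rid
""").fetchall()

print(rows)
```

Note that a view defined this way has the two categories hard-coded, so it only fits the case where the categories are known in advance.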

Maybe a better solution would be to use a query without an 'in' clause at all, such as

select p.*
from person p, pcategory c1, pcategory c2
where p.rid = c1.rid
and c1.cat = 240
and p.rid = c2.rid
and c2.cat = 781

You could also try adding the line below to speed up the selection of matching rows from the pcategory table. I suspect, though, that the query will be quicker without it, because the person table is smaller and it is therefore better for the optimizer to use the person table as the basis for its queries.

and c1.rid = c2.rid
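
As a quick check of the self-join, the following sketch runs it with and without the extra line on a small invented dataset. It uses SQLite via Python's sqlite3 module, which is an assumption made for convenience and not the Interbase setup from the thread. The extra condition already follows from the other two rid comparisons, so it can only influence the plan, not the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (rid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pcategory (rid INTEGER, cat INTEGER);
""")

# Made-up sample data: persons 1 and 3 have both categories.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cas"), (4, "Dee")])
conn.executemany("INSERT INTO pcategory VALUES (?, ?)",
                 [(1, 240), (1, 781), (2, 240),
                  (3, 240), (3, 781), (4, 781)])

# The self-join from the post: one pcategory alias per required category.
BASE = """
    SELECT p.rid, p.name
    FROM person p, pcategory c1, pcategory c2
    WHERE p.rid = c1.rid
      AND c1.cat = 240
      AND p.rid = c2.rid
      AND c2.cat = 781
"""

without_extra = conn.execute(BASE + " ORDER BY p.rid").fetchall()
# Same query plus the optional, logically redundant condition.
with_extra = conn.execute(
    BASE + " AND c1.rid = c2.rid ORDER BY p.rid").fetchall()

print(without_extra)
print(with_extra)
```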

HTH

Steve
 
Steve,

Thanks for your suggestions. The query is embedded in a CBuilder app. The user can select a number of categories, and the program creates the SQL statement and executes it. For now I've implemented my second solution: the table tmprid is dynamically created, filled and afterwards destroyed. This works fast enough for now.

Regards, Henk
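
Since the statement is generated by the program anyway, the self-join suggested earlier in the thread could also be generated for any number of categories, without a temporary table. The sketch below is one possible way to do that, written in Python with sqlite3 rather than CBuilder and Interbase; build_query is a hypothetical helper, the sample data is invented, and whether this is faster than the temporary table in Interbase is untested.

```python
import sqlite3

def build_query(cats):
    """Return (sql, params) selecting persons that have every category in cats.

    Hypothetical helper: one pcategory alias is joined per category.
    """
    if not cats:
        raise ValueError("at least one category is required")
    cats = list(dict.fromkeys(cats))  # drop duplicates, keep order
    aliases = [f"c{i}" for i in range(len(cats))]
    tables = ", ".join(f"pcategory {a}" for a in aliases)
    conds = " AND ".join(f"p.rid = {a}.rid AND {a}.cat = ?" for a in aliases)
    sql = (f"SELECT p.rid, p.name FROM person p, {tables} "
           f"WHERE {conds} ORDER BY p.rid")
    return sql, cats

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (rid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pcategory (rid INTEGER, cat INTEGER);
""")

# Made-up sample data: person 3 additionally has category 5.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cas"), (4, "Dee")])
conn.executemany("INSERT INTO pcategory VALUES (?, ?)",
                 [(1, 240), (1, 781), (2, 240),
                  (3, 240), (3, 781), (3, 5), (4, 781)])

sql_two, params_two = build_query([240, 781])
two_cats = conn.execute(sql_two, params_two).fetchall()

sql_three, params_three = build_query([240, 781, 5])
three_cats = conn.execute(sql_three, params_three).fetchall()

print(sql_two)
print(two_cats)
print(three_cats)
```

The category values are passed as parameters instead of being pasted into the SQL text, which keeps the generated statement the same shape for a given number of categories.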
 