Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

sql statement takes too long non-optimesed

Status
Not open for further replies.

KoenPiller

Programmer
Apr 4, 2001
270
0
0
NL
Hi,
I cant get this select statement optimized, it takes too long:

Code:
Select myCursor.productnmr, ;
       myCursor.startingdate, ;
       sum(myCursor.productive) as productive ;
  from myCursor ;
  join products on products.productnmr = myCursor.productnmr ;
 where myCursor.startingdate between(product.startingdate and product.endingdate) ;
  into cursor curProductiveactive

Nb products has an index on startingdate, endingdate and productnmr

Just hope on someone around who could help me with this one.

Thanks,

Koen
 

Koen,

Have you actually checked the optimisation level via SYS(3054)? At a quick glance, it looks to me like it should be fully optimised, but of course it might still take a long time, depending on the data volumes and the speed of the network. Have you any firm reason for thinking it is too slow?

A couple of points -- probably not relevant:

- I notice you are not doing any grouping, yet you have a SUM() function. This will lead to a result set with a single row, in which the product number and start dates columns will contain no meaningul information. If you have included those columns in your result set merely to use them in the join and WHERE clause - there's no need to do that. Just put the SUM() in the SELECT clause. But I doubt if that will affect the performance.

- I wonder if it would make any difference if you reversed the join condition. In other words:

myCursor.productnmr = products.productnmr

instead of

products.productnmr = myCursor.productnmr

I might be talking rubbish here (not the first time), but doesn't Rushmore depend in some way on the expression to the right of the "=" being optimisable?

Mike


__________________________________
Mike Lewis (Edinburgh, Scotland)

My sites:
Visual FoxPro (www.ml-consult.demon.co.uk)
Crystal Reports (www.ml-crystal.com)
 
I'am just asking if there is only typo or you have opened
some table with alias 'product' when you are tyinng that statement

So I thing,

between(product.startingdate and product.endingdate)

may by

between(products.startingdate and products.endingdate)

Zhavic

---------------------------------------------------------------
In the 1960s you needed the power of two Comodore64s to get a rocket to the moon. Now you need a machine which is a vast number of times more powerful just to run the most popular GUI.
 
Mike,

Yes, prior to posting I have checked the SYS(3054,12) :


Rushmore optimization level for intermediate result: none
Rushmore optimization level for table products: none
Joining intermediate result and table products using index tag Fk_49765

(sick)

reversing also no effect (even no rubbish)

I will hit for the weekend and return with fresh ideas on monday, believe that's the best.

Thanks for support,

Koen
 
Koen

Change:

where myCursor.startingdate between(product.startingdate and product.endingdate)

to

WHERE BETWEEN(myCursor.StartingDate,Product.StartingDate,.Product.EndingDate)

Regards,
Jim
 
Zhavic,

typo indeed, only one table product involved.

Jim,

sorry, can't check your suggestion at this moment, h/e will do 1st thing monday morning and ofcourse will let you know.

Thanks sofar,

Koen
 
Koen,

You don't say if you have an index on MyCursor.productnmr or myCursor.startingdate. I think if you did the select would be faster.

Hope that helps,

Stewart
PS If you want to get the best response to a question, please check out FAQ184-2483 first.
 

Stewart,

You don't say if you have an index on MyCursor.productnmr or myCursor.startingdate. I think if you did the select would be faster.

That's correct, of course. However, MyCursor sounds like it's a cursor rather than a physical table, in which case it was presumably created in a previous SELECT, in which case the cost of creating the index might outweigh the saving.

But you're right in that it explains why the query is not optimised.

Mike


__________________________________
Mike Lewis (Edinburgh, Scotland)

My sites:
Visual FoxPro (www.ml-consult.demon.co.uk)
Crystal Reports (www.ml-crystal.com)
 
Stewart,

Mike is correct, I am referring to a cursor, the index is on the products.dbf tag: products.startingdate.

However, for testing purpose I have executed the same logic to 'real' tables MyCursor.dbf and Products.dbf both indexed on startingdate and products.dbf on endingdate. Still I read "optimized none".

Also Jim's suggestion to use the FoxPro 'between' clause i/o the SQL 'between' no effect.

I believe I will have to live this one 'non optimized'

Koen
 
Hi Koen,

I always found it to be faster to put a join condition to the where clause. You may only need a join if you need a left join - expecting some myCursor.productnmr may not be found within the table products. Putting it in the where clause makes it an inner join, but optimizable by using the index on products.productnmr.

The sum() doesn't seem to make sense. Without a Group By you should have one record in the result set only, as SQL will be interpreting all myCursor as one single group.

Bye, Olaf.
 

Olaf,

I always found it to be faster to put a join condition to the where clause.

According to something I read (by a Microsoft employee), it doesn't make any difference performance-wise whether you use a WHERE or a JOIN for a join condition. Internally, they are both treated the same.

Mike


__________________________________
Mike Lewis (Edinburgh, Scotland)

My sites:
Visual FoxPro (www.ml-consult.demon.co.uk)
Crystal Reports (www.ml-crystal.com)
 
Hi Mike,

I've accelerated many SQL statements that way back in VFP6.

I also see a difference in VFP7. Run these 2 scripts that differ only in the final SQL. It's important to rebuild all data to not profit of caching, though.

The faster version (WHERE):
Code:
Create Cursor curOne (iID I, cText C(10))
Index On iID tag iID
Index On cText tag cText Additive
Create Cursor curTwo (iID I, fID I, cText C(10))
Index On iID tag iID
Index On cText tag cText Additive
Index On fID tag fID Additive
Local lnCnt
For lnCnt = 1 to 100000
   Insert into curOne values (lnCnt,Sys(2015))
EndFor lnCnt
For lnCnt = 1 to 999999
   Insert into curTwo values (lnCnt,Int(lnCnt/10)+1,Sys(2015))
EndFor lnCnt

Select * from curOne,curTwo;
where curTwo.fid = curOne.iid
Status bar messages of several runs tells me, SQL ran quite stable around 1,34 secs.

The slower version (LEFT JOIN)
Code:
Create Cursor curOne (iID I, cText C(10))
Index On iID tag iID
Index On cText tag cText Additive
Create Cursor curTwo (iID I, fID I, cText C(10))
Index On iID tag iID
Index On cText tag cText Additive
Index On fID tag fID Additive
Local lnCnt
For lnCnt = 1 to 100000
   Insert into curOne values (lnCnt,Sys(2015))
EndFor lnCnt
For lnCnt = 1 to 999999
   Insert into curTwo values (lnCnt,Int(lnCnt/10)+1,Sys(2015))
EndFor lnCnt

Select * from curOne;
Left join curTwo on curTwo.fid = curOne.iid
Status bar shows times of 1,84 secs and higher.

On a AMD XP 2000+, 512 MB RAM, Win XP Pro SP2 machine.

As said those timings are not just a run once test.

You may get other timings depending on your hardware. If you build tables instead of cursors the difference may be even clearer, but to not see the caching effect you need to restart VFP for each SQL-Select version.
Even if caching would effect the result, I tested WHERE first and still got faster results than with JOIN, although the JOIN should have gained performance from the cache. It's not a big factor, but WHERE seems to win over JOIN.

With VFP9 I only have a low time difference of those 2 Selects, true. So there has been an optimization of JOINS in VFP8 and 9.

Bye, Olaf.
 

Thanks for that, Olaf. I've never tried doing a timing test myself on this issue. You understand - I was only reporting what I read. I'll see if I can find it again, to check that I got it right.

Mike


__________________________________
Mike Lewis (Edinburgh, Scotland)

My sites:
Visual FoxPro (www.ml-consult.demon.co.uk)
Crystal Reports (www.ml-crystal.com)
 
But the two queries aren't the same. The second one is doing an outer join; it's not surprising that would take longer. To test equivalent joins, remove the LEFT keyword.

Tamar
 
Hi Tamar,

well, right, but I was optimizing LEFT joins that time by adding a record with id 0 to compensate the NULLs that would otherwise be joined and that still was faster with VFP6...

Bye, Olaf.
 
Hi OlafDoschke,

there is one problem in your example

you have to compare WHERE clause with INNER JOIN clause
instead of LEFT JOIN


Zhavic



---------------------------------------------------------------
In the 1960s you needed the power of two Comodore64s to get a rocket to the moon. Now you need a machine which is a vast number of times more powerful just to run the most popular GUI.
 

Olaf,

I just did the following test:

Table: Bigtable - half a million rows

Table: all_zips - 703 rows.

First test:

Code:
SELECT bt.fullname, bt.address, bt.city, bt.state, bt.zipcode, zp.comment ;
FROM bigtable bt ;
INNER JOIN all_zips zp ON bt.zipcode = zp.zipcode ;
INTO CURSOR test

Average time (over ten tests): 2.251 seconds


Second test:

Code:
SELECT bt.fullname, bt.address, bt.city, bt.state, bt.zipcode, zp.comment ;
FROM bigtable bt, all_zips zp WHERE bt.zipcode = zp.zipcode ;
INTO CURSOR test

Average time (over ten tests): 2.251 seconds


I think you'll agree that's pretty conclusive.

Mike



__________________________________
Mike Lewis (Edinburgh, Scotland)

My sites:
Visual FoxPro (www.ml-consult.demon.co.uk)
Crystal Reports (www.ml-crystal.com)
 
Hi Mike,

with which VFP version - 9?
I admitted twice - I only found the performance hit with VFP6 and 7.

Also in my example both tables are quite large.

zhavic:
putting the join condition in the WHERE clause is leaving it being an inner join, as the WHERE clause will filter only those pairs of records that match from both sides, so it's still an inner join, no problem. Simple example:

Code:
Create Cursor curTest (iID1 I)
Create Cursor curTest2 (iID2 I, iFID I)
Insert into curTest values (1)
Insert into curTest values (2)
Insert into curTest values (3)

Insert into curTest2 values (1,1)
Insert into curTest2 values (2,1)
Insert into curTest2 values (3,3)
Insert into curTest2 values (4,4)

Select * from curTest join curTest2 on curtest2.iFID = curTest.iID1

Select * from curTest inner join curTest2 on curtest2.iFID = curTest.iID1

Select * from curTest,curTest2 where curtest2.iFID = curTest.iID1

&& same results, WHERE clause equvalent to JOIN or INNER JOIN

That's why Tamar spoke about the unfair comparison. I just talked about a LEFT join as way back in time I optimized left joins that way. I took precautions to get the same result: Introducing records with ID 0 and referential integrity forbidding non existant foreign keys to name two precautions.

Bye, Olaf.
 

Olaf,

with which VFP version - 9?
I admitted twice - I only found the performance hit with VFP6 and 7.


I originally did the test in 9.0, but I have now also done it in 7.0 and 8.0 -- with the same physical files on the same computer (good thing I've got nothing better to do on a Friday evening).

The results are interesting:

Code:
[b]

                7.0       8.0        9.0

WHERE           1.382     2.160      2.251

JOIN            1.381     2.159      2.251
[/b]

All times in seconds; all based on an average of 10 timings, re-launching VFP after each test.

Based on those figures, VFP is getting slower (at least, for this particular test). But it confirms my assertion that there is no difference between WHERE and JOIN.

Mike


__________________________________
Mike Lewis (Edinburgh, Scotland)

My sites:
Visual FoxPro (www.ml-consult.demon.co.uk)
Crystal Reports (www.ml-crystal.com)
 
Hi Mike,

Well done, but using the same files means you did it with VFP9 tables. Create the tables or cursors with the each version and I'm sure you'll see a difference. I didn't make up my time measurements.

It would be interesting if you then still would find comparable results with WHERE and JOIN, then there seems to be something else having influence, eg hardware, perhaps even a german (localized) OS instead of the english version. (Just kidding).

Thanks for "wasting" your time, though :).

The rising time is confusing and worrying me. I assume you built the tables with VFP9 and so VFP9 should be able to get the best result. Or did you built them with VFP7 so each version should be able working with them? Be aware that VFP7 indexes were the biggest created in vfp version history. Maybe also the complexer SQL engine allowing more Select variants slows the execution down?

Bye, Olaf.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top