Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

make this quicker ? 2

Status
Not open for further replies.

foxup

Programmer
Dec 14, 2010
328
CA
Hi,

I have a very simple query:

DELETE FROM REMTRAN WHERE remtran.custno NOT in (sele custno FROM client)

Just some quick details:

REMTRAN has 2.5 million records
CLIENT has 56,000

Geez, it's slow, and quicker way????

Please help.


Thanks,
FOXUP!
 
Have you tried just using Related tables and compared the resultant time?

Code:
SELECT Client
SET ORDER TO CustNo

SELECT RemTran
SET RELATION TO CustNo INTO Client
DELETE FOR EMPTY(Client.CustNo)

I don't know if it would be noticeably faster, but its worth a try.

Don't forget that after your DELETE (whichever method you end up using), you still will experience additional time involved if you run a PACK.

Good Luck,
JRB-Bldr

 
Hi

JRB has the basic idea.

First you need to be sure there is an index on custno that you can use.

If not, make one
Code:
select 0
use Client exclusive
index on custno tag client
select 0
use remtran exclusive
go top
do while .not. eof()
  select client
  set order to custno
  seek(remtran.custno)
  if !found()
    select remtran
    delete
  endif
  select remtran
  skip
enddo
select remtran
pack
messagebox("Done")




Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.
 
It's interesting, because my my boss had a problem with a similar sql to not delete anything at all.

This would be another way in sql:
DELETE remtran FROM remtran INNER JOIN Client on remtran.custno = client.custno Where client.custno IS NULL

@foxup: I tried both yours and my sql and both are slow even in a case of rather small tables with ~20000 child and ~2000 parent records.

JRB-Bldr's xbase way using relations is much faster. I actually don't see why this is the case. Rushmore sys(3054) says it's optimizing the join condition fully via an index on the primary key.

Anyway, I'd change his last line to DELETE FOR EOF("Client").

It's still awkward, as a select similar to Select * from remtran INNER JOIN Client on remtran.custno = client.custno or Select * from remtran Where remtran.custno in (Select custno from client) is really fast. It's of course selecting the records that should remain, so it doesn't really help. But it shows the culprit really is the NOT.

Bye, Olaf.
 
Another strange observation, you could confirm:

SELECT * FROM REMTRAN WHERE remtran.custno NOT in (sele custno FROM client)

So, simply changing from DELETing to SELECTing the records, is this running fast? For me it is.

Seems rushmore on DELETEs is not running the same optimisation as rushmore on SELECTs, and that really looks like a bug or at least a flaw of the sql engine. Guess we have to live with it.

Bye, Olaf.
 
Hi Foxup,

In addition to the excellent advice you've already been given, I would also check that you have an index on client.custno. It won't make any difference in your original code, but if you used Olaf's alternative SQL, it could make a significant difference.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips, training, consultancy
 
Mike,

I am measuring a very slow delete even though the index is there. Of course it would be even slower, if it wasn't there, but to my surprise and even though rushmore also reports to use the index even in the case of NOT IN (subselect) to make a join of the two tables, rushmore's not making the delete fast, but rushmore does make a fast select (!?). It's really strange.

Both the select and delte version with NOT IN (subselect) report the similar rushmore optimisation:

Rushmore optimization level for table "child": none
Rushmore optimization level for table "parent": none
Joining table "child" and table "parent" using index tag "id" (primarykey index)

Even though it's a subselect in the where clause, foxpros sql engine is making a join of it using the index. Still there is a difference: The select runs in split seconds, while the delete needs a few seconds, even in that order, that is even thought delete has the advantage of profiting from caching the data.

Bye, Olaf.
 
Oops

In addition to the advice above... my code has an error in it!

it should read:

set order to client

not

set order to custno

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.
 
Olaf,

That's interesting. It's not what I would expect, but what do I know?

Griff,

Your code looks like it should work, but I'm surprised that any solution that involves looping is faster than a single SQL statement. Still, it's worth testing the various possibilites.

Mike


__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips, training, consultancy
 
foxup,

to summarize:

1. An index on client.custno will surely help anyway, if you don't have one already. It's even nonoptional, when using xbase solutions of JRB-Bldr or Griff.

2. go the xbase way in this case, that is using a relation or seeks.

You hit a flaw of the sql engine, here. The rushmore optimisation fails on the sql delete, even on my alternative sql delete, which would need to be change to a left join besides, but that also doesn't change it's bad performance.

Bye, Olaf.
 
I can try this

sele client
index on custno tag one && Remove this line if you already have index on this field and set order to custno
sele remtran
BLANK FOR !seek(remtran.custno,'client')

I may not use dele & pack because it may save my time & blank record I can re-use it.
 
You're saying that you want to blank out the record rather than deleting it?

I doubt if that would save time (but it might be worth trying). You'd also have to factor in the overhead of keeping track of the blank records. Every time you need to insert a new record, you'd have to search for a blank one, which might take a long time, given the size of the table.

But the main objection is that it would probably take just as long (and perhaps longer) to blank the record than it would to delete it.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips, training, consultancy
 
newtofoxpro,

in general also a nice xbase way.
And of course pack only is available if you can get exclusive access to the client table.

I would prefer delete over blank anyway, as you have no side effects on sql, which blanked records may have. You can still RECALL to reuse deleted records.

Overall this problem points out a missing referntial integrity trigger either rejecting deletion of a client, if related records exist in remtran, or cascading the deletion of the client to subtables like remtran.

If you're using free tables instead of a dbc usage of referential integrity triggers is not possible of course but it would solve the deletion problem in making them unnecessary overall, as they'd be done (or prevented, dpending on how the ri rules are defined) automatic.

Bye, Olaf.
 
I may not use dele & pack because it may save my time & blank record I can re-use it. "

pack is time consuming and may not use in multiuser version

sele retran
index on custno tag one && Remove this line if you already have index on this field and set order to custno
go top
if empty(custno)
replace custno with m.custno
else
insert into retran (custno) values (m.custno)
endif
 
I would prefer delete over blank anyway, as you have no side effects on sql, which blanked records may have. You can still RECALL to reuse deleted records."

For myself, always, the # of record is going to ADD is more than blank. It is not possible for me, I am adding 100 customer per day and deleting 200 customer.

Anyway, this is additional idea which I use.
 
Before very long you won't have any customers NTFP

B-)

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.
 
It's true that Pack takes time, but that shouldn't really enter into the equation. The method you adopt for deleting the records doesn't affect the decision to pack or not to pack.

It would be perfectly valid to do the deletions and never pack the table. Whether that's more efficient in the long run depends on what other operations you perform on the table.

Regardless of that, I would advise you not to rush into the idea of blanking records and recyling them rather than deleting, for all the reasons discussed. But especially, you should not take that approach merely to avoid the overhead of packing.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips, training, consultancy
 
You say: "pack is time consuming and may not use in multiuser version."

Yes, but you can use delete without pack of course. Besides sql flaws, blanking values can also lead to hurting indexes and field rules. Finding a deleted row also is easy via locate for deleted(), while a blank record can not be identified by a single blank field alone.

It may be fine in your situation, but I'd not use balnk in general as deletion replacement. Your logic can also be used with DELETE FOR !seek(remtran.custno,'client') and will take the same time. It would be about the same thing as using a relation, but I'd bet a relation and checking for EOF() instead of calling SEEK for each record is faster.

Bye, Olaf.
 
I sometimes, not often, use a recycle technique.

If I have a situation where a parent record leads to a considerable quantity of children, and the user is likely
to want to change those children a lot. I sometimes commit the changes by deleting all the children, blanking the records and
then re-add the records by searching for the first deleted blank record and recalling it, then replacing the values - only adding 'real' new ones when the deleted ones run out.

The specific instance I'm thinking of is a workshop management system, where you have thousands of orders, tens of thousands of order items, each of which then has dozens of operations, all of which lead to literally millions of individual activities (grinding, polishing, cutting). If I just deleted and added new the table for the activities would be enormous.

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.
 
To give an impression with concrete timings:

In my case the difference between the slow DELETE-SQL and the fast SELECT-SQL is 20 seconds (!) vs 0.1 sec. And I'm only talking about 2000 and 20000 records in parent and child table, foxup has 50000 and millions!

I'd bet without further testing the differences of blank for seek() vs relation and delete vs seek & delete (each using the xbase DELETE command deleting the current record) are neglectible in comparison.

Bye, Olaf.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top