DELETE Query very memory inefficient

Status
Not open for further replies.

LCoder (Programmer) · Mar 29, 2004 · DE
We are using Access 97 and have the following problem. We need to do a lot of single deletes on a table, and we do this with a DAO.QueryDef object whose SQL we change several times. We use plain SQL, i.e. we set

qryDef.SQL = "DELETE * FROM table WHERE condition"
qryDef.Execute

We do this roughly 2000 times with different conditions. During this process the size of the database "explodes" from its normal 4MB to 62MB. We can compress the database and it shrinks back to 4MB again. What is going on here?
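For context, the loop looks roughly like this (a sketch of the pattern described above, not the actual application code; the table name, field name and querydef name are placeholders):

```vba
' Sketch: one reused querydef, SQL changed per iteration (names are placeholders).
Dim db As DAO.Database
Dim qryDef As DAO.QueryDef
Dim i As Long

Set db = CurrentDb
Set qryDef = db.QueryDefs("qryDummyDelete")   ' reused "dummy" querydef

For i = 1 To 2000
    qryDef.SQL = "DELETE * FROM tblData WHERE KeyField = " & i
    qryDef.Execute dbFailOnError
Next i
```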
 
Just how the MS Access product works. Any actions, such as Delete, Insert, etc. are logged within system tables of MS Access.

Steve Medvid
"IT Consultant & Web Master"

 
Steve
Just how the MS Access product works. Any actions, such as Delete, Insert, etc. are logged within system tables of MS Access.
Are you sure? I've never heard that. That's like logging, so why can't you roll-forward?

LCoder: are deletes the only thing you are doing while the size grows?

I'm just wondering - do you create a new querydef for each delete? Bear in mind a querydef is a stored query plan, so it will take up space in the database unless it is removed.

My final concern is how you can have 2000 conditions that are all so different you can't combine them to some degree. What sort of conditions are they?

 
Hi Mike and Steve,

thanks for your reply.

Steve: No, I am not creating a new querydef each time; I reuse a "dummy" querydef and change its SQL statement each time. So this is probably not the reason for the increase in size.

The table that we do the deletes on holds cash flow data for trades. Each day approximately 2000 trades have changes that result in a change to their cash flows. The MS Access DB generates these cash flows, but before it can insert the new cash flows the old ones have to be deleted. That is why we need so many delete queries.

My guess for the increase in size is that after, say, 10 single deletes Access automatically starts to optimise the table by regenerating the whole table (using new disk space) without releasing the disk space that the old table used. I will try to reduce the number of delete actions somehow and see what happens.

If you have any further ideas I will be happy to hear from you. Thanks again.

Cheers
Lars
 
Each day approximately 2000 trades have changes that result in a change of the cash flows.
It looks like each of the 2000 deletes is actually of the same type. Why not, for example, put the keys into a table or file and run one SQL statement based on that table/file? This will be much quicker.
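A sketch of that idea, assuming the 2000 keys are first loaded into a scratch table (both table names and the key field name here are placeholders, not anything from the original post):

```vba
' One set-based delete instead of 2000 single ones.
' tmpDeletes is assumed to hold one row per key to remove.
Dim db As DAO.Database
Set db = CurrentDb
db.Execute "DELETE FROM tblData " & _
           "WHERE KeyField IN (SELECT KeyField FROM tmpDeletes)", dbFailOnError
Debug.Print db.RecordsAffected & " rows deleted"
```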

My guess for the increase of the size is that after, say 10 single deletes the MSAccess DB automatically starts to optimize the table by regenerating the whole table (using new disc space) and not releasing the disc space that the old table used.
Almost certainly it doesn't do that.

Two thousand deletions should lead to no change in size. Two thousand insertions obviously will increase the size depending on the amount of data per record. When you compact, all the space taken up by the 2000 deleted records is recovered and (essentially) given to the new records, so the database goes back to its original size.
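For reference, compacting does not have to be done from the menu; DAO exposes it as DBEngine.CompactDatabase. A minimal sketch (the file paths are placeholders, and the source database must be closed while it is compacted):

```vba
' Compact into a new file, then swap it in (paths are placeholders).
DBEngine.CompactDatabase "C:\data\trades.mdb", "C:\data\trades_compact.mdb"
Kill "C:\data\trades.mdb"
Name "C:\data\trades_compact.mdb" As "C:\data\trades.mdb"
```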

 
I did some general testing with different ways of deleting data in a table and compared the size of the database before and after the delete process.

The test goes as follows:
step 1: insert 500 sets of 10 records into a table; each record in a set has the same identifier
step 2: for each identifier, delete its 10 records and immediately insert them again; do this for all 500 identifiers

Theoretically the size of the database after step 1 and step 2 should be equal. In reality this is only the case after compressing the database.

Interestingly, the size of the database (before compressing) depends on how the delete operation is done. I tried 3 ways:
1.) use an SQL DELETE statement for each identifier (i.e. 500 DELETE calls)
2.) initially delete all records from the table (i.e. 1 DELETE call)
3.) use the Delete method of the DAO Recordset object (DAO.Recordset.Delete)

I did step_1 and compressed the database. After this the initial size of the DB is 600 KB. Here are the sizes of the DB after step_2 for the different delete methods:
1.) 3,650 KB
2.) 1,204 KB
3.) 802 KB

I did step 2 again and found
1.) 7,200 KB
2.) 1,812 KB
3.) 898 KB

After compressing the DB the size went back to 600 KB in all cases.

A very striking observation: when I use a DB with a completely empty table and run the delete operations with method 1.) (i.e. the 500 SQL DELETE statements), the size of the DB increases to 3650 KB ALTHOUGH THERE WAS NOTHING TO BE DELETED. I guess this is a bug in Access. Has anybody heard of this?
 
Can you say that again slowly? I'm getting confused with the different 1s and 2s.
step 1: insert 500 times a set of 10 records into a table. Each record each set has the same identifier
If each record has the same identifier, don't you get lots of rejections because of duplicates?

 
Sorry Mike, you are right. The way I wrote it, it is not easy to understand.

We have a table (let's call it "cashflows") with the following four fields:
TID: TradeID
date: date of cashflow
ccy: currency
amount

The primary key is the combination of TID+date+ccy. This is necessary because one trade can have multiple cashflows in multiple currencies but has only one TID. What my test does is the following: it generates a trade with TID = "trade_x" (where x is a number between 0 and 500) and for each trade it inserts 10 cashflows into the "cashflows" table. In step 2 I delete these 10 cashflows (e.g. via SQL "DELETE * FROM cashflows WHERE TID = 'trade_x'") and reinsert them afterwards.

Hope this clarifies what I am doing.
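A rough reconstruction of step 2 as described (only the schema and SQL above are from the post; the inserted values and date arithmetic are placeholders):

```vba
' Step 2: per identifier, delete the 10 cashflows and reinsert them.
Dim db As DAO.Database
Dim x As Long, n As Long

Set db = CurrentDb
For x = 0 To 500
    db.Execute "DELETE * FROM cashflows WHERE TID = 'trade_" & x & "'", _
               dbFailOnError
    For n = 1 To 10
        ' Reinsert the 10 rows; values here are made up for the sketch.
        db.Execute "INSERT INTO cashflows (TID, [date], ccy, amount) " & _
                   "VALUES ('trade_" & x & "', Date() + " & n & ", " & _
                   "'EUR', " & n * 100 & ")", dbFailOnError
    Next n
Next x
```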
 
Hi

A guess here: I think the increase in size is being caused by changing the SQL in the QueryDef.

Have you tried either

DoCmd.RunSQL "DELETE ..."

or Db.Execute "DELETE ..."

and monitored the db size
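One way to monitor the file size between runs from code (FileLen is standard VBA; the path is a placeholder, and note the size on disk may not reflect writes Jet has not yet flushed):

```vba
' Compare the .mdb file size before and after a delete.
Dim sizeBefore As Long, sizeAfter As Long
sizeBefore = FileLen("C:\data\trades.mdb")

CurrentDb.Execute "DELETE * FROM cashflows WHERE TID = 'trade_1'", dbFailOnError

sizeAfter = FileLen("C:\data\trades.mdb")
Debug.Print "Growth: " & (sizeAfter - sizeBefore) & " bytes"
```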

Regards

Ken Reay
Freelance Solutions Developer
Boldon Information Systems Ltd
Website needs upgrading, but for now - UK
 
LCoder

Thanks for the clarification. I'll need to read that on the train. (Also need to get through another episode of 24 - did those computing pioneers realise that all their work would ultimately be used for viewing films on trains?)

 
What I think you're saying is: you add 5000 records and compact the database. The mdb size is now 600kb.

You then delete the 5000 records. I would expect no change in size at this point. Jet would mark all 5000 records in the data pages, and their entries in the index pages, as deleted, but not add or delete any data. So the size should stay 600kb.

Then you add the 5000 records back in again. I would expect all 5000 records to be put at the end of the existing tablespace, i.e. Jet does not re-use rows marked as deleted. As for the index pages, I would expect Jet to attempt to add each new index entry only to find there is already one there, so it would change the existing entry to undeleted and update its pointer; hence no change in the space taken up by the index pages.

If we look at your figures, we note the size first grows by 202Kb (minimum). This suggests the data is about 35-40 bytes per record. However, when you repeat the process it only grows by 96 kb. That is a complete mystery. And why does method 1 take the file to 3.6MB? I just cannot see what could be going on here.

Worth posting these results on the Microsoft site.

 