Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Size of FPT file becoming very large 3

Status
Not open for further replies.

ManniB

Programmer
Nov 9, 2020
135
DE
Hi,

I've written a program which reads contents from one table to another table. The first table has only character fields, the second table has also memo fields to store texts larger than 254 characters. About 50% of the contents in the second table are stored in character fields, the other 50% in memo fields.

After executing the program the size of the resulting dbf file is 5 MB but the fpt file size is 1,8 GB (near to the 2GB limit!!), although the size of the fpt file is expected to be only about 5-6 MB (slighlty larger than the dbf file).

After issuing PACK or PACK MEMO, the size of the fpt file shrinks from 1,8 GB to 6 MB and everything works.

However, I'm concerned why the size of fpt file has grown so much in the first place. In the program there are never any records deleted in the table, so I don't even understand why PACK works wonders here. I'm also worried that next time, when I process more data, the 2GB limit is reached and I don't even get the chance to issue a PACK command.

What could be the reason for the fpt file for becoming so large?

Thank you very much in advance!

Manni
 
It is called memo bloat, when memos are written to the .fpt file VFP does not generally reuse the space taken
but just allocates new blocks and wastes the old ones. I am not sure how this applies to your exact process
but of records are being updated more than once the memo file will bloat.

Maybe you could do a memo pack during your process from time to time?

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Thank you Griff!

I was able to to narrow down the problem, and only one memo field is responsible for the increase in size.

Indeed, content was added to the memo field more than once with commands like these:

REPLACE mytable.mymemofield WITH text1
...
REPLACE mytable.mymemofield WITH text2 ADDITIVE
...
REPLACE mytable.mymemofield WITH text3 ADDITIVE


I supposed using REPLACE...ADDITIVE for memo fields is causing the memo bloat you mentioned!


Would it be a solution to use instead:

REPLACE mytable.mymemofield WITH text1
...
REPLACE mytable.mymemofield WITH mytable.mymemofield + text2
..
REPLACE mytable.mymemofield WITH mytable.mymemofield + text3

Or will VFP also allocate new blocks everytime the data in the memo field is changed?

So doing a PACK MEMO after every 100 or so records during the process would be the only solution?

Regards,

Manni



 
Manni,

I believe the size of the memo file can be affected by the block size. The default is 64. Changing it may help.

See the SET BLOCKSIZE command.

Steve
 
Neither method, (additive or otherwise) is going to help

I would be tempted to see if I could store the text in temporary tables and the combine near the end of the processing

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Instead of this:
[pre]
REPLACE mytable.mymemofield WITH text1
...
REPLACE mytable.mymemofield WITH mytable.mymemofield + text2
..
REPLACE mytable.mymemofield WITH mytable.mymemofield + text3[/pre]

do this:

[pre]lcMemo = text1
...
lcMemo = m.lcMemo + text2
...
lcMemo = m.lcMemo + text3
...
REPLACE mytable.mymemofield WITH m.lcMemo
[/pre]

That is, collect all the data into a variable and then store it to the memo field. That will an enormous impact on the size of the memo file.

FWIW, I wrote about this a long time ago:
Tamar
 
There you go Tamar put that very succinctly

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Why not the even easier:
Code:
REPLACE mytable.mymemofield WITH text1+text2+text3
?
 
Thanks Tamar, I will implement it that way.

The only thing I wonder about is, I assumed that memo bloat would only happen in shared mode. However, it also occured after opening the the table in exclusive mode (VFP 9 SP2).

 
Hi Dan,
because when collecting everything in a variabel it gives you a better overview and for example when the inclusion of text2 depends on an if-clause, you might not automatically want to add it.

 
Steve, thank you, judging from the interesting article Tamar posted, setting BLOCKSIZE to 0 might also help when the table is used exclusive.

 
Manni,

I haven't seen any specific guidance on optimizing the BLOCKSIZE for a certain
situation. Maybe it's a tradeoff between speed and bloat.

I'd be curious to know if setting it optimally (if possible) actually makes a noticeable difference.

Steve
 
Someone did a whole lot of testing around Blocksize way back when. Might have been Mac Rubel; if it was, then it would have been in FoxPro Advisor. But I'm really not sure who did it.

Tamar
 
I think the 'narrative' on this is here:


But I am too tired to explore it tonight

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Griff's link seemed like an extended discussion (debate?) on the workings of replacing a memo field with short and/or long data. A bit confusing to me. My conclusion is the only sure way for a particular situation is to EXPERIMENT.

Steve
 
Well, the solution is very clear: Only write once to the table and you get no bloat.
A memo field's block in the FPT file is only reused if a new value has the same length or becomes shorter.
As your text becomes longer and longer the FPT block can't be reused, so VFP reserves a new block and doesn't reuse the old one.

So the solution is to accumulate the text you want to store in a variable and only write it once into the memo field when it's complete.

Setting the blocksze means setting a minimum block for any memo content, which highers the chance of reusing it in a REPLACE or UPDATE. When the new value is longer but still below blocksize, that means no bloat, but setting blocksize high also means a single letter or word wastes a lot of space.

The only blocksize which gives the best size is blocksize 0 in combination with only writing each memo content once to get no bloat. Also, PACK MEMO minimizes the FPT size with blocksize 0 as no padding is done, but it also means when a memo edit makes it longer you get bloat.

The larger the table size is, the longer the PACK MEMO needs. So sometimes having a bit of padding space is still good to not get bloat when a memo value becomes longer. To not need PACK MEMO. If you only generate the table once and use it read only, the best size is using blocksize 0 and only writing each memo value once.

Chriss
 
Thank you Chris.

So I assume, when a memo field is often updated and most of the time content ADDED, then choosing a big blocksize beforehand is the best option to avoid memo bloat.

If tables where created already and you issue SET BLOCKSIZE 500 at the beginning of your application (in main.prg), does this affect the table when content is added to a memo field next time, or how do you change the blocksize of an already existing table?

 
how do you change the blocksize of an already existing table?

If I may jump in here, you can issue SET BLOCKSIZE for an existing table, but for the new blocksize to take effect, you must copy the table to a new table. You can then delete the original table and rename the new version back to the old.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Thank you Mike.

Is the new blocksize then stored in the table header?

 
Manni,

yes, the blocksize is in the header - of the FPT file. The help describes the binary structure of many VFP file types, including .FPT.

I don't think changing it is a solution, though, because it may affect how old values are read, so you can't have multiple blocksizes as the DBF stores the offset into the FPT file and the start of a memo block structure has its length. I'm not in th details, but one or both of them are a factor of the blocksize. So changing that in an existing table would also need a rewrite of all the offsets in the DBF and/or length in the memo blocks themselves.

So a table works with the blocksize that's set during its definition.

Chriss
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top