Chris Miller
Programmer
Well, at least mostly in shared mode.
An idea to do PACK without exclusive access is to fill the gaps left by deleted rows by moving undeleted rows into them.
Eventually all deleted records are at the end of the DBF and it can be truncated. That still needs write access - in fact exclusive access - to adjust the reccount in the DBF header, write an EOF byte and truncate the file. But these operations only need exclusive access for a short period.
Anyway, here's the first draft of the idea, which uses SCATTER/GATHER to move whole records:
Code:
Lparameters tcDBF
If Pcount()=0
tcDBF = Dbf()
Endif
Local lcDeleted, lcAlias
lcDeleted = Set("Deleted")
lcAlias = Alias()
Local lcSuffix, lnFirstDeleted, lnLastUndeleted
lcSuffix = Sys(2015)
Local lcAliasIsDeleted, lcAliasUnDeleted, ('loRecord'+lcSuffix)
lcAliasIsDeleted = 'Deleted'+lcSuffix
lcAliasUnDeleted = 'UnDeleted'+lcSuffix
Set Deleted Off
* Use the passed in DBF file twice for two record pointers on deleted and undeleted records
Use (tcDBF) In Select(lcAliasIsDeleted) Shared Again Alias (lcAliasIsDeleted)
Use (tcDBF) In Select(lcAliasUnDeleted) Shared Again Alias (lcAliasUnDeleted)
tcDBF = Dbf()
Erase (Getenv("TEMP")+"\IsDeleted.idx")
Erase (Getenv("TEMP")+"\UnDeleted.idx")
Select (lcAliasIsDeleted)
Index On Recno() To (Getenv("TEMP")+"\IsDeleted.idx") For Deleted() Additive
Locate
lnFirstDeleted = Recno()
Select (lcAliasUnDeleted)
Index On Reccount() - Recno() To (Getenv("TEMP")+"\UnDeleted.idx") For Not Deleted() Additive
Locate
lnLastUndeleted = Iif(Eof(), 0, Recno()) && 0, if no undeleted record exists at all, so the truncation step doesn't extend the file
* While the first deleted record is before the last undeleted record, move that undeleted record to fill the gap:
Do While lnFirstDeleted<lnLastUndeleted And ;
Not Deleted(lcAliasUnDeleted) And Deleted(lcAliasIsDeleted) And ;
Not Eof(lcAliasUnDeleted) And Not Eof(lcAliasIsDeleted)
* copy undeleted row with scatter/gather
Scatter Name ('loRecord'+lcSuffix) Memo
Select (lcAliasIsDeleted)
Recall && causes next first deleted row to move to the top of IsDeleted.idx
Gather Name ('loRecord'+lcSuffix) Memo
Locate && positions on the row that's the first deleted row now.
lnFirstDeleted = Recno()
Select (lcAliasUnDeleted)
Delete && causes next last undeleted row to move to the top of UnDeleted.idx
Locate && position on the row, that's the last undeleted row now.
lnLastUndeleted = Recno()
Enddo
Set Order To 0
Set Index To
Set Deleted &lcDeleted
* post process: truncate file to remove all deleted records.
Local lnFile, lnLastByte
lnLastByte = Header()+lnLastUndeleted*Recsize()+1
Use In Select(lcAliasIsDeleted)
Use In Select(lcAliasUnDeleted)
Select 0
Erase (Getenv("TEMP")+"\IsDeleted.idx")
Erase (Getenv("TEMP")+"\UnDeleted.idx")
lnFile = Fopen(tcDBF,1) && buffered write access
If lnFile<0
* ? "can't truncate DBF"
Else
* change reccount in DBF header offset 4:
Fseek(lnFile, 4)
Fwrite(lnFile, Chr(Bitand(lnLastUndeleted,0xff)),1)
lnLastUndeleted = Bitrshift(lnLastUndeleted,8)
Fwrite(lnFile, Chr(Bitand(lnLastUndeleted,0xff)),1)
lnLastUndeleted = Bitrshift(lnLastUndeleted,8)
Fwrite(lnFile, Chr(Bitand(lnLastUndeleted,0xff)),1)
lnLastUndeleted = Bitrshift(lnLastUndeleted,8)
Fwrite(lnFile, Chr(Bitand(lnLastUndeleted,0xff)),1)
* change the byte after the last undeleted record to EOF (0x1A); FSEEK positions are 0-based,
* so the EOF byte belongs at offset lnLastByte-1
Fseek(lnFile, lnLastByte-1)
Fwrite(lnFile, Chr(0x1A),1)
* truncate the file directly after the EOF byte
Fchsize(lnFile, lnLastByte)
Fclose(lnFile)
Endif
If Not lcAlias==""
Select (lcAlias)
Endif
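For completeness, a minimal usage sketch - assuming the routine above is saved as sharedpack.prg (that file name is my choice here, not part of the routine):
Code:
* hypothetical usage of the shared-pack routine above
Use mytable Shared
Do sharedpack With Dbf()
* or, since tcDBF defaults to the current workarea's DBF():
Do sharedpack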
This is programmed with the least side effects in mind. The default path isn't changed to TEMP (which would make some code shorter), the currently selected workarea and the DELETED setting are restored, and the temporarily needed indexes are not generated in the CDX of the DBF but as separate IDX files, which are erased from TEMP at the end.
There are some limitations:
1. With a primary or candidate index, copying a record creates a duplicate key and so violates the uniqueness of the index. Big bummer, as every table should have a PK index.
2. The record copying doesn't work with autoinc fields. Another big bummer, as autoincs are quite often used for IDs. That could be avoided by using other PK types, but limitation 1 still applies.
3. As SCATTER/GATHER writes a copy of the memo field values, the moved record won't simply point to the same offset in the FPT as the source record; instead the FPT changes the same way a REPLACE or UPDATE would, in the worst case causing memo bloat.
Number 1 is the only real problem. 2 and 3 merely restrict the use of autoinc, and memo bloat can be removed afterwards with PACK MEMO; side effects on the CDXes are less often problematic, but could call for a REINDEX. On the positive side: in a DBF with very few deleted rows, rearranging the records is much faster than PACK's complete dbf/fpt/cdx rewrite.
Limitations 2 and 3 could be overcome by not using SCATTER/GATHER, but instead using the low-level file functions FREAD and FWRITE to copy records. That leaves the CDX and FPT as they are. But it comes with another, even nastier problem: the CDX tags will point to the old record numbers, so the indexes become wrong and need a REINDEX.
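To make that low-level alternative concrete, here's a hedged sketch of such a raw record copy. lnFile, lnHeader, lnRecsize, lnSource and lnTarget are assumed variables I introduce for illustration, not part of the routine above:
Code:
* sketch only: copy the raw bytes of record lnSource into the slot of
* record lnTarget. Assumes the DBF was opened with Fopen() as lnFile and
* lnHeader/lnRecsize were taken from Header()/Recsize() beforehand.
Local lcRecord
Fseek(lnFile, lnHeader + (lnSource-1)*lnRecsize)
lcRecord = Fread(lnFile, lnRecsize)
Fseek(lnFile, lnHeader + (lnTarget-1)*lnRecsize)
Fwrite(lnFile, lcRecord, lnRecsize)
This copies the deletion flag along with the field bytes, so the target slot would still need its flag set back to undeleted, and the CDX problem described above remains.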
For the sake of completeness: the method moves undeleted rows from the end to the front of the DBF and thus causes physical reordering. If that plays a role, rethink your programming - tables should be treated as sets of data, and physical order shouldn't matter. And last but not least: if the code fails to get a file handle for writing, the table can still be considered optimized, as SEEKs and LOCATEs start from the top; so after shared packing, even without truncation, your LOCATEs and SEEKs should run faster again. The net undeleted data is the same with or without truncation.
And by the way: if you do RECALL and BLANK DEFAULT AUTOINC instead of APPEND BLANK or INSERT, you can recycle the space of deleted rows, too. With that approach autoincs are no problem either. But you only get a DBF free of deleted rows in the long run, once the space of all deleted rows has finally been reused.
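A minimal sketch of that recycling idea, using VFP 9's BLANK command (table handling around it is simplified and the fallback is my assumption):
Code:
* instead of a plain APPEND BLANK: reuse the first deleted row, if any.
* Needs SET DELETED OFF, otherwise LOCATE FOR DELETED() finds nothing.
Set Deleted Off
Locate For Deleted()
If Found()
   Recall
   Blank Default Autoinc && reset fields to their defaults, draw a new autoinc value
Else
   Append Blank
Endif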
Chriss