Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Exactly what is in that 4-byte memo field in the DBF?

Databases and tables

Exactly what is in that 4-byte memo field in the DBF?

by  rlawrence  Posted    (Edited  )
Recently, I have had to deal with another corrupt memo file. This was for my own data--rather than a clients, and I wasn't ready to give up on it. Like many of you, I have developed a number of tricks to recover from "Memo file is missing or invalid." error messages. This particular problem was a bloated memo file (~1.5Gb) that caused a fatal error when I attempted to browse through my records. I was tired of feeling helpless every time something happened to a memo file; so, I decided to dig deeper.

I began by opening the FPT and DBF files with Hexedit. You can certainly see the content in each of these files. The following articles on MSDN gave me a pretty good idea of what to look for:

Structure of the DBF file: http://msdn.microsoft.com/en-us/library/st4a0s68(v=VS.71).aspx

Structure of the FPT file: http://msdn.microsoft.com/en-us/library/8599s21w(v=VS.71).aspx

As I searched through Hexedit, I could make out the content of the 4-byte memo fields in my record. I realize that these are pointers of some sort into the FPT file, but there is nothing in these articles that tells you what these 4 bytes actually hold. There is a vague reference in the DBF article that states that "Integers in table files are stored with the least significant byte first." It turns out that the opposite is true in the FPT file. (Go figure.) That at least tipped me off to reversing my bytes when looking at the memo field contents.

To make things easier to see, I wrote a little program to dump the content of the memo fields in my table. Here's a sample of the output:

Code:
Opening Table file: C:\USERS\PUBLIC\PADATA\PUBASSIST.COM\CONTACT.DBF.File Handle:  9
File Size:  372981  0x0005B0F5
Record#:    1    PO_ADDR: 01 71 EA 51     COUR_ADDR: 00 00 00 00     EMAIL: 01 71 EA 53     WEB_URL: 00 00 00 00     WEBSERVICE: 00 00 00 00     BIOGRAPHY: 00 00 00 00     PORTRAIT: 00 00 00 00     COMMENT: 00 00 00 00     
Record#:    2    PO_ADDR: 01 71 EA 58     COUR_ADDR: 00 00 00 00     EMAIL: 01 71 EA 59     WEB_URL: 00 00 00 00     WEBSERVICE: 00 00 00 00     BIOGRAPHY: 01 71 EA 5A     PORTRAIT: 01 71 EA 5F     COMMENT: 00 00 00 00     
Record#:    3    PO_ADDR: 01 71 EA 60     COUR_ADDR: 00 00 00 00     EMAIL: 01 71 EA 61     WEB_URL: 00 00 00 00     WEBSERVICE: 01 71 EA 62     BIOGRAPHY: 01 71 EA 63     PORTRAIT: 01 71 EA 71     COMMENT: 00 00 00 00     
Record#:    4    PO_ADDR: 00 00 00 00     COUR_ADDR: 00 00 00 00     EMAIL: 00 00 00 00     WEB_URL: 00 00 00 00     WEBSERVICE: 00 00 00 00     BIOGRAPHY: 00 00 00 00     PORTRAIT: 00 00 00 00     COMMENT: 00 00 00 00     ...


On thing you can readily see is that an empty memo field results in all zeros in those 4 bytes. To make things even more visible, I edited another portion of my output to remove the empty memo fields. Here is a sample:

Code:
.
.
.
Record#:    8    PO_ADDR: 00 00 10 5A     EMAIL: 00 00 14 F9     
Record#:    9    PO_ADDR: 00 00 10 5B     EMAIL: 00 00 14 FA     
Record#:   10    PO_ADDR: 00 00 10 5C     EMAIL: 00 00 14 FB     
Record#:   11    PO_ADDR: 00 00 10 5D     EMAIL: 00 00 14 FC     
Record#:   12    PO_ADDR: 00 00 10 5E     EMAIL: 00 00 00 00     
Record#:   13    PO_ADDR: 00 00 10 5F     EMAIL: 00 00 14 FD     
Record#:   14    PO_ADDR: 00 00 10 60     EMAIL: 00 00 00 00     
Record#:   15    PO_ADDR: 00 00 10 61     EMAIL: 00 00 14 FE     
Record#:   16    PO_ADDR: 00 00 10 62     EMAIL: 00 00 14 FF     
Record#:   17    PO_ADDR: 00 00 10 63     EMAIL: 00 00 15 00     
Record#:   18    PO_ADDR: 00 00 10 64     EMAIL: 00 00 00 00     
Record#:   19    PO_ADDR: 00 00 10 65     EMAIL: 00 00 15 01     
Record#:   20    PO_ADDR: 00 00 10 66     EMAIL: 00 00 15 02
.
.
.

I wrote a similar program that will walk through the blocks of an FPT file. The output looks something like this:

Code:
Opening memo file: C:\USERS\PUBLIC\PADATA\PUBASSIST.COM\CONTACT.FPT.
File Handle:  21
Memo File Size:  1551539328  0x5C7A9C80

***** Header Details *********

Byte offset 0 - 3: Location of the next free block:  1 71 EA 72 
Byte offset 4 - 5: Unused
Byte offset 6 - 7:  Block Size (bytes per block):  64,  0x040
Byte offset 8 - 511: Unused

***** Memo Block Data ********

Address: 512, 0x00000200    Block Type: 1,  0x0001    Memo Length:   17,  0x00011        Content: 2720 Annelise Wayist.com/admin/ 
Address: 576, 0x00000240    Block Type: 1,  0x0001    Memo Length:   27,  0x0001B        Content: liz@entangledpublishing.com
Address: 640, 0x00000280    Block Type: 1,  0x0001    Memo Length:   19,  0x00013        Content: 41 Lawrence Heights
.
.
.


What I found, to my surprise, was that the memo pointers in the DBF table were sequential. There were some that were out of order, but it is common to find sequential numbers in these memo pointers in records that are close to each other. The question is, what does this number refer to? It can't be a simple byte offset. Sequential values would point to locations only 1 byte apart!

Well, it turns out that the pointer refers to the "block" in the memo file.

By default, memo field values are stored in 64-byte blocks in the FPT file. You can change this with the SET BLOCKSIZE command. So, multiplying the value found in the 4-byte memo pointer by the block size will yeild the offset of the first memo block for the field's value. For example, from my sample above:

0x105A * 0x40 yields an offset of 0x41680 for the PO_ADDR field in record #8.

0x105B * 0x40 yields an offset of 0x416C0 for the PO_ADDR field in record #9.

I can verify these offsets by looking at the output from my dump of the FPT file.

For others who may want to continue down this path, here is my procedure for dumping the content of the 4-byte memo field pointers in the DBF. This procedure is specific to my particular table structure, so it won't work for another table without changes. Use the offsets for the memo fields in your own table. Hopefully, it will still provide you with a clue about how to approach this yourself:


Code:
*	This procedure prints out the memo field locations in the Contact Table.

CLEAR 
CLOSE TABLES

* Locate the DBF file.
cFilePath = GETFILE("DBF", "Table File:", "Open", 0)
IF EMPTY(cFilePath)
	WAIT "No file was selected." WINDOW NOWAIT
	RETURN
ENDIF

* Create a log file.
cLogPath = JUSTSTEM(cFilePath) + [_DBF.LOG]
SET TEXTMERGE TO (cLogPath)
SET TEXTMERGE ON 

* Open the file.
\Opening Table file: <<cFilePath>>.
nHandle = FOPEN(cFilePath)
IF nHandle < 0
	\Can't open table file.
	SET TEXTMERGE OFF
	SET TEXTMERGE TO 
	RETURN
ENDIF
\File Handle:  <<nHandle>>

* Check the file size.
nSize = FSEEK(nHandle, 0, 2)
\File Size:  <<ALLTRIM(STR(nSize))>>  <<TRANSFORM(nSize, "@0")>>
IF nSize <= 0
	\"File is empty."
	=FCLOSE(nHandle)
	SET TEXTMERGE OFF
	SET TEXTMERGE TO 
	RETURN
ENDIF
=FSEEK(nHandle, 0, 0)

* Obtain the record size
nRecLen = 0x0199

* The position of the first record.
nFirstRec = 0x0648
cStr = FREAD(nHandle, nFirstRec)

* Read each record.
lEOF = .F.
i = 1
DO WHILE !lEOF

	* Read the reoord.
	cStr = FREAD(nHandle, nRecLen)

	\Record#: <<STR(i, 4)>>	

	* Pull the memo field locations from the record.
	************************************************
	* The following offsets are specific to the 
	* contact table.  If another table is to be read
	* replace the following lines with the offsets
	* and field names specific to that table.
	************************************************
	nOffset = 0x0090 + 1
	nLocation = str2hex(SUBSTR(cStr, nOffset, 4))
	\\PO_ADDR: <<nLocation>>	
	
	nOffset = 0x0094 + 1
	nLocation = str2hex(SUBSTR(cStr, nOffset, 4))
	\\COUR_ADDR: <<nLocation>>	
	
	nOffset = 0x0108 + 1
	nLocation = str2hex(SUBSTR(cStr, nOffset, 4))
	\\EMAIL: <<nLocation>>	

	nOffset = 0x010C + 1
	nLocation = str2hex(SUBSTR(cStr, nOffset, 4))
	\\WEB_URL: <<nLocation>>	

	nOffset = 0x0110 + 1
	nLocation = str2hex(SUBSTR(cStr, nOffset, 4))
	\\WEBSERVICE: <<nLocation>>	

	nOffset = 0x0166 + 1
	nLocation = str2hex(SUBSTR(cStr, nOffset, 4))
	\\BIOGRAPHY: <<nLocation>>	

	nOffset = 0x016A + 1
	nLocation = str2hex(SUBSTR(cStr, nOffset, 4))
	\\PORTRAIT: <<nLocation>>	

	nOffset = 0x016E + 1
	nLocation = str2hex(SUBSTR(cStr, nOffset, 4))
	\\COMMENT: <<nLocation>>	
	************************************************

	i = i + 1
	nOffset = FSEEK(nHandle, 0, 1)
	lEOF = (nOffset >= nSize)
ENDDO

* Close and exit.
SET TEXTMERGE OFF
SET TEXTMERGE TO 
=FCLOSE(nHandle)
MODIFY FILE (cLogPath) NOWAIT



*..............................................................................
*   Function: DEC2BASX
*    Purpose:  Convert whole number 0-?, to base 2-16 
*
* Parameters: nTempNum - number to convert (0-9007199254740992)
*             base    - base to convert to i.e., 2 4 8 16...
*    returns: string
*      Usage:  cresult=Dec2BasX(nParm1, nParm2)
*              STORE Dec2BasX(255, 16) TO cMyString  &&... cMyString contains 'ff'
*
* Taken from: http://www.tek-tips.com/faqs.cfm?fid=4461
*..............................................................................
FUNCTION dec2basx
PARAMETERS nTempNum, nNewBase

STORE 0 TO nWorkVal,;
   remainder,;
   dividend,;
   nextnum,;
   digit

nWorkVal = nTempNum  
ret_str = ''

DO WHILE .T.
   digit = MOD(nWorkVal, nNewBase)
   dividend = nWorkVal / nNewBase
   nWorkVal = INT(dividend)

   DO CASE
      CASE digit = 10
         ret_str = 'a' + ret_str
      CASE digit = 11
         ret_str = 'b' + ret_str
      CASE digit = 12
         ret_str = 'c' + ret_str
      CASE digit = 13
         ret_str = 'd' + ret_str
      CASE digit = 14
         ret_str = 'e' + ret_str
      CASE digit = 15
         ret_str = 'f' + ret_str
      OTHERWISE
         ret_str = LTRIM(STR(digit)) + ret_str
   ENDCASE

   IF nWorkVal = 0
      EXIT
   ENDIF ( nWorkVal = 0 )
ENDDO ( .T. )
RETURN ret_str
*: eof dec2basx


FUNCTION Str2Hex
PARAMETERS cStr
LOCAL i, nLen, cByte, cHex

*	Converts the contents of the submitted string to Hexidecimal format.

	nLen = LEN(cStr)
	cHex = ""
	FOR i = nLen TO 1 STEP -1		&& Least significant digits are first.
		cByte = SUBSTR(cStr, i, 1)
		cHex = cHex + PADL(UPPER(Dec2BasX(ASC(cByte), 16)), 2, "0") + " "
	ENDFOR 
	RETURN cHex
ENDFUNC


And here is my routine that dumps the contents of the memo (FPT) file:

Code:
*	READ_MEMO.PRG - This procedure attempts to read the segments of a VFP7 Memo file.  It assumes 
*			the structure published in the following MSDN Article:

*		http://msdn.microsoft.com/en-us/library/8599s21w(v=VS.71).aspx

CLEAR 

* Locate the memo file.
cFilePath = GETFILE("FPT", "Memo File:", "Open", 0)
IF EMPTY(cFilePath)
	WAIT "No Memo file was selected." WINDOW NOWAIT
	RETURN
ENDIF

* Create a log file.
cLogPath = JUSTSTEM(cFilePath) + [_FPT.LOG]
SET TEXTMERGE TO (cLogPath)
SET TEXTMERGE ON 

* Open the file.
\Opening memo file: <<cFilePath>>.
nHandle = FOPEN(cFilePath)
IF nHandle < 0
	\Can't open Memo file.
	SET TEXTMERGE OFF
	SET TEXTMERGE TO 
	RETURN
ENDIF
\File Handle:  <<nHandle>>

* Check the file size.
nSize = FSEEK(nHandle, 0, 2)
\Memo File Size:  <<ALLTRIM(STR(nSize))>>  <<TRANSFORM(nSize, "@0")>>
IF nSize <= 0
	\"Memo file is empty."
	=FCLOSE(nHandle)
	SET TEXTMERGE OFF
	SET TEXTMERGE TO 
	RETURN
ENDIF
=FSEEK(nHandle, 0, 0)

* Read the header.
\
\
\***** Header Details *********
\Byte offset 0 - 3: Location of the next free block:  

cStr = FREAD(nHandle, 4)
cHex = Str2Hex(cStr)
\\<<cHex>>

* Get by Unused bytes.
cStr = FREAD(nHandle, 2)
\Byte offset 4 - 5: Unused

\Byte offset 6 - 7:  Block Size (bytes per block):  
cStr = FREAD(nHandle, 2)
cHex = [0x] + STRTRAN(Str2Hex(cStr), " ")
nBlockSize = EVALUATE(cHex)
\\<<nBlockSize>>,  <<cHex>>

* Get by Unused bytes.
cStr = FREAD(nHandle, 504)
\Byte offset 8 - 511: Unused


\
\
\***** Memo Block Data ********

* Read each block.
lEOF = .F.
*DO WHILE NOT lEOF
FOR i = 1 TO 1050
	lEOF = ReadBlock(nSize, 64)	&& Submit the file size and block size.
	IF lEOF
		EXIT
	ENDIF
ENDFOR 
*ENDDO

* Close and exit.
SET TEXTMERGE OFF
SET TEXTMERGE TO 
=FCLOSE(nHandle)
MODIFY FILE (cLogPath) NOWAIT


FUNCTION ReadBlock
PARAMETERS nFileSize, nBlockSize
LOCAL nOffset, cStr, cHex, nBlockType, nLen

	nOffset = FSEEK(nHandle, 0, 1)
	\Address: <<nOffset>>, <<TRANSFORM(nOffset, "@0")>>	
	 
	* Read the first block.
	cStr = FREAD(nHandle, nBlockSize)

	* Obtain the block type.
	\\Block Type: 
	cHex = SUBSTR(cStr, 1, 4)
	cHex = [0x] + STRTRAN(Str2Hex(cHex), " ")
	nBlockType = EVALUATE(cHex)
	\\<<nBlockType>>,  <<cHex>>	

	* Check for problems.
	IF nBlockType > 1
		\Bad Block Signature!
		RETURN .T.
	ENDIF

	* Obtain the memo length.

	\\Memo Length: 
	cHex = SUBSTR(cStr, 5, 4)
	cHex = [0x] + STRTRAN(Str2Hex(cHex), " ")
	nLen = EVALUATE(cHex)
	\\<<STR(nLen, 4)>>,  <<cHex>>	

	* Check for problems.
	IF nLen > nFileSize
		\Bad Block Length!
		RETURN .T.
	ENDIF

	* Capture the memo content.
	cMemo = SUBSTR(cStr, 9)
	\\	Content: 
	IF nBlockType = 1
		\\<<cMemo>>
	ELSE
		\\<<Str2Hex(cMemo)>>
	ENDIF

	* How many blocks to hold the memo?
	nBlocks = CEILING(nLen/64)
	FOR i = 2 TO nBlocks
	
		cStr = FREAD(nHandle, nBlockSize)
		\ <<REPLICATE(CHR(9), 23)>> 
		\\<<cStr>>

	ENDFOR 

	nOffset = FSEEK(nHandle, 0, 1)
	IF nOffset < nFileSize
		RETURN .F.
	ELSE 
		RETURN .T.
	ENDIF 
ENDFUNC 


*..............................................................................
*   Function: DEC2BASX
*    Purpose:  Convert whole number 0-?, to base 2-16 
*
* Parameters: nTempNum - number to convert (0-9007199254740992)
*             base    - base to convert to i.e., 2 4 8 16...
*    returns: string
*      Usage:  cresult=Dec2BasX(nParm1, nParm2)
*              STORE Dec2BasX(255, 16) TO cMyString  &&... cMyString contains 'ff'
*
* Taken from: http://www.tek-tips.com/faqs.cfm?fid=4461
*..............................................................................
FUNCTION dec2basx
PARAMETERS nTempNum, nNewBase

STORE 0 TO nWorkVal,;
   remainder,;
   dividend,;
   nextnum,;
   digit

nWorkVal = nTempNum  
ret_str = ''

DO WHILE .T.
   digit = MOD(nWorkVal, nNewBase)
   dividend = nWorkVal / nNewBase
   nWorkVal = INT(dividend)

   DO CASE
      CASE digit = 10
         ret_str = 'a' + ret_str
      CASE digit = 11
         ret_str = 'b' + ret_str
      CASE digit = 12
         ret_str = 'c' + ret_str
      CASE digit = 13
         ret_str = 'd' + ret_str
      CASE digit = 14
         ret_str = 'e' + ret_str
      CASE digit = 15
         ret_str = 'f' + ret_str
      OTHERWISE
         ret_str = LTRIM(STR(digit)) + ret_str
   ENDCASE

   IF nWorkVal = 0
      EXIT
   ENDIF ( nWorkVal = 0 )
ENDDO ( .T. )
RETURN ret_str
*: eof dec2basx


FUNCTION Str2Hex
PARAMETERS cStr
LOCAL i, nLen, cByte, cHex

*	Converts the contents of the submitted string to Hexidecimal format.

	nLen = LEN(cStr)
	cHex = ""
	FOR i = 1 TO nLen
		cByte = SUBSTR(cStr, i, 1)
		cHex = cHex + UPPER(Dec2BasX(ASC(cByte), 16)) + " "
	ENDFOR 
	RETURN cHex
ENDFUNC

Happy snooping.
Register to rate this FAQ  : BAD 1 2 3 4 5 6 7 8 9 10 GOOD
Please Note: 1 is Bad, 10 is Good :-)

Part and Inventory Search

Back
Top