HOW TO CHOP AND GLUE BACK A BINARY FILE

Status
Not open for further replies.

TrueCode

MIS
Sep 30, 2003
71
LC
I am the guy still trying to achieve a file transfer using the Winsock control. Somebody pointed me to a solution where I convert the binary file to a base64 text file, send this text file in small chunks, accumulate the text on the other end, and decode the base64 text file back to binary. The idea was fine. I tried it with a 26 KB photo and that worked like a breeze. But when I attempted a 393 KB file, it took forever just to encode the file to base64 text. The code follows.

I first run the following code with a name as its parameter. I can subsequently refer to that name as an object.

If I had passed [oBase] as the parameter, I would encode a file as

obase.encode(cFileToCode,cCodedFile)

Is there a way to chop up a file in VFP (plus API calls, maybe) and bring the pieces back together that is faster than this base64 approach?

Using Dave's FTPPUT/GET suggestion works with a target computer hosting an FTP server.
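
Editor's sketch, in Python rather than VFP, of one way the encoding step could avoid building one ever-growing string: read the file 57 bytes at a time (57 bytes encode to exactly 76 characters) and write each encoded line straight to the output file. The function names and the 57-byte line size are illustrative assumptions, not part of the class posted below.

```python
import base64

def encode_file_streaming(in_path, out_path, line_bytes=57):
    """Base64-encode in_path to out_path one line at a time.

    line_bytes must be a multiple of 3 so that only the final line
    can need "=" padding; 57 input bytes give 76 output characters.
    """
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        while True:
            chunk = src.read(line_bytes)
            if not chunk:
                break
            # Each line is written immediately instead of being appended
            # to a large in-memory string.
            dst.write(base64.b64encode(chunk) + b"\r\n")

def decode_file(in_path, out_path):
    """Reverse encode_file_streaming: decode each line independently."""
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        for line in src:
            dst.write(base64.b64decode(line.strip()))
```

The same idea should carry over to VFP by calling FWRITE() inside the loop instead of concatenating to lcBase64 and lcOutFile, although that has not been timed here.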









Code:
**************************************************************************
*
* Class: Base64
*
* Purpose: Encode/Decodes a file using the Base64 encoding standard
*
* Author: Jeff Bowman <jbowman@jeffbowman.com>
*
* Credits: -Ankit Fadia <ankit@bol.net.in>
* ---------------------------------
* Provided explanation of the Base64 encoding standard:
* *
*
* -Albert Ballinger <albert_j_ballinger@dresser-rand.com>
* -------------------------------------------------------------
* Assisted with code execution speed

* The first few lines before the DEFINE CLASS Code was modified from the original to include the mSub codes.
*
**************************************************************************
LPARAMETERS cSessionName

#DEFINE vfpCr CHR(13)
#DEFINE vfpLf CHR(10)
#DEFINE vfpCrLf vfpCr + vfpLf


mSub1 = [PUBLIC ]+ALLTRIM(cSessionName)+[ AS Session ]
&mSub1

mSub2 = ALLTRIM(cSessionName) + [= CREATEOBJECT("Base64")]
&mSub2

*!* *************************************************

DEFINE CLASS Base64 AS Session
DIMENSION aBase64[64]

*********************************************************************************
PROCEDURE Encode(tcInFile, tcOutFile)
*********************************************************************************
*
* Method: Encode()
*
* Purpose: Converts the specified file to Base64 encoding
*
* Remarks:
*
*******************************************************************************
* Character variables
LOCAL ;
lcOutFile, ;
lcBase64, ;
lcInFile, ;
lcLine, ;
lcStr, ;
lcBin

* Integer variables
LOCAL ;
liDecimal, ;
liHandle, ;
i, ;
j

* Only proceed if the specified file exists
IF FILE(tcInFile)
* Create the output file, overwriting it if it exists
liHandle = FCREATE(tcOutFile)

* Only proceed if we could create the output file
IF liHandle > 0
* Set some starting values
lcInFile = FILETOSTR(tcInFile)
lcBase64 = ""
lcOutFile = ""

* Get the characters from the input
* file, in groups of three each
FOR i = 1 TO LEN(lcInFile) STEP 3
lcStr = SUBSTR(lcInFile, i, 3)

* Encode the binary code to Base64 and
* add it to the existing output string
lcBase64 = lcBase64 + This.Split4(lcStr)
ENDFOR

* Build the encoded output string,
* in lines of 76 characters each
FOR i = 1 TO LEN(lcBase64) STEP 76
lcOutFile = lcOutFile + SUBSTR(lcBase64, i, 76) + vfpCrLf
ENDFOR

* Write & close the output file
FWRITE(liHandle, lcOutFile)
FCLOSE(liHandle)
ENDIF
ENDIF
ENDPROC



*********************************************************************************
PROCEDURE Decode(tcInFile, tcOutFile)
*********************************************************************************
*
* Method: Decode()
*
* Purpose: Converts the specified file from Base64 encoding
*
* Remarks: Input file must contain no extra characters
* other than Base64 encoded characters, and each
* line must be exactly 76 characters long
*
*******************************************************************************
* Character variables
LOCAL ;
lcOutFile, ;
lcInFile, ;
lcString, ;
lcBinary, ;
lcChr, ;
lcBin

* Integer variables
LOCAL ;
liHandle, ;
liDec, ;
i, ;
j

* Only proceed if the specified file exists
IF FILE(tcInFile)
* Create the output file, overwriting it if it exists
liHandle = FCREATE(tcOutFile)

* Only proceed if we could create the output file
IF liHandle > 0
* Set some starting values
lcInFile = FILETOSTR(tcInFile)
lcOutFile = ""

* Strip out linefeeds, so we have
* one continuous string to process
lcInFile = CHRTRAN(lcInFile, vfpCrLf, "")

* Get the characters from the input
* file, in groups of four each
FOR i = 1 TO LEN(lcInFile) STEP 4
lcString = SUBSTR(lcInFile, i, 4)
WITH This
lcString = ;
CHR(ASCAN(.aBase64, SUBSTR(lcString, 1, 1)) - 1) + ;
CHR(ASCAN(.aBase64, SUBSTR(lcString, 2, 1)) - 1) + ;
CHR(ASCAN(.aBase64, SUBSTR(lcString, 3, 1)) - 1) + ;
CHR(ASCAN(.aBase64, SUBSTR(lcString, 4, 1)) - 1)

* Decode the binary code from Base64
* and build the existing output string
lcOutFile = lcOutFile + .Split3(lcString)
ENDWITH
ENDFOR

* Write & close the output file
FWRITE(liHandle, lcOutFile)
FCLOSE(liHandle)
ENDIF
ENDIF
ENDPROC



*********************************************************************************
HIDDEN PROCEDURE Split3(tcBinary)
*********************************************************************************
*
* Method: Split3()
*
* Purpose: Splits a 24-bit binary string into 3 8-bit
* strings, converts them into decimal values and
* returns them as concatenated ASCII characters
*
* Remarks:
*
*******************************************************************************
WITH This
LOCAL ;
liOutByte1, ;
liOutByte2, ;
liOutByte3, ;
liInByte1, ;
liInByte2, ;
liInByte3, ;
liInByte4

liInByte1 = ASC(SUBSTR(tcBinary, 1, 1))
liInByte2 = ASC(SUBSTR(tcBinary, 2, 1))
liInByte3 = ASC(SUBSTR(tcBinary, 3, 1))
liInByte4 = ASC(SUBSTR(tcBinary, 4, 1))

liOutByte1 = BITLSHIFT(BITAND(liInByte1, 0x3f), 2) + BITRSHIFT(liInByte2, 4)
liOutByte2 = BITLSHIFT(BITAND(liInByte2, 0x0f), 4) + BITRSHIFT(liInByte3, 2)
liOutByte3 = BITLSHIFT(BITAND(liInByte3, 0x03), 6) + BITAND(liInByte4, 0x3f)

RETURN ;
CHR(liOutByte1) + ;
CHR(liOutByte2) + ;
CHR(liOutByte3)
ENDWITH
ENDPROC



*********************************************************************************
HIDDEN PROCEDURE Split4(tcBinary)
*********************************************************************************
*
* Method: Split4()
*
* Purpose: Splits a 24-bit binary string into 4 6-bit
* strings, converts them into decimal values and
* returns them as concatenated Base64 characters
*
* Remarks:
*
*******************************************************************************
LOCAL ;
liOutByte1, ;
liOutByte2, ;
liOutByte3, ;
liOutByte4, ;
liInByte1, ;
liInByte2, ;
liInByte3

liInByte1 = ASC(SUBSTR(tcBinary, 1, 1))
liInByte2 = ASC(SUBSTR(tcBinary, 2, 1))
liInByte3 = ASC(SUBSTR(tcBinary, 3, 1))

liOutByte1 = BITRSHIFT(liInByte1, 2)
liOutByte2 = BITLSHIFT(BITAND(liInByte1, 0x03), 4) + BITRSHIFT(BITAND(liInByte2, 0xf0), 4)
liOutByte3 = BITLSHIFT(BITAND(liInByte2, 0x0f), 2) + BITRSHIFT(BITAND(liInByte3, 0xc0), 6)
liOutByte4 = BITAND(liInByte3, 0x3f)

WITH This
RETURN ;
.aBase64[liOutByte1 + 1] + ;
.aBase64[liOutByte2 + 1] + ;
.aBase64[liOutByte3 + 1] + ;
.aBase64[liOutByte4 + 1]
ENDWITH
ENDPROC



*********************************************************************************
HIDDEN PROCEDURE Init()
*********************************************************************************
*
* Method: Init()
*
* Purpose:
*
* Remarks: Build the encode/decode array, based
* on the following Base64 lookup table
*
* -------------------------------------------
* | 0 = A | 16 = Q | 32 = g | 48 = w |
* | 1 = B | 17 = R | 33 = h | 49 = x |
* | 2 = C | 18 = S | 34 = i | 50 = y |
* | 3 = D | 19 = T | 35 = j | 51 = z |
* | 4 = E | 20 = U | 36 = k | 52 = 0 |
* | 5 = F | 21 = V | 37 = l | 53 = 1 |
* | 6 = G | 22 = W | 38 = m | 54 = 2 |
* | 7 = H | 23 = X | 39 = n | 55 = 3 |
* | 8 = I | 24 = Y | 40 = o | 56 = 4 |
* | 9 = J | 25 = Z | 41 = p | 57 = 5 |
* | 10 = K | 26 = a | 42 = q | 58 = 6 |
* | 11 = L | 27 = b | 43 = r | 59 = 7 |
* | 12 = M | 28 = c | 44 = s | 60 = 8 |
* | 13 = N | 29 = d | 45 = t | 61 = 9 |
* | 14 = O | 30 = e | 46 = u | 62 = + |
* | 15 = P | 31 = f | 47 = v | 63 = / |
* -------------------------------------------
*
*******************************************************************************
LOCAL i

FOR i = 0 TO 63
DO CASE
CASE BETWEEN(i, 0, 25)
This.aBase64[i + 1] = CHR(i + 65)

CASE BETWEEN(i, 26, 51)
This.aBase64[i + 1] = CHR(i + 71)

CASE BETWEEN(i, 52, 61)
This.aBase64[i + 1] = CHR(i - 4)

CASE i = 62
This.aBase64[i + 1] = "+"

CASE i = 63
This.aBase64[i + 1] = "/"

ENDCASE
ENDFOR
ENDPROC
ENDDEFINE

------------------------------------->

"I have sought your assistance on this matter because I have exhausted all the help that I can find. You are free to direct me to other source of help"
 
I used the following code based on FWRITE/FREAD; however, when it comes to certain files it does not work.

It worked with .jpg, .doc and .xls files, but when I tried it on a VFP .dbf, VFP tells me the rejoined file is not a table.

So I am still stuck.


Code:
cInfile = GETFILE([*],[Select a file Cut up and Rejoin],[Open],0,[Cutting up and Rejoining files in VFP])

cOutfile = PUTFILE([Select the resulting REJOINED file],[JoinedFile])
IF EMPTY(cInfile) OR EMPTY(cOutfile)
WAIT WINDOW [You must choose both a source and target file]
RETURN
ENDIF


nhandle = FOPEN(cInfile)
nFilesize = FSEEK(nHandle,0,2)
=FCLOSE(nHandle)
nhandle = FOPEN(cInfile)

nHandle2 = FCREATE(cOutfile)
IF nHandle = -1 OR nHandle2 = -1
WAIT WINDOW [both source and destination files must be accessible]
=FCLOSE(nHandle)
=FCLOSE(nHandle2)
RETURN
ENDIF

nRemain = nFilesize
DO WHILE !FEOF(nHandle)
nChunk = MIN(100,nRemain)
cTake = FREAD(nHandle,nChunk)

=FWRITE(nHandle2,ctake,nChunk)

nRemain = nRemain - nChunk

IF nRemain = 0
EXIT
ENDIF
ENDDO
=FCLOSE(nHandle)
=FCLOSE(nHandle2)

WAIT WINDOW [PROCESS DONE]

 
It might be the way you are reassembling the file.

I use a very similar routine to yours to break files up, apart from adding an additional segment (containing the number of segments used) and then have an application to reassemble them. But, you can use a simple DOS level copy command to do it - as long as you remember the /B parameter.
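
Editor's sketch, in Python rather than VFP, of the scheme described above: the file is written out as numbered segments plus one extra file holding the segment count, and a second routine glues the segments back together. The file names (seg000000.bin, count.txt) and the 1024-byte segment size are assumptions made for illustration. Everything is opened in binary mode, which is the same concern the /B switch addresses for the DOS copy command: in its text mode, copy treats a Ctrl-Z byte as end of file when concatenating.

```python
import os

def split_file(path, out_dir, chunk_size=1024):
    """Write path as numbered segment files plus a count file; return the count."""
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    with open(path, "rb") as src:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            with open(os.path.join(out_dir, f"seg{count:06d}.bin"), "wb") as seg:
                seg.write(chunk)
            count += 1
    # The extra "segment": how many pieces the receiver should expect.
    with open(os.path.join(out_dir, "count.txt"), "w") as f:
        f.write(str(count))
    return count

def join_file(out_dir, dest):
    """Reassemble the segments written by split_file, in order."""
    with open(os.path.join(out_dir, "count.txt")) as f:
        count = int(f.read())
    with open(dest, "wb") as dst:
        for i in range(count):
            with open(os.path.join(out_dir, f"seg{i:06d}.bin"), "rb") as seg:
                dst.write(seg.read())
```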



Regards

Griff
Keep [Smile]ing
 
GriffMG

I want to cut up the file in small pieces to send through a Winsock control and to assemble the file on the other end. As I said, my example works well with .jpg, .xls and .doc files (those I have tested so far), but not with .dbf. VFP tells me the assembled .dbf is not a table.
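
Editor's sketch, in Python rather than VFP: one way to narrow down a "not a table" failure is to compare the original and the rejoined file byte for byte and report where they first differ. The function name is hypothetical. A difference near offset 0 would point at the .dbf header, while a difference at the very end would point at a short or extra final chunk.

```python
def first_difference(path_a, path_b, block=65536):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            x = a.read(block)
            y = b.read(block)
            if x != y:
                for i, (p, q) in enumerate(zip(x, y)):
                    if p != q:
                        return offset + i
                # Common prefix matches; one file is simply shorter.
                return offset + min(len(x), len(y))
            if not x:
                return None  # both files ended with no difference
            offset += len(x)
```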




 
I've done a little testing and had some good results with the following method, which uses low-level I/O. Some assumptions in the sample: the file is already open read/write on the client side with FOPEN(), the server reads the data with FREAD() and sends it with SendData, and the client uses FWRITE() to write out the data. In your initialization section on both the client and server, add this declaration:
Code:
DECLARE "RtlMoveMemory" IN "KERNEL32" as CopyMemory ;
          string ,;
          string ,;
          long

On the server side, wherever you get the request for a file:
Code:
STORE 1024 TO nChunk
STORE ''   TO cTmpText 
STORE 0    TO nHandle 
nHandle = Fopen('somefile.ext')

IF nHandle > 0
   DO WHILE !Feof(nHandle)
      cTmpText = ''
      cTmpText = Fread(nHandle, nChunk)  
      STORE Replicate(Chr(0), Len(cTmpText )) TO bt
      CopyMemory(@bt, cTmpText , Len(cTmpText ))
      THIS.OBJECT.SENDDATA (bt)
      *... add 'ACK' or pause functionality here 
   ENDDO 
ENDIF 
Fclose(nHandle)   
cTmpText = ''

In the DataArrival event of the client, add something like this:
Code:
*** ActiveX Control Event ***
LPARAMETERS tnBytesTotal
LOCAL lcBuffer

IF not_end_of_trans
   m.lcBuffer = REPLICATE (" ", m.tnBytesTotal)
   THIS.OBJECT.GETDATA (@m.lcBuffer)
   STORE Space(tnBytesTotal) TO bt
   CopyMemory(@bt, @m.lcBuffer, tnBytesTotal)
   Fseek(nHandle, 0, 2)  &&... go to FEOF()
   Fwrite(nHandle, bt)
   Fwrite(nHandle, bt)
ELSE
   Fclose(nHandle)
ENDIF
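
Editor's sketch, in Python rather than VFP, of the same send-in-chunks idea over a plain TCP socket. The sample above leaves the end-of-transfer test (not_end_of_trans) open; here it is filled in with one possible convention, an 8-byte length header sent before the data, which is an assumption and not something from the thread. The function names are hypothetical as well.

```python
import socket
import struct

def send_bytes(sock, data, chunk_size=1024):
    """Send an 8-byte big-endian length header, then the payload in chunks."""
    sock.sendall(struct.pack("!Q", len(data)))
    for i in range(0, len(data), chunk_size):
        sock.sendall(data[i:i + chunk_size])

def recv_exact(sock, n):
    """Read exactly n bytes; TCP may deliver them in pieces of any size."""
    parts = []
    while n > 0:
        piece = sock.recv(min(n, 4096))
        if not piece:
            raise ConnectionError("socket closed before all data arrived")
        parts.append(piece)
        n -= len(piece)
    return b"".join(parts)

def recv_bytes(sock):
    """Receive one payload sent by send_bytes."""
    (size,) = struct.unpack("!Q", recv_exact(sock, 8))
    return recv_exact(sock, size)
```

Because the receiver knows the total size up front, it does not matter how the network regroups the chunks: a single arrival event may carry part of a chunk or several chunks at once.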



-Dave S.-
[cheers]
Even more Fox stuff at:
 
Dave

You deserve a star for this. I got it to work with the Splitting and Joining program. I will now try sending it over the socket... I expect the same level of success.

Here is the resulting CHOPANDGLUEFILES.PRG


Code:
**** CHOPANDGLUEFILES.PRG ***************
**
** THE FOLLOWING IS THE RESULT OF A COLLABORATION BETWEEN DSummZZZ and TrueCode
** The Program allows you to cut up any file and reassemble them in VFP
*********************************************************


DECLARE "RtlMoveMemory" IN "KERNEL32" as CopyMemory ;
string ,;
string ,;
long

cInfile = GETFILE([*],[Select a file Cut up and Rejoin],[Open],0,[Cutting up and Rejoining files in VFP])

cOutfile = PUTFILE([Select the resulting REJOINED file],JUSTFNAME(cInfile))
IF EMPTY(cInfile) OR EMPTY(cOutfile)
WAIT WINDOW [You must choose both a source and target file]
RETURN
ENDIF


nhandle = FOPEN(cInfile)
nFilesize = FSEEK(nHandle,0,2)
=FCLOSE(nHandle)
nhandle = FOPEN(cInfile)

nHandle2 = FCREATE(cOutfile)
IF nHandle = -1 OR nHandle2 = -1
WAIT WINDOW [both source and destination files must be accessible]
=FCLOSE(nHandle)
=FCLOSE(nHandle2)
RETURN
ENDIF

nRemain = nFilesize
DO WHILE !FEOF(nHandle)
nChunk = MIN(1024,nRemain)
cTmpText = FREAD(nHandle,nChunk)
STORE Replicate(Chr(0), Len(cTmpText)) TO bt
CopyMemory(@bt, cTmpText , Len(cTmpText ))

=FSEEK(nHandle2,0,2)

=FWRITE(nHandle2,bt)

nRemain = nRemain - nChunk

IF nRemain = 0
EXIT
ENDIF
ENDDO
=FCLOSE(nHandle)
=FCLOSE(nHandle2)

WAIT WINDOW [PROCESS DONE]

 
Nice converter, but it's missing one key element.

When the last chunk isn't a 24 bit value, you should pad the output with ='s.

Below, I've shown what gets returned for three variations on the same text.

Darrell

[tt]
Convert this file to base64.
Returns: Q29udmVydCB0aGlzIGZpbGUgdG8gYmFzZTY0LgAA
Should be: Q29udmVydCB0aGlzIGZpbGUgdG8gYmFzZTY0Lg==

Convert this file to base64
Returns: Q29udmVydCB0aGlzIGZpbGUgdG8gYmFzZTY0
(exactly creates 24bit segments)

Convert this file to base
Returns: Q29udmVydCB0aGlzIGZpbGUgdG8gYmFzZQAA
Should be: Q29udmVydCB0aGlzIGZpbGUgdG8gYmFzZQ==
[/tt]
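
Editor's sketch, in Python rather than VFP, of the padding rule illustrated above. It uses the same bit arithmetic as the posted Split4() method, then overwrites the output positions that carry no input data with "=". The name encode_padded is hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_padded(data: bytes) -> str:
    """Base64-encode data, padding a short final group with '='."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        missing = 3 - len(group)          # 0, 1 or 2 bytes short
        b1, b2, b3 = group + b"\x00" * missing
        quad = [
            ALPHABET[b1 >> 2],
            ALPHABET[((b1 & 0x03) << 4) | (b2 >> 4)],
            ALPHABET[((b2 & 0x0F) << 2) | (b3 >> 6)],
            ALPHABET[b3 & 0x3F],
        ]
        if missing:
            # Positions that hold only zero-fill become '='.
            quad[4 - missing:] = "=" * missing
        out.append("".join(quad))
    return "".join(out)
```

With this rule, "Convert this file to base64." encodes to the "Should be" string shown above instead of ending in "AA".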
 
Oh ya.

Here's the relevant text from RFC1341.
Darrell

[tt]
The output stream (encoded bytes) must be represented in lines of no more than 76 characters each. All line breaks or other characters not found in Table 1 must be ignored by decoding software. In base64 data, characters other than those in Table 1, line breaks, and other white space probably indicate a transmission error, about which a warning message or even a message rejection might be appropriate under some circumstances.

Special processing is performed if fewer than 24 bits are available at the end of the data being encoded.
A full encoding quantum is always completed at the end of a body. When fewer than 24 input bits are available in an input group, zero bits are added (on the right) to form an integral number of 6-bit groups. Output character positions which are not required to represent actual input data are set to the character "=". Since all base64 input is an integral number of octets, only the following cases can arise: (1) the final quantum of encoding input is an integral multiple of 24 bits; here, the final unit of encoded output will be an integral multiple of 4 characters with no "=" padding, (2) the final quantum of encoding input is exactly 8 bits; here, the final unit of encoded output will be two characters followed by two "=" padding characters, or (3) the final quantum of encoding input is exactly 16 bits; here, the final unit of encoded output will be three characters followed by one "=" padding character.

Care must be taken to use the proper octets for line breaks if base64 encoding is applied directly to text material that has not been converted to canonical form. In particular, text line breaks should be converted into CRLF sequences prior to base64 encoding. The important thing to note is that this may be done directly by the encoder rather than in a prior canonicalization step in some implementations.

NOTE: There is no need to worry about quoting apparent encapsulation boundaries within base64-encoded parts of multipart entities because no hyphen characters are used in the base64 encoding.
[/tt]
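
Editor's sketch, in Python rather than VFP, of two points from the quoted text: output lines of at most 76 characters, and a decoder that ignores line breaks and other characters outside the base64 table. It relies on the standard library's base64.b64decode, which discards non-alphabet characters when validate is False (the default). The function names are hypothetical.

```python
import base64

def encode_lines(data: bytes, width: int = 76) -> str:
    """Base64-encode data and break the result into CRLF-terminated lines."""
    text = base64.b64encode(data).decode("ascii")
    return "".join(text[i:i + width] + "\r\n" for i in range(0, len(text), width))

def decode_tolerant(text: str) -> bytes:
    """Decode base64 text, skipping line breaks and other stray characters."""
    # validate=False makes b64decode drop anything outside the alphabet
    # before it checks the padding.
    return base64.b64decode(text, validate=False)
```

This is looser than the Decode() method posted earlier, whose remarks require that the input contain nothing but base64 characters.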
 
Why is it necessary to encode the file and then decode it again? I'm not sure I'm following this, or perhaps I missed something important in the discussion? Encoding is always going to be slow when compared to just sending the bytes of the file across the wire no matter what algorithm you use... I keep rereading this thread thinking that I must be missing something. Can someone help me understand?

craig1442@mchsi.com
"Whom computers would destroy, they must first drive mad." - Anon
 
Good question,

I personally don't know at which points it can happen, but different computers use different encoding schemes for text. IBM AS/400s use EBCDIC instead of ASCII, so 'A'(ebcdic)<>'A'(ascii). Unix ends lines with a bare LF (no CR), while PCs use CR+LF.

At different points along the way, data gets auto-converted among these formats. Since I can't predict reliably when it will or will not happen, base64 encoding converts all the data to only those characters which are known to not get converted.

When the connection is between two Windows computers both running VFP, the conversion is probably not necessary, and it bloats the size of the data by about a third (every 3 bytes become 4 characters, plus line breaks).
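
Editor's sketch, in Python rather than VFP: the size cost of base64 can be worked out directly, since every 3 input bytes become 4 output characters and each 76-character line adds a CR+LF. The function name is hypothetical, and treating the 393 KB file from the first post as exactly 393 * 1024 bytes is an assumption.

```python
import math

def base64_size(n_bytes: int, line_width: int = 76, eol_len: int = 2) -> int:
    """Size in bytes of n_bytes after base64 encoding with line breaks."""
    chars = 4 * math.ceil(n_bytes / 3)       # 3 bytes in, 4 characters out
    lines = math.ceil(chars / line_width)    # one line ending per output line
    return chars + lines * eol_len

original = 393 * 1024                         # 402432 bytes
encoded = base64_size(original)               # 550698 bytes
print(encoded, round(encoded / original, 3))
```

Under those assumptions the encoded file is a little under 37% larger than the original.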
 
Darrell,

What's the point of using rtlmovememory instead of an assignment?

ie:
Code:
IF nHandle > 0
   DO WHILE !Feof(nHandle)
      cTmpText = ''
      cTmpText = Fread(nHandle, nChunk)  


*** How is this:
      STORE Replicate(Chr(0), Len(cTmpText )) TO bt
      CopyMemory(@bt, cTmpText , Len(cTmpText ))
      THIS.OBJECT.SENDDATA (bt)

*** Better than this:
      THIS.OBJECT.SENDDATA (cTmpText)

*** ?

      *... add 'ACK' or pause functionality here 
   ENDDO 
ENDIF
 
Craig:

I'm pretty sure the purpose of this was to send MIME 1.0
compliant emails that will get through all mail servers.

I just happened to be writing a Base64 encoder as part of
an automatic email class that must run on a server
without a UI. At the time I was working on it, I found this
thread by performing a search and was just complimenting
the poster on the job done. (Sorry for any confusion.)

I wrote two: one in VFP and the other in C++.

The one in VFP, although functional, is too slow for files larger than 3-5k.

Obviously, I'll be using the one in C++ for speed purposes.

Darrell
 
wgcs:

I didn't write the code you are referring to - DSummZZZ did.

I haven't tested what DSummZZZ wrote, but they may have
found in testing that it is faster than using VFP's
native assignment statement. I'm unsure.

I can't actually see how it would be faster, since VFP
would be using something similar internally anyway and
calling out to an API would add further time delays.
(Maybe. I'll perform some timing tests.)

Darrell
 
Sorry, Darrell, I should have made sure to be responding to the correct author!

You came to the same conclusion I did... though I can't see how an additional REPL(), STORE AND rtlMoveMemory call could together be faster than no assignment at all, since it would seem to me that the variable gotten from FREAD is good enough to send right out with SendData.
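
Editor's sketch, in Python rather than VFP, of the point being made here: copying a string into a second, pre-filled buffer with a memory-move call yields exactly the same bytes as the original, so the copy adds work without changing what gets sent. ctypes.memmove stands in for the RtlMoveMemory call in the thread, and copy_via_memmove is a hypothetical name. The sketch says nothing about why the copy appeared to help in VFP.

```python
import ctypes

def copy_via_memmove(data: bytes) -> bytes:
    """Copy data into a zero-filled buffer of the same length and return it."""
    buf = ctypes.create_string_buffer(len(data))   # like Replicate(Chr(0), n)
    ctypes.memmove(buf, data, len(data))           # like CopyMemory(@bt, cTmpText, n)
    return buf.raw

chunk = bytes(range(256)) * 4
assert copy_via_memmove(chunk) == chunk            # the copy is byte-identical
```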
 