Base64 problem in strings

Status
Not open for further replies.

Hexes

Programmer
Oct 21, 2010
21
0
0
Good day all. I have a testing problem of sorts. I have a server that receives mails with attachments, converts them to TIFF and faxes them out. But some mails come in corrupt, and the system gets the attachment mixed up with the body of the mail. The result is a 75-page fax full of garbage, and it is wasting a lot of money.
What I need is a way to detect base64 code and remove it from the string before it ends up in the fax. But how do I check for it?

Using:
base64_decode( $body );
doesn’t work: too many false positives.


Going the long way doesn’t work because base64 code slips past.

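<?php
// Decode $body as base64. In strict mode, return false if the input
// contains any character outside the base64 alphabet.
function base64_decode_fix( $body, $strict = false )
{
    if( $strict && preg_match( '![^a-zA-Z0-9/+=]!', $body ) )
        return false;

    return base64_decode( $body );
}
?>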

What makes this hard is that some mails have huge legitimate text bodies, so looking for suspiciously long strings in the body doesn’t work well: too many false positives.

Any ideas how to work around this and get rid of the base64 in the $body string?
 
It would be interesting to see the entire message of one of the corrupt mails. As I recall, emails are well defined in terms of headers and boundaries. For example, an email with, say, 3 Word docs attached has three sections, each with a known start string (usually a long random string with two hyphens at the start, I think), followed by the base64, then the start string again so the email software can see where the base64 ends. This would be repeated another 2 times (once for each attachment).
Unless you can get a better handle on what and where the corruption is, you won’t be able to detect just the base64 (it is only characters at the end of the day). If you understand the issue a bit more, you might get enough of a pattern to strip it out.
Do you know why they arrive corrupt? Can you not get the sender to do something?
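If the boundary string does survive in the corrupt mails, something along these lines might work. This is only a rough sketch: drop_base64_parts() is a made-up name, it assumes you can already pull the boundary out of the Content-Type header, and it assumes every part has its own header block followed by a blank line. It keeps the text of the parts that do not declare base64 and throws the rest away.

```php
<?php
// Sketch only: the function name and passing the boundary in are assumptions.
// Assumes each MIME part has a header block followed by a blank line.
function drop_base64_parts(string $raw, string $boundary): string
{
    // A part starts at a line "--<boundary>"; the message ends at "--<boundary>--".
    $delimiter = '/^--' . preg_quote($boundary, '/') . '(?:--)?[ \t]*\r?$/m';
    $parts = preg_split($delimiter, $raw);

    $kept = [];
    // Element 0 is the preamble before the first boundary, which MIME readers ignore.
    foreach (array_slice($parts, 1) as $part) {
        // Headers end at the first blank line of the part.
        $split   = preg_split('/\r?\n\r?\n/', ltrim($part, "\r\n"), 2);
        $headers = $split[0];
        $content = trim($split[1] ?? '');

        // Skip parts that declare a base64 payload (the attachments).
        if (preg_match('/^Content-Transfer-Encoding:\s*base64\b/mi', $headers)) {
            continue;
        }
        if ($content !== '') {
            $kept[] = $content;
        }
    }
    return implode("\n\n", $kept);
}
```

If the corruption has eaten the boundary lines or the Content-Transfer-Encoding header, this will not help, which is why seeing a full corrupt message matters.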
 
Test for large blocks of text that use the base64 character set but contain no spaces.
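A minimal sketch of that test, working line by line. The name strip_base64_blocks() and both thresholds are my own guesses and will need tuning against real mails. The idea is that a long line with no spaces is not suspicious on its own, but several of them in a row almost certainly are, since MIME base64 is normally wrapped at 76 characters per line.

```php
<?php
// Hypothetical helper: the name and the default thresholds are assumptions.
// Removes runs of at least $minLines consecutive lines that are each at least
// $minLineLen characters long and contain only base64 characters.
// Line endings in the result are normalized to "\n".
function strip_base64_blocks(string $body, int $minLineLen = 40, int $minLines = 3): string
{
    $lines = preg_split('/\r\n|\r|\n/', $body);
    $out = [];
    $run = []; // consecutive candidate lines not yet kept or dropped

    $flush = function () use (&$out, &$run, $minLines) {
        // A short run is probably ordinary text (a long URL, say): keep it.
        if (count($run) < $minLines) {
            $out = array_merge($out, $run);
        }
        $run = [];
    };

    foreach ($lines as $line) {
        $t = rtrim($line);
        $looksEncoded = strlen($t) >= $minLineLen
            && preg_match('/^[A-Za-z0-9+\/]+={0,2}$/', $t) === 1;

        if ($looksEncoded) {
            $run[] = $line;
        } else {
            $flush();
            $out[] = $line;
        }
    }
    $flush();

    return implode("\n", $out);
}
```

Ordinary prose survives because it contains spaces, however long the mail is. One known gap: the last line of an encoded block is often shorter than the threshold, so a short leftover line may still get through.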
 