.msg parsing question.

Status
Not open for further replies.

tar565

Programmer
Jan 24, 2005
39
0
0
IE
I have written a script which first removes the 'from' email address from a .msg file. I inherited this algorithm (well, it was handed to me and I was told to code it), but it turns out it must search through an MDaemon backup folder containing a large number of .msg files (100,000+). The MDaemon setup is crazy, and only about 5% of the addresses are needed.

Anyway, I use Email::Simple to remove the 'from' address.
Is there a quick way to loop through the .msg files and remove the addresses?

Note: There are large pauses in the execution at the moment. I am currently using:

# get the list of messages in the backup directory
opendir(EM_DH, $Backup_Path) or die "Cannot open directory $Backup_Path: $!";
@Emails = readdir EM_DH;
closedir(EM_DH);

# loop through all messages and remove addresses
foreach (@Emails) {
    undef $FileString;
    if ($_ =~ /msg$/i) {
        $CurrEmail = $_;
        # $Backup_Path must end with a path separator
        open(FH, $Backup_Path.$CurrEmail) or die "Cannot open file $Backup_Path$CurrEmail: $!";
        # append the file to $FileString one line at a time
        while (<FH>) {
            $FileString = $FileString.$_;
        }
        close(FH);

        # parse the info
        ($FromInfo, $Body) = ParseData($FileString);
        # ...
    }
}
 
Just a suggestion for the code you posted: this might run faster than what you have, but I am not sure how you are removing the addresses or whether it will speed that part up:

Code:
opendir(EM_DH, $Backup_Path) or die "Cannot open directory $Backup_Path: $!";
# keep only the names ending in "msg"
@Emails = grep (/msg$/i, readdir EM_DH);
closedir(EM_DH);

# loop through all messages and remove addresses
foreach (@Emails) {
   # $Backup_Path must end with a path separator
   open(FH, $Backup_Path.$_) or die "Cannot open file $Backup_Path$_: $!";
   # read the whole file in one go instead of line by line
   my $FileString = do { local $/; <FH> };
   close(FH);
   # parse the info
   ($FromInfo,$Body) = ParseData($FileString);
   # ...
}
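
Most of the gain in the snippet above probably comes from reading each file in one go rather than appending it line by line. A minimal sketch of that idiom on its own follows; the helper name slurp_file is hypothetical and not part of the original script, and it uses a lexical file handle rather than the bareword FH.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return a file's full contents.
sub slurp_file {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "Cannot open $path: $!";

    # Undefining $/ inside the do block makes <$fh> return the whole file.
    # local restores the normal line-by-line behaviour when the block exits.
    my $contents = do { local $/; <$fh> };

    close($fh);
    return $contents;
}
```

Because the change to $/ is confined to the do block, any other code that reads line by line is unaffected.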
 
I'd also recommend not slurping the directory handle if it has a ton of files.
On a side note, I also recommend placing your file handles in a local() statement to keep them unique...
Code:
local( *EM_DH, *FH, $/ );

# open the $Backup_Path directory
opendir( EM_DH, $Backup_Path ) or die "Cannot open directory $Backup_Path: $!";

# loop through all messages and remove addresses,
# reading one directory entry at a time
while ( defined( my $File = readdir(EM_DH) ) ) {
    next unless $File =~ /\.msg$/i;

    # read the MSG file ($Backup_Path must end with a path separator)
    my $CurrEmail = "$Backup_Path$File";
    open( FH, $CurrEmail ) or die "Error opening $CurrEmail: $!";

    # slurp the entire file (thanks to the local $/ above)
    my $FileString = <FH>;
    close(FH);

    # parse the info
    my( $FromInfo, $Body ) = ParseData($FileString);
    #...
}
closedir(EM_DH);
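
As an alternative to local() on bareword handles, lexical handles (my $dh, my $fh) are private to their scope and so stay unique in the same way. The sketch below combines that with reading the directory one entry at a time. The helper name each_msg_file and its callback interface are assumptions made for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): call $callback with the path and
# contents of every .msg file in $dir, reading one directory entry at a time.
# Returns the number of .msg files processed.
sub each_msg_file {
    my ( $dir, $callback ) = @_;

    opendir( my $dh, $dir ) or die "Cannot open directory $dir: $!";
    my $count = 0;

    while ( defined( my $name = readdir($dh) ) ) {
        next unless $name =~ /\.msg$/i;

        my $path = "$dir/$name";
        open( my $fh, '<', $path ) or die "Error opening $path: $!";

        # Slurp the file; $/ is only undefined inside this do block.
        my $contents = do { local $/; <$fh> };
        close($fh);

        $callback->( $path, $contents );
        $count++;
    }

    closedir($dh);
    return $count;
}
```

With the script's own names, a call might look like each_msg_file($Backup_Path, sub { my ($FromInfo, $Body) = ParseData($_[1]); }), assuming ParseData takes the file contents as in the posts above.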
 