Retaining leading spaces and blank lines...

SGM · Mar 6, 2002

I have a text file that contains several email messages in it. I'm trying to write a shell script that will read this file and parse each email out to a separate text file. The individual files are begun whenever I read a unique text line. Sample code is below. The problem is that the initial file contains leading spaces on some lines as well as blank lines. These must all be maintained in the individual output files. I thought that enclosing the output line $LINE in quotes would retain the leading spaces (it does from the command line), but it did not. This is running on Linux 6.1. Any suggestions would be most appreciated!

FILE=emails.txt
FILECNT=0
while read line
do
if [ `echo $line | grep "Unique data"` ]
then
FILECNT=` expr $FILECNT + 1`
OUTFILE=email$FILECNT.txt
fi
echo "$LINE" >> $OUTFILE
done < $FILE

Thanks,
Steve

bigoldbulldog · Mar 6, 2002

Steve -

Your problem is either due to the echo command itself or more likely the read line. Anyway here is an awk solution to keep any of your spaces or tabs.

#! /bin/sh
FILE=emails.txt
awk '
/Unique data/ {
FILECNT++
OUTFILE="email" FILECNT ".txt"
LINE=$0
printf( "%s\n", LINE ) >> OUTFILE
}
' < $FILE

All the best,
ND
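
A quick way to check the `read` diagnosis above: by default `read` splits its input on the characters in IFS, which removes leading and trailing spaces and tabs before the line ever reaches `echo`, so quoting the variable afterwards cannot bring them back. The sketch below compares a plain `read` with `IFS= read -r`. The sample text and the variable names `sample`, `plain` and `kept` are made up for illustration.

```shell
#!/bin/sh
# Sample line with three leading spaces (made up for this demo).
sample='   indented line'

# Plain read: leading whitespace is removed by IFS field splitting.
plain=$(printf '%s\n' "$sample" | { read line; printf '%s' "$line"; })

# IFS= disables the trimming for this one read; -r keeps backslashes literal.
kept=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '%s' "$line"; })

# Brackets make the leading spaces (or their absence) visible.
printf '[%s]\n' "$plain"
printf '[%s]\n' "$kept"
```

The first bracketed line should come out without the indent and the second with it, which suggests the spaces are lost at the `read` step rather than at the `echo`.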

SGM · Mar 6, 2002

ND,

Thanks, but I still have a problem. My test file had 6 email messages in it, and your code from above created 6 separate files (email1.txt through email6.txt). However, the only thing contained in each file was the line containing the "unique data". Wish I knew more about "awk"!!!

Steve

CaKiwi · Mar 6, 2002

Here's my attempt.

Code:

{
  if ($0 ~ /Unique data/) {
    if (FILECNT) close(OUTFILE)
    FILECNT++
    OUTFILE="email" FILECNT ".txt"
  }
  if (FILECNT) print >> OUTFILE
}

Hope this helps. CaKiwi
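
One way to try the awk program above is to embed it in a small shell script and run it on a test file. This is only a sketch: the file name `sample_mail.txt`, its contents and the scratch directory are made up for illustration, and "Unique data" is the marker text from the original post. Because awk's `print` writes `$0` unchanged, blank lines and leading spaces should survive.

```shell
#!/bin/sh
# Work in a scratch directory so no existing email*.txt files get appended to.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Made-up input: two messages, with a blank line and an indented line.
printf '%s\n' 'Unique data 1' '' '   indented body' \
              'Unique data 2' 'second body' > sample_mail.txt

awk '
{
  if ($0 ~ /Unique data/) {          # marker line: switch to the next output file
    if (FILECNT) close(OUTFILE)      # release the previous file first
    FILECNT++
    OUTFILE = "email" FILECNT ".txt"
  }
  if (FILECNT) print >> OUTFILE      # $0 is written exactly as read
}
' sample_mail.txt

# Display the first message; the blank and indented lines should be intact.
cat email1.txt
```

Note that `>>` appends, so running the script twice in the same directory would add each message to the existing email files a second time.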

bigoldbulldog · Mar 6, 2002

I guess I don't quite understand the input and the desired output. I tried to translate your code into awk.

Is the input a single file with data or does it contain a list of data files?

Is the output one file, and if so, what should it say? Or is the output multiple files?

I'll be glad to help when I know more about the problem. You may be able to simplify my awk code to fix your original code.

Best of luck,
ND

SGM · Mar 6, 2002

Sorry for the confusion...

The input file is one big file that contains multiple mail messages strung together. Each email message contains multiple lines some of which may be blank or have leading spaces. My objective is to extract each email message from the big input file and put it into a separate output file in exactly the same format. My test file, for example, has 6 email messages in it so I am expecting 6 separate output files; one message per file. The first line of every email message contains unique data that I can use to always be the beginning of a new message. My script creates a new output file each time it sees a line containing this unique data. Then, all subsequent lines are also written to the same output file until either the end-of-file is encountered or another line containing the unique data is found.

I can make this work; I get all 6 output files and they all contain the right data lines. They just don't have any blank lines that might have been in the input file nor do any lines have leading spaces that also might have been in the input file.

Thanks again,
Steve
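
For the shell version described above, here is a minimal sketch that keeps blank lines and leading spaces, assuming the whitespace is being lost in `read` as suggested earlier in the thread. `IFS=` stops `read` from trimming the line, `-r` keeps backslashes literal, and `printf` avoids any `echo` quirks. `emails.txt` and "Unique data" come from the original script; the sample contents and the scratch directory are made up.

```shell
#!/bin/sh
# Scratch directory and made-up test input (two messages).
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
printf '%s\n' 'Unique data A' '' '  two leading spaces' \
              'Unique data B' 'last line' > emails.txt

FILE=emails.txt
FILECNT=0
OUTFILE=

while IFS= read -r LINE
do
    case $LINE in
        *"Unique data"*)
            # Marker line: start the next output file.
            FILECNT=$((FILECNT + 1))
            OUTFILE=email$FILECNT.txt
            : > "$OUTFILE"       # begin each message file empty
            ;;
    esac
    # Lines before the first marker have no output file yet, so skip them.
    if [ -n "$OUTFILE" ]; then
        printf '%s\n' "$LINE" >> "$OUTFILE"
    fi
done < "$FILE"

cat email1.txt
```

The `case` pattern replaces the `echo | grep` test so that no extra process is started per line. On an older shell without `$(( ))`, the `expr` form from the original script would do the same job.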
