using awk to split a file into many output files...

Status
Not open for further replies.

0492

IS-IT--Management
Mar 26, 2005
1
AU
I have a file containing posts from Usenet.

I have tried this to split the file into output files containing one post each.

-----------------------------------
#!/bin/bash
# $1= Inputfile
ARR=($(grep -n 'From ' $1 | cut -d':' -f1))

i=0
let j=i+1

cnt=${#ARR[*]}

while [[ $i -lt $cnt ]]
do

sed -n "$((${ARR[$i]})) p" $1 >block$[j]
sed -n "$((${ARR[$i]}+1)),$((${ARR[$j]}-1)) p" $1 >>block$[j]
#echo
let i=i+1
let j=i+1
if [[ $j -gt ${ARR[$j]} ]]
then
break;
fi

done

exit 0
## END ####
-----------------------------------

But the last post in my large file is not written to any output file.
There are 32 posts in the large file and the posts begin with:
-------------------------------------
From mooseshoes@gmx.net Fri Mar 11 12:36:19 2005
From: "John" <mooseshoes@gmx.net>
Newsgroups: comp.os.linux.misc
Subject: Can This Hard Disk Be Saved? (formerly "Can't Find Partition Table On Boot")
Lines: 272
---------------------------------------
I want the code to put every post into a separate file by splitting wherever 'From ' is found, but I get some files containing just a small piece of text, and the last post is missing.

Any ideas on how to do this?
 
You have an off-by-one issue: your while loop only processes the first n-1 posts, because the last post has no following 'From ' line to mark where it ends.

You need to force your while loop to go around one more time by adding the line count plus one (plus one because the loop stops one line before the next boundary)

cat $1 | wc -l

to the end of ARR, or process the last post after the while loop by duplicating the code using

cat $1 | wc -l

as the end value.
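
The first suggestion above could be sketched roughly as follows. This is only one way to do it, not the original poster's final script: split_posts and starts are names made up for the illustration, and the block1, block2, ... file names are carried over from the script in the question. The pattern is anchored with ^ on the assumption that the small fragment files come from 'From ' appearing in the middle of body lines.

```shell
#!/bin/bash
# Sketch: split a file into block1, block2, ... at each line starting with "From ".
# split_posts is a hypothetical helper name, not part of the original script.
split_posts() {
    local infile=$1
    local -a starts
    local i

    # Anchor with ^ so that "From " inside a message body is not a boundary.
    starts=($(grep -n '^From ' "$infile" | cut -d':' -f1))

    # Sentinel: one past the last line, so the final post has an end boundary.
    # grep -c '' also counts a last line that lacks a trailing newline.
    starts+=($(( $(grep -c '' "$infile") + 1 )))

    # Each post runs from its own start to the line before the next start.
    for (( i = 0; i + 1 < ${#starts[@]}; i++ )); do
        sed -n "${starts[i]},$(( starts[i + 1] - 1 ))p" "$infile" > "block$(( i + 1 ))"
    done
}

# Usage: split_posts posts.txt
```

With the sentinel in place the loop needs no special case for the last post and no break test.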

 
Perhaps consider using csplit, e.g...
Code:
csplit infile '/^From /' '{*}'
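
Note that csplit names its pieces xx00, xx01, ... by default, and the '{*}' repeat count is a GNU extension. Since the thread title asks about awk, here is a minimal one-pass sketch of the same split in awk. The function name split_posts_awk is invented for the example, and the block1, block2, ... names are assumed from the script in the question.

```shell
#!/bin/bash
# Sketch: split at each line starting with "From " in a single pass with awk.
# split_posts_awk is a hypothetical helper name used only for this example.
split_posts_awk() {
    awk '
        /^From / {                      # a new post starts on this line
            if (out != "") close(out)   # close the previous file to limit open handles
            out = "block" (++n)         # next output file: block1, block2, ...
        }
        out != "" { print > out }       # lines before the first "From " are skipped
    ' "$1"
}

# Usage: split_posts_awk posts.txt
```

Because awk reads the file once and switches output files as it goes, there are no line numbers to compute and the last post needs no special handling.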
 