I have a file containing posts from Usenet.
I tried the following script to split the file into output files containing one post each.
-----------------------------------
#!/bin/bash
# $1= Inputfile
ARR=($(grep -n 'From ' $1 | cut -d':' -f1))
i=0
let j=i+1
cnt=${#ARR[*]}
while [[ $i -lt $cnt ]]
do
sed -n "$((${ARR[$i]})) p" $1 >block$[j]
sed -n "$((${ARR[$i]}+1)),$((${ARR[$j]}-1)) p" $1 >>block$[j]
#echo
let i=i+1
let j=i+1
if [[ $j -gt ${ARR[$j]} ]]
then
break;
fi
done
exit 0
## END ####
-----------------------------------
But the last post in my large file is not written to any output file.
There are 32 posts in the large file, and each post begins like this:
-------------------------------------
From mooseshoes@gmx.net Fri Mar 11 12:36:19 2005
From: "John" <mooseshoes@gmx.net>
Newsgroups: comp.os.linux.misc
Subject: Can This Hard Disk Be Saved? (formerly "Can't Find Partition Table On Boot")
Lines: 272
---------------------------------------
I want the code to put every post into a separate file by splitting wherever a line starting with 'From ' is found. Instead, I get some files containing just a small piece of text, and the last post is missing.
Any ideas on how to do this?
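
For reference, here is a minimal sketch of one way the split could be done in a single pass with awk instead of grep plus sed. It assumes every post starts with a line that has 'From ' at the very beginning of the line, as in the sample header above, and that no line in a post body starts that way. The function name split_posts and the optional output prefix are made up for illustration. Two things in the original script probably matter here: the pattern 'From ' is not anchored, so any line containing that text anywhere is treated as the start of a post (which would explain the small fragments), and for the last post ${ARR[$j]} is empty because no later 'From ' line exists to mark its end.

```shell
#!/bin/bash
# Sketch only: split an mbox-style file into block1, block2, ... with one
# post per output file.
# Assumption: a post starts at every line beginning with "From " in column 1.

# split_posts is a hypothetical helper name.
#   $1 = input file
#   $2 = output prefix (optional, defaults to "block" as in the script above)
split_posts() {
    local infile=$1
    local prefix=${2:-block}
    awk -v prefix="$prefix" '
        /^From / {                  # separator line: switch to a new output file
            if (out != "") close(out)
            out = prefix (++n)
        }
        out != "" { print > out }   # copy the line into the current output file
    ' "$infile"
}
```

Calling it as `split_posts posts.txt` (posts.txt being a placeholder file name) should write block1, block2 and so on into the current directory. Because the output file only changes when a new 'From ' line is seen, the last post runs to the end of the input and needs no special handling. Any lines before the first 'From ' line are skipped.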