make 2 or multiple blank lines a RS 1

Status
Not open for further replies.

alexknt

Technical User
Sep 24, 2009
4
LU
Hi,

here is my input file:
#cat file
1
2

3



4

5


6
7

I want to split this file into files with the record separator RS = 2 or more blank lines.

Therefore I should obtain:
#cat 001.lct
1
2

3
#cat 002.lct
4

5
#cat 003.lct
6
7

My command :
awk ' BEGIN { RS="" } { i++; f=sprintf("%03d.lct",i);print $0 > f }' file

gives 5 files because it considers one blank line as a RS.

How can I declare RS as 2 or more blank lines, but not 1?

Thanks for your help,

Alexknt








 
Hi

alexknt said:
I want to split this file into files with the record separator RS = 2 or more blank lines.
That means 3 or more consecutive end-of-line marks :
Code:
awk -vRS='\n\n\n+' '{print>sprintf("%03d.lct",NR)}' /input/file
Tested with gawk and mawk.


Feherke.
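A quick way to check the regular-expression separator is to count records instead of writing files. This is only a sketch: it assumes an awk that accepts a regular expression in RS (gawk and mawk, per the post above), and the file name sample.txt and the temporary directory are made up for the illustration.

```shell
# Work in a throwaway directory (hypothetical setup, not from the thread).
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Sample input with both single and double blank-line gaps.
printf '1\n2\n\n3\n\n\n4\n\n5\n\n\n6\n7\n' > sample.txt

# RS matches a run of three or more newlines, i.e. two or more blank lines.
# A single blank line (two newlines) stays inside the record.
count=$(awk -v RS='\n\n\n+' 'END { print NR }' sample.txt)
echo "records: $count"
```

With gawk or mawk this should report 3 records. An awk that ignores everything after the first character of RS will report a different number.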
 
Hi

alexknt said:
How can I declare RS as 2 or more blank lines, but not 1?
You cannot. But with gawk, the substring matched by RS is available in RT, so you can check its length() and increment the file counter only when needed:
Code:
awk ' BEGIN{RS="";f=sprintf("%03d.lct",++i)}{print$0RT>f}length(RT)>2{f=sprintf("%03d.lct",++i)}' /input/file

Feherke.
 
Thanks for your help feherke

I have already tried your 1st solution without success.
Code:
awk -vRS='\n\n\n+' '{print>sprintf("%03d.lct",NR)}' /input/file

but it gives me 13 files (one per line of the input file) instead of 3 files.

I don't know why it does not work.

FYI:
#od -c file
0000000 1 \n 2 \n \n 3 \n \n \n 4 \n \n 5 \n \n \n
0000020 6 \n 7 \n
0000024

My awk command runs on an AIX 5.3 server.
... and unfortunately, I have neither gawk nor mawk on my machine.

Thanks

Alexknt


 
feherke,

the command
Code:
awk -vRS='\n\n\n+' '{print>sprintf("%03d.lct",NR)}' /input/file
works perfectly on my ubuntu server.

The problem seems to come from the awk command on my AIX box.

I don't know how to work around it, though.

Alexknt
 
Hi

alexknt said:
My awk command runs on an AIX 5.3 server.
... and unfortunately, I have neither gawk nor mawk on my machine.
See if you have nawk. Standard awk implementations accept only one character for RS. nawk is an extended implementation, so it may handle regular expressions in RS.

Feherke.
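To see where the 13 files could come from: assuming the AIX awk keeps only the first character of RS, as suggested above, '\n\n\n+' collapses to a plain newline and every input line becomes its own record. The sketch below emulates that by setting RS to a single newline. The input bytes are taken from the od -c dump earlier in the thread; the file name and temporary directory are hypothetical.

```shell
# Work in a throwaway directory (hypothetical setup, not from the thread).
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the input byte for byte from the od -c dump.
printf '1\n2\n\n3\n\n\n4\n\n5\n\n\n6\n7\n' > sample.txt

# Emulate an awk that uses only the first character of RS (a newline):
# every line, empty or not, counts as one record.
lines=$(awk -v RS='\n' 'END { print NR }' sample.txt)
echo "records with single-character RS: $lines"
```

The count is 13, which matches the number of files reported on the AIX box.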
 
Hi

If you want a standard awk solution, you have to give up on manipulating RS and count the empty lines "manually":
Code:
awk 'BEGIN{f=sprintf("%03d.lct",++i)}$0==""{e++}$0!=""&&e>2{e=0;f=sprintf("%03d.lct",++i)}{print$0>f}' /input/file
Tested with gawk and mawk.

Feherke.
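As far as I can tell, the one-liner above resets the counter e only when it switches files, so several isolated single blank lines can add up and trigger a split, and the separator lines themselves are written to the output files. Below is a hedged variant of the same counting idea that resets the counter on every non-empty line and drops the separator lines. It is a sketch, not a tested replacement for AIX: the variable name started, the file sample.txt and the temporary directory are assumptions made for the illustration.

```shell
# Work in a throwaway directory (hypothetical setup, not from the thread).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '1\n2\n\n3\n\n\n4\n\n5\n\n\n6\n7\n' > sample.txt

awk '
BEGIN { f = sprintf("%03d.lct", ++i) }

# Blank line: only count it, decide what to do at the next non-empty line.
$0 == "" { e++; next }

{
    if (started && e >= 2) {
        # Two or more blank lines: close the current file, start the next.
        close(f)
        f = sprintf("%03d.lct", ++i)
    } else if (started && e == 1) {
        # A single blank line belongs to the record, so keep it.
        print "" > f
    }
    e = 0          # reset on every non-empty line
    started = 1    # blank lines before the first data line are ignored
    print > f
}' sample.txt

# Show what was written.
for out in *.lct; do echo "== $out"; cat "$out"; done
```

On the sample input this writes 001.lct (1, 2, blank, 3), 002.lct (4, blank, 5) and 003.lct (6, 7), which is the layout asked for in the first post.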
 
Hi feherke,

your last solution works perfectly in AIX and Ubuntu.

Many thanks for your help,

Alexknt
 