Select every nth line

rkdash · Dec 11, 2004

Hi,

I have a large ASCII file (3 columns, > 2 GB) and I want to resample it by selecting, for example, every 5th line. How can I do this?

Thanks

futurelet · Dec 11, 2004

[tt]
awk '!(NR%5)' file
[/tt]
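For anyone wondering why this works: NR%5 is zero on lines 5, 10, 15, ..., the ! turns that zero into a true pattern, and a pattern with no action prints the line. A minimal sketch that takes the step as a parameter might look like this (the function name every_nth is made up for illustration, and seq is only used to generate sample input):

```shell
# every_nth N [FILE...]: print lines N, 2N, 3N, ... of the input.
# "every_nth" is a hypothetical helper name, not a standard utility.
every_nth() {
    n=$1
    shift
    # NR % n is 0 on every nth record; a true pattern with no action prints it.
    # With no file arguments left, awk reads standard input.
    awk -v n="$n" 'NR % n == 0' "$@"
}

# Example: pick every 5th line of a 20-line input (prints 5, 10, 15, 20).
seq 1 20 | every_nth 5
```

Passing the step with -v keeps the awk program itself constant, so the same function works for any n.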

olded · Dec 11, 2004

I know this is an awk forum, but I'll bet the sed version is faster on a 2-gig file:

sed -n 'n;n;n;n;p' file
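Here each n replaces the pattern space with the next input line, so after four of them the fifth line of each group is in the pattern space and p prints it; -n suppresses all other output. A quick check on generated input, as a rough sketch (the variable names are only for illustration), suggests it selects the same lines as the awk one-liner:

```shell
# Generate 23 numbered lines so the selected line numbers are easy to read.
input=$(seq 1 23)

# awk: print when the record number is a multiple of 5.
awk_out=$(printf '%s\n' "$input" | awk '!(NR%5)')

# sed: "n" loads the next line four times, then "p" prints the fifth.
# If the input ends in the middle of a group, sed stops without printing.
sed_out=$(printf '%s\n' "$input" | sed -n 'n;n;n;n;p')

# Report whether the two selections agree.
if [ "$awk_out" = "$sed_out" ]; then
    echo "same lines selected"
else
    echo "outputs differ"
fi
```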

rkdash · Dec 11, 2004

Thanks futurelet and olded.

I tried with:

sed -n '1~5p' infile > outfile

and it worked.
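One detail worth checking: the first~step address is a GNU sed extension, and 1~5 starts at line 1 (lines 1, 6, 11, ...), whereas the awk and n;n;n;n;p versions above pick lines 5, 10, 15, .... For plain resampling either offset is probably fine, but 0~5 should match the others. A small sketch, assuming GNU sed is installed as sed:

```shell
# Requires GNU sed: first~step is an extension, not part of POSIX sed.
# paste -s joins the selected lines onto one line for easy comparison.
from_first=$(seq 1 12 | sed -n '1~5p' | paste -s -d ' ' -)  # starts at line 1
from_fifth=$(seq 1 12 | sed -n '0~5p' | paste -s -d ' ' -)  # multiples of 5

echo "1~5p selects: $from_first"   # 1 6 11
echo "0~5p selects: $from_fifth"   # 5 10
```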

futurelet · Dec 11, 2004

> I know this is an awk forum, but I'll bet the sed version
> is faster on a 2-gig file:

Why?

olded · Dec 12, 2004

> Why?

Because sed has a smaller footprint than awk. Of course, that's just a guess on my part; you'd have to test it.

If this were a one-time job, I wouldn't worry about it.

futurelet · Dec 12, 2004

I'm not sure what you mean by footprint. On my system, sed.exe is 1,060,864 bytes; one version of mawk is 67,072 bytes (almost a 16:1 ratio).

Awk spends some time parsing every line into fields, but that time is probably negligible compared to the time needed to read and write the line.

olded · Dec 12, 2004

Hmmm... That's interesting.

On my Solaris 7 system, here are the sizes:

76408 Sep 1 1998 /bin/awk
104656 Feb 15 2002 /bin/nawk
24796 Feb 28 2002 /bin/sed

As I said, it would be an interesting technical exercise to see which is faster. I wish I had the time ....
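For anyone who does have the time, a rough way to try it could look like the sketch below. It assumes a shell where time is available (it is a keyword in bash), and the 200000-line test file is an arbitrary stand-in; a real comparison would use the actual 2 GB file and several runs, since disk caching can dominate the numbers.

```shell
# Build a 3-column test file; 200000 lines is an arbitrary size for a quick run.
testfile=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 200000; i++) print i, i * 2, i * 3 }' > "$testfile"

# Time each command; output is discarded so terminal speed does not matter.
echo "awk:"; time awk '!(NR%5)' "$testfile" > /dev/null
echo "sed:"; time sed -n 'n;n;n;n;p' "$testfile" > /dev/null

# Sanity check: both tools should select the same number of lines.
awk_count=$(awk '!(NR%5)' "$testfile" | wc -l)
sed_count=$(sed -n 'n;n;n;n;p' "$testfile" | wc -l)
echo "awk lines: $awk_count, sed lines: $sed_count"

# Clean up the temporary file.
rm -f "$testfile"
```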

How about that? Two guys are arguing about whose "tool" is smaller. :-) :-)

Ed

futurelet · Dec 12, 2004

Thanks for the info. I was wondering whether sed was really so complex that it had to be that huge.

The version I use under DOS is
GNU sed version 4.0.7 (DJGPP port 2003-09-09 (r1))

It must be grossly bloated due to linked-in libraries.
