Select every nth line

Status
Not open for further replies.

rkdash

Technical User
Jul 15, 2004
14
US
Hi,

I have a large ASCII file (3 columns, > 2 GB). I want to resample it by selecting, for example, every 5th line. How can I do this?

Thanks
 
I know this is an awk forum, but I'll bet the sed version is faster on a 2-gig file:

sed -n 'n;n;n;n;p' file
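
For comparison, here is a rough sketch of the same selection in awk next to the sed command above. Both print lines 5, 10, 15, and so on. The function names are made up for illustration, and the awk one-liner is my own assumption of what an awk answer would look like, not something quoted from this thread.

```shell
# Hypothetical helper: print every nth line (n, 2n, 3n, ...) of a file.
# NR is awk's running line number; a pattern with no action prints the line.
every_nth_awk() {
    n=$1
    file=$2
    awk -v n="$n" 'NR % n == 0' "$file"
}

# The sed command from the post, wrapped in a function (n is fixed at 5).
# Each "n" reads the next line without printing (because of -n),
# so "p" only ever fires on every 5th line.
every_5th_sed() {
    sed -n 'n;n;n;n;p' "$1"
}
```

Usage would be something like `every_nth_awk 5 infile > outfile`. The awk form is easier to change to a different n; the sed form needs one fewer `n` command than the step size.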

 
Thanks futurelet and olded.

I tried with:

sed -n '1~5p' infile > outfile

and it worked.
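
A note on the command above: the `first~step` address is a GNU sed extension, so it may not work with other sed implementations. Also, `1~5p` starts at line 1 (lines 1, 6, 11, ...), whereas `0~5p` gives lines 5, 10, 15, .... A small sketch, with hypothetical function names and an awk fallback that is my own assumption for systems without GNU sed:

```shell
# GNU sed only: first~step matches every step-th line starting at line "first".
sample_from_first() { sed -n '1~5p' "$1"; }   # lines 1, 6, 11, ...
sample_from_fifth() { sed -n '0~5p' "$1"; }   # lines 5, 10, 15, ...

# Portable alternative with awk: same lines as 1~5p.
sample_from_first_awk() { awk 'NR % 5 == 1' "$1"; }
```

For resampling a data file either offset is probably fine; it only matters if the first line is a header or the exact rows are important.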

 
> I know this is an awk forum, but I'll bet the sed version
> is faster on a 2-gig file:

Why?
 
>Why?

Because sed has a smaller footprint than awk. Of course, that's just a guess on my part; you'd have to test it.

If this problem were a one-time-only deal, I wouldn't worry about it.
 
I'm not sure what you mean by footprint. On my system, sed.exe is 1,060,864 bytes; one version of mawk is 67,072 bytes (almost a 16:1 ratio).

Awk spends some time parsing every line into fields, but that time is probably negligible compared to the time needed to read and write the line.
 
Hmmm... That's interesting.

On my Solaris 7 system, here are the sizes in bytes:

76408 Sep 1 1998 /bin/awk
104656 Feb 15 2002 /bin/nawk
24796 Feb 28 2002 /bin/sed

As I said, it would be an interesting technical exercise to see which is faster. I wish I had the time ....
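
If anyone wants to try it, a minimal timing sketch could look like the following. The helper name is hypothetical, it assumes `date +%s` is available (it is on most current systems), and it only resolves whole seconds, which should be enough for a 2 GB file.

```shell
# Hypothetical benchmark helper: run a command, discard its output,
# and print the elapsed wall-clock time in whole seconds.
elapsed_seconds() {
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Example calls (bigfile is a placeholder for the real data file):
#   elapsed_seconds sed -n 'n;n;n;n;p' bigfile
#   elapsed_seconds awk 'NR % 5 == 0' bigfile
```

Run each command more than once and ignore the first run, since the first pass over the file also pays for reading it from disk into the cache.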

How about that? Two guys are arguing about whose "tool" is smaller. -) -)

Ed
 
Thanks for the info. I was wondering whether sed was really so complex that it had to be that huge.

The version I use under DOS is
GNU sed version 4.0.7 (DJGPP port 2003-09-09 (r1))

It must be grossly bloated due to linked-in libraries.
 