I am trying to parse a space-delimited input file that looks like:
001 1001 683784.0 9858088.0 4 1412
001 1001 683784.0 9858088.0 100 1412
001 1001 683784.0 9858088.0 200 1411
001 1001 683784.0 9858088.0 300 1411
001 1001 683784.0 9858088.0 400 1411
001 1002 683928.0 9858047.0 4 1409
001 1002 683928.0 9858047.0 100 1409
001 1002 683928.0 9858047.0 200 1409
001 1002 683928.0 9858047.0 300 1409
001 1002 683928.0 9858047.0 400 1409
001 1003 683928.0 9858047.0 4 1409
001 1003 683928.0 9858047.0 100 1409
001 1003 683928.0 9858047.0 200 1409
001 1003 683928.0 9858047.0 300 1409
001 1003 683928.0 9858047.0 400 1409
001 1004 683928.0 9858047.0 4 1409
001 1004 683928.0 9858047.0 100 1409
001 1004 683928.0 9858047.0 200 1409
001 1004 683928.0 9858047.0 300 1409
001 1004 683928.0 9858047.0 400 1409
001 1005 683928.0 9858047.0 4 1409
001 1005 683928.0 9858047.0 100 1409
001 1005 683928.0 9858047.0 200 1409
001 1005 683928.0 9858047.0 300 1409
001 1006 683928.0 9858047.0 4 1409
001 1006 683928.0 9858047.0 100 1409
001 1006 683928.0 9858047.0 200 1409
001 1006 683928.0 9858047.0 300 1409
001 1006 683928.0 9858047.0 400 1409
001 1006 683928.0 9858047.0 500 1409
001 1007 683928.0 9858047.0 4 1409
001 1007 683928.0 9858047.0 100 1409
001 1007 683928.0 9858047.0 200 1409
001 1007 683928.0 9858047.0 300 1409
001 1007 683928.0 9858047.0 400 1409
001 1007 683928.0 9858047.0 500 1409
002 1000 683928.0 9858047.0 4 1409
002 1000 683928.0 9858047.0 100 1409
002 1000 683928.0 9858047.0 200 1409
002 1000 683928.0 9858047.0 300 1409
002 1000 683928.0 9858047.0 400 1409
002 1000 683928.0 9858047.0 500 1409
002 1001 683928.0 9858047.0 4 1409
002 1001 683928.0 9858047.0 100 1409
002 1001 683928.0 9858047.0 200 1409
002 1001 683928.0 9858047.0 300 1409
002 1001 683928.0 9858047.0 400 1409
002 1001 683928.0 9858047.0 500 1409
002 1002 683928.0 9858047.0 4 1409
002 1002 683928.0 9858047.0 100 1409
002 1002 683928.0 9858047.0 200 1409
002 1002 683928.0 9858047.0 300 1409
002 1002 683928.0 9858047.0 9600 2400
002 1002 683928.0 9858047.0 9700 2401
002 1002 683928.0 9858047.0 10000 2478
002 1003 683928.0 9858047.0 4 1409
002 1003 683928.0 9858047.0 100 1409
002 1003 683928.0 9858047.0 200 1409
002 1003 683928.0 9858047.0 300 1409
002 1003 683928.0 9858047.0 400 1409
002 1003 683928.0 9858047.0 500 1409
...etc
Field 1 is a grouping number, not necessarily incrementing by one.
Field 2 is a sub-number of field 1, also not necessarily incrementing by one. Each field 2 value has several data records, and the remaining fields hold the data.
I want to thin the output by the combination of fields 1 and 2.
Specifically, within each field 1 group I want to print every 3rd group of field 2, starting with the first, and keep all records of each group that is printed.
Thus the output would be like:
001 1001 683784.0 9858088.0 4 1412
001 1001 683784.0 9858088.0 100 1412
001 1001 683784.0 9858088.0 200 1411
001 1001 683784.0 9858088.0 300 1411
001 1001 683784.0 9858088.0 400 1411
001 1004 683928.0 9858047.0 4 1409
001 1004 683928.0 9858047.0 100 1409
001 1004 683928.0 9858047.0 200 1409
001 1004 683928.0 9858047.0 300 1409
001 1004 683928.0 9858047.0 400 1409
001 1007 683928.0 9858047.0 4 1409
001 1007 683928.0 9858047.0 100 1409
001 1007 683928.0 9858047.0 200 1409
001 1007 683928.0 9858047.0 300 1409
001 1007 683928.0 9858047.0 400 1409
001 1007 683928.0 9858047.0 500 1409
002 1000 683928.0 9858047.0 4 1409
002 1000 683928.0 9858047.0 100 1409
002 1000 683928.0 9858047.0 200 1409
002 1000 683928.0 9858047.0 300 1409
002 1000 683928.0 9858047.0 400 1409
002 1000 683928.0 9858047.0 500 1409
002 1003 683928.0 9858047.0 4 1409
002 1003 683928.0 9858047.0 100 1409
002 1003 683928.0 9858047.0 200 1409
002 1003 683928.0 9858047.0 300 1409
002 1003 683928.0 9858047.0 400 1409
002 1003 683928.0 9858047.0 500 1409
etc...
Here's my starter script:
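BEGIN { FS = " " }
# Remember the keys of the first record
NR == 1 {
    prev_field1 = $1
    prev_field2 = $2
}
# Same field 1 and same field 2 as the previous record: print it
$1 == prev_field1 && $2 == prev_field2 {
    print $0
    prev_field1 = $1
    prev_field2 = $2
    field_2_count++
}
# Same field 1 but a new field 2: count it and skip the record
$1 == prev_field1 && $2 != prev_field2 {
    field_2_count++
    next
}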
I'm stumped.
Thanks for any help.
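
For reference, here is a minimal sketch of one way the thinning described above might be done. It assumes the input is already ordered so that all records with the same field 1 and field 2 are adjacent, and that the group count restarts whenever field 1 changes (which is what the sample output suggests). The function name thin_groups and the variables step, g1, g2, n and fresh are made up for illustration.

```shell
# Hypothetical helper: thin_groups STEP < input
# Keeps the 1st, (1+STEP)th, (1+2*STEP)th ... field-2 group inside each
# field-1 group and prints every record of the groups that are kept.
thin_groups() {
  awk -v step="${1:-3}" '
    # First record, or field 1 changed: restart the group count
    NR == 1 || $1 != g1 { g1 = $1; n = 0; fresh = 1 }
    # Field 2 changed (or a new field 1 block began): one more group seen
    fresh || $2 != g2   { g2 = $2; n++; fresh = 0 }
    # Pattern with no action prints the record when the group is selected
    (n - 1) % step == 0
  '
}
```

It would be called as `thin_groups 3 < input.txt`, where input.txt stands in for the data file. The step is passed with -v so that a different thinning interval can be tried without editing the awk program.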