Skipping duplicate fields before printing

Status
Not open for further replies.

learningawk

Technical User
Oct 15, 2002
36
US
I have a coordinate file containing vector points that draw closed polygons, so the first and last data points of each polygon have the same coordinates. I am reformatting the file for use as input to another application.

Here's a test data set; the first record is a column counter.

1234567890123456789012345678901234567890

001sdfsdfsdfsdfsdfsdfsdfsdfsdfsdfsdfsdfs
300000 1111111 22222 34343434 0
32222 34343434 33333 44444444 0
33333 44444444 34343 55555555 0
334343 55555555 66666 77777777 0
366666 77777777 300000 1111111 9
002 cccccccccceeeeeeeeeeeggggg
231xcxcxcxcxcxcxcxcxczxjfljdfladflasdflk
300000 1111111 22222 34343434 0
32222 34343434 33333 44444444 0
33333 44444444 34343 55555555 0
334343 55555555 66666 77777777 0
366666 77777777 300000 1111111 9
002 cccccccccceeeeeeeeeeeggggg
231xcxcxcxcxcxcxcxcxczxjfljdfladflasdflk
300000 1111111 22222 34343434 0
32222 34343434 33333 44444444 0
33333 44444444 34343 55555555 0
334343 55555555 66666 77777777 0
366666 77777777 300000 1111111 9


Column 1 is the record type identifier: 0, 2 or 3.
0 is a header, 2 is a record to be skipped, and 3 is a data-point record. I am using substr to pick the fields, and I am trying to omit duplicate locations before printing with a simple check of whether the previous value equals the current value.

Each group contains a varying number of coordinate pairs describing the polygon.

Here's how the output should be:
001 some headers....
3 300000 1111111
3 22222 34343434
3 33333 44444444
3 34343 55555555
3 66666 77777777
3 300000 1111111

and so on for each group in the file.

Can you skip a substr field if it duplicates a field from the previous record, yet still keep the next pair of coordinates on that same record?

Thanks,
 
Here's a sed script; is it OK?

/^2/{
d
}
/^3/{
s/^\([^\ ]*\) *\([^\ ]*\) *.*$/3 \1 \2/
}

tikual
 
Awk script:

[tt]
awk '
/^2/ {                          # Record type "2"
    next                        # Skip
}
/^3/ {                          # Record type "3"
    if (prv1 == $1 &&           # Same point as previous?
        prv2 == $2)
        next                    # Yes, skip
    prv1 = $1                   # Remember point coordinates
    prv2 = $2
    sub(/^./, "& ")             # Isolate record type
    print $1, $2, $3            # Print record type and coordinates
    next                        # Next record
}
{                               # All other record types
    print                       # Print unchanged
}
' <input >output
[/tt]


With your data, the result is:
[tt]
001sdfsdfsdfsdfsdfsdfsdfsdfsdfsdfsdfsdfs
3 00000 1111111
3 2222 34343434
3 3333 44444444
3 34343 55555555
3 66666 77777777
002 cccccccccceeeeeeeeeeeggggg
3 00000 1111111
3 2222 34343434
3 3333 44444444
3 34343 55555555
3 66666 77777777
002 cccccccccceeeeeeeeeeeggggg
3 00000 1111111
3 2222 34343434
3 3333 44444444
3 34343 55555555
3 66666 77777777
[/tt]

Jean Pierre.
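
Note that the awk script above prints only the first coordinate pair of each type 3 record, so the second pair (including the closing point of the polygon) never reaches the output. What follows is a minimal sketch of one way to check both pairs, not a definitive solution. It assumes whitespace-separated fields, a record type fused onto the front of the first field, and a type 0 header starting each polygon; the names dedupe_points, emit and prev are made up for this illustration, and the real file may need substr with fixed column positions instead.

```shell
#!/bin/sh
# dedupe_points: hypothetical helper; reads the named files, or stdin if none.
dedupe_points() {
    awk '
    # Print a point unless it repeats the point printed just before it.
    # The key is built by concatenation, so the comparison is always a
    # string comparison.
    function emit(x, y,    key) {
        key = x " " y
        if (key == prev)
            return
        print "3", x, y
        prev = key
    }
    /^0/ { prev = ""; print; next }   # header: new polygon, forget last point
    /^2/ { next }                     # type 2: skipped record
    /^3/ {
        emit(substr($1, 2), $2)       # first pair, record type digit removed
        emit($3, $4)                  # second pair on the same record
        next
    }
    { print }                         # anything else passes through unchanged
    ' "$@"
}
```

Used as dedupe_points input >output. Because prev is reset at each header, the first point of a polygon is always printed, and the closing point is kept since it differs from the point just before it.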
 