Thinning a coordinate listing

Status
Not open for further replies.

portrush

Technical User
Apr 4, 2005
1
US
I have several ASCII text files which have 4 fields comma delimited.

The contents and format of each file are:

Point_Id,Easting_Coordinate,Northing_Coordinate,Elevation

The files have up to 10+ million records, so I presume that the computing time will be quite long.

I only have experience writing AWK one-liners.
I need a (presumably elaborate) AWK script that will perform the following on these files:

The objective is to delete points/records that are closer to each other than a user-defined-distance.

Some of the computing logic might be as follows:

1. First, using the Easting and Northing coordinates of each point, compute the distance between the FIRST point/record and all of the following points in the list, and DELETE those points whose distance from the FIRST point is less than the USER_DEFINED_DISTANCE.
2. You now might have to write the results list to a new file.
3. Next, using the Easting and Northing coordinates of each point, compute the distance between the SECOND point (in the new list) and all of the following points in the list, and DELETE those points whose distance from the SECOND point is less than the USER_DEFINED_DISTANCE.
4. You now might have to write the results list to a new file.
5. Next, using the Easting and Northing coordinates of each point, compute the distance between the THIRD point (in the next new list) and all of the following points in the list, and DELETE those points whose distance from the THIRD point is less than the USER_DEFINED_DISTANCE.

And so on, until you have reached the last point in the list.

The end result needs to be a comma delimited ASCII file of the same format as the input file.

If there is some other programming language that can produce the same results - please feel free to share. :)
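
The steps above amount to this: walk the list in order and keep a point only if it is not closer than USER_DEFINED_DISTANCE to any point already kept. Below is a rough Python sketch of that idea, not a tested production script. It assumes the files have no header line and that the whole set of kept points fits in memory. The names thin_points and thin_file are made up for illustration. Rather than comparing every point against every other point, it swaps in a simple grid lookup (cells as wide as the distance), so each point is only compared with kept points in its own and the eight neighbouring cells.

```python
import csv
import math
from collections import defaultdict


def thin_points(rows, min_dist):
    """Yield rows (id, easting, northing, elevation) in input order,
    skipping any row closer than min_dist to a row already kept."""
    if min_dist <= 0:
        raise ValueError("min_dist must be positive")
    grid = defaultdict(list)  # (cell_x, cell_y) -> kept (easting, northing)
    limit_sq = min_dist * min_dist
    for row in rows:
        x, y = float(row[1]), float(row[2])
        cx, cy = math.floor(x / min_dist), math.floor(y / min_dist)
        # Cells are min_dist wide, so any kept point closer than
        # min_dist must lie in this cell or one of its 8 neighbours.
        too_close = any(
            (x - kx) ** 2 + (y - ky) ** 2 < limit_sq
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for kx, ky in grid.get((cx + dx, cy + dy), ())
        )
        if not too_close:
            grid[(cx, cy)].append((x, y))
            yield row


def thin_file(src, dst, min_dist):
    """Read a comma-delimited file and write the thinned copy,
    keeping the same four-field layout (assumes no header line)."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        writer = csv.writer(fout, lineterminator="\n")
        writer.writerows(thin_points(csv.reader(fin), min_dist))
```

A call such as thin_file("points.txt", "thinned.txt", 2.5) would write the surviving records to a new file; both file names and the distance are placeholders. A point exactly at the given distance is kept, since only points closer than the distance are dropped.
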

Good Luck.

And Thank You,
Kenny.
 
"The files have up to 10+ million records"
I'd consider a real database instead of a scripting language against a huge flat file ...
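
As a rough illustration of the database idea, here is a small Python sketch using SQLite through the standard sqlite3 module. The reply does not name a database, so SQLite, the table name points, the index name, and the helper functions load_points and neighbours_within are all assumptions made for the example. It loads the four fields into an indexed table and then asks which points lie closer than a given distance to a coordinate; the bounding-box condition narrows the candidates before the exact distance check.

```python
import sqlite3


def load_points(conn, rows):
    """Create the (hypothetical) points table and insert 4-field rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS points ("
        "point_id TEXT, easting REAL, northing REAL, elevation REAL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_points_en "
        "ON points (easting, northing)"
    )
    conn.executemany("INSERT INTO points VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def neighbours_within(conn, x, y, dist):
    """Return ids of points closer than dist to (x, y).
    The BETWEEN clauses form a bounding box around the search point;
    the last clause applies the exact squared-distance test."""
    cur = conn.execute(
        "SELECT point_id FROM points "
        "WHERE easting BETWEEN ? AND ? AND northing BETWEEN ? AND ? "
        "AND (easting - ?) * (easting - ?) "
        "+ (northing - ?) * (northing - ?) < ? "
        "ORDER BY point_id",
        (x - dist, x + dist, y - dist, y + dist, x, x, y, y, dist * dist),
    )
    return [r[0] for r in cur]
```

With sqlite3.connect(":memory:") this runs entirely in RAM; pointing it at a file path instead would keep the data on disk.
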

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 