Data file reformat

Status
Not open for further replies.

learningawk

Technical User
Oct 15, 2002
36
US
I have a data file of coordinates that needs to be evaluated and reformatted.
Here's a snippet of data from an input file.
>,T4N,R10W,S1,0017
-98.212145, 34.840612
-98.194615, 34.840651
-98.194553, 34.855014
-98.198131, 34.855031
-98.200526, 34.855012
-98.212111, 34.855093
-98.212145, 34.840612
>,T4N,R10W,S2,0017
-98.229935, 34.840660
-98.212145, 34.840612
-98.212111, 34.855093
-98.215713, 34.855103
-98.229883, 34.855026
-98.229935, 34.840660
etc.

I need to get the 4 points that describe the min/max corners of the data and then repeat the first corner point as the last point in the reformatted data. Some of the groups of data have more than 5 points, but I only want the 4 corners plus the first corner repeated. Each corner must be one of the original coordinate pairs, not a combination of individual x and y min/max values.

The output should look like this for the first group of points: xcord ycord T4N R10W 1 corner_id

IMPORTANT NOTE:
The first OUTPUT point is the upper right coordinate; then, going clockwise, point 2 is lower right, point 3 is lower left, point 4 is upper left, and point 5 repeats point 1.

T4N,R10W,S1,0017
-98.194553 34.855014 T4N R10W 1 1
-98.194615 34.840651 T4N R10W 1 2
-98.212145 34.840612 T4N R10W 1 3
-98.212111 34.855093 T4N R10W 1 4
-98.194553 34.855014 T4N R10W 1 5

The roadblock I am up against is how to assign the individual xy pairs to the 4 corner points.

Thank you for helping solve this.
 
This may do what you want. I assumed that the corner points required were the ones closest to the corners of the smallest x-y rectangle which would enclose the points.
Code:
function pp() {
  minx = maxx = x[1]
  miny = maxy = y[1]
  for (i=2;i<=ix;i++) {
    if (x[i]>maxx) maxx = x[i]
    if (x[i]<minx) minx = x[i]
    if (y[i]>maxy) maxy = y[i]
    if (y[i]<miny) miny = y[i]
  }
  px[1] = px[2] = px[3] = px[4] = x[1]
  py[1] = py[2] = py[3] = py[4] = y[1]
  dminsq1 = (maxx-x[1])^2 + (maxy-y[1])^2
  dminsq2 = (maxx-x[1])^2 + (miny-y[1])^2
  dminsq3 = (minx-x[1])^2 + (miny-y[1])^2
  dminsq4 = (minx-x[1])^2 + (maxy-y[1])^2
  for (i=2;i<=ix;i++) {
    dsq1 = (maxx-x[i])^2 + (maxy-y[i])^2
    dsq2 = (maxx-x[i])^2 + (miny-y[i])^2
    dsq3 = (minx-x[i])^2 + (miny-y[i])^2
    dsq4 = (minx-x[i])^2 + (maxy-y[i])^2
    if (dsq1<dminsq1) {
      px[1] = x[i]
      py[1] = y[i]
      dminsq1 = dsq1
    }
    if (dsq2<dminsq2) {
      px[2] = x[i]
      py[2] = y[i]
      dminsq2 = dsq2
    }
    if (dsq3<dminsq3) {
      px[3] = x[i]
      py[3] = y[i]
      dminsq3 = dsq3
    }
    if (dsq4<dminsq4) {
      px[4] = x[i]
      py[4] = y[i]
      dminsq4 = dsq4
    }
  }
  px[5] = px[1]
  py[5] = py[1]
  for (i=1;i<=5;i++) {
    printf("%10.6f%10.6f ", px[i],py[i])
    print s,ix2,i
  }
}
BEGIN { FS="," }
{
  if ($0 ~ /^>/) {
    if (ix2 > 0) {
      pp()
    }
    s = $2 " " $3
    ix = 0
    ix2++
  }
  else {
    ix++
    x[ix] = $1+0
    y[ix] = $2+0
  }
}
END { 
  pp()
}
CaKiwi
 
CaKiwi,

I am very impressed with how your code works. Thank you.

Could we tweak it a little? The fourth field in the header of each group is an arbitrary value, S1 through S36 (not incrementing), not a sequential count from 1 to the number of groups in the file. Whatever is in that field, just make it an integer.

Your script works perfectly on the input I supplied, but if I use it on Cartesian coordinates such as:
>,T99N,R36W,S36,0017
0 0
1 5
0 10
1 9
9 9
10 10
10 1
10 0
1 1
0 0
It doesn't start point number 1 at 10 10. Point 2 should be 10 0, point 3 should be 0 0, point 4 should be 0 10, and lastly point 5 should be 10 10.

When I run your script, the output is:
10.000000 0.000000 T99N R36W 3 1
10.000000 0.000000 T99N R36W 3 2
0.000000 0.000000 T99N R36W 3 3
0.000000 0.000000 T99N R36W 3 4
10.000000 0.000000 T99N R36W 3 5

Thanks again for the help, it's amazing.
 
Shouldn't the data have a comma between the x and y coordinates? I created a file

>,T99N,R36W,S36,0017
0 , 0
1 , 5
0 , 10
1 , 9
9 , 9
10 , 10
10 , 1
10 , 0
1 , 1
0 , 0

and got output

10.000000 10.000000 T99N R36W 36 1
10.000000 0.000000 T99N R36W 36 2
0.000000 0.000000 T99N R36W 36 3
0.000000 10.000000 T99N R36W 36 4
10.000000 10.000000 T99N R36W 36 5
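
The earlier output, where points 1 and 2 both came out as 10 0, probably follows from how awk splits fields. With FS="," a line such as "1 5" contains no comma, so the whole line lands in $1 and $2 is empty; $1+0 then reads only the leading number and $2+0 evaluates to 0. Every y value becomes 0 and the enclosing rectangle collapses onto the x axis. A minimal check, assuming a POSIX awk (the function name split_demo is made up for illustration):

```sh
# Show the field count and the numeric value of the first two fields
# when the field separator is a comma.
split_demo() {
  awk 'BEGIN { FS = "," } { print NF, $1+0, $2+0 }'
}

# First line has no comma (one field); second line has one (two fields).
printf '1 5\n1 , 5\n' | split_demo
```

The first input line is reported as a single field with y read as 0; the second is split into two fields as intended.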

Here's a slightly improved version which uses the data from the 4th field of the header.
Code:
function pp() {
  minx = maxx = x[1]
  miny = maxy = y[1]
  for (i=2;i<=ix;i++) {
    if (x[i]>maxx) maxx = x[i]
    if (x[i]<minx) minx = x[i]
    if (y[i]>maxy) maxy = y[i]
    if (y[i]<miny) miny = y[i]
  }
  for (i=1;i<=ix;i++) {
    dsq[1] = (maxx-x[i])^2 + (maxy-y[i])^2
    dsq[2] = (maxx-x[i])^2 + (miny-y[i])^2
    dsq[3] = (minx-x[i])^2 + (miny-y[i])^2
    dsq[4] = (minx-x[i])^2 + (maxy-y[i])^2
    for (j=1;j<=4;j++) {
      if (i==1 || dsq[j]<dminsq[j]) {
        px[j] = x[i]
        py[j] = y[i]
        dminsq[j] = dsq[j]
      }
    }
  }
  px[5] = px[1]
  py[5] = py[1]
  for (i=1;i<=5;i++) {
    printf("%10.6f%10.6f ", px[i],py[i])
    print s,i
  }
}
BEGIN { FS="," }
{
  if ($0 ~ /^>/) {
    if (NR > 1) {
      pp()
    }
    s = $2 " " $3 " " substr($4,2)
    ix = 0
  }
  else {
    ix++
    x[ix] = $1+0
    y[ix] = $2+0
  }
}
END {
  pp()
}
CaKiwi
 
OOPS!
You're right, I forgot the comma for the field separator.
I really appreciate your help on this, CaKiwi.
It works perfectly.
Thank you
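
For anyone who hits the same missing-comma problem, here is a rough sketch of a variant that accepts either separator. It is not CaKiwi's script: it keeps the same nearest-to-rectangle-corner idea but assumes a regular-expression field separator, skips any line that does not have exactly two fields, and prints with a single space between columns. The names corners, flush_group and label are made up for this example.

```sh
# Read groups from stdin; for each group print the 4 input points nearest
# the corners of the enclosing rectangle, clockwise from upper right,
# then repeat the first point as point 5.
corners() {
  awk '
  function flush_group(   i, j, d, minx, maxx, miny, maxy, cx, cy, best, px, py) {
    if (n == 0) return                      # nothing collected yet
    minx = maxx = x[1]; miny = maxy = y[1]
    for (i = 2; i <= n; i++) {              # bounding rectangle of the group
      if (x[i] > maxx) maxx = x[i]
      if (x[i] < minx) minx = x[i]
      if (y[i] > maxy) maxy = y[i]
      if (y[i] < miny) miny = y[i]
    }
    cx[1] = maxx; cy[1] = maxy              # upper right
    cx[2] = maxx; cy[2] = miny              # lower right
    cx[3] = minx; cy[3] = miny              # lower left
    cx[4] = minx; cy[4] = maxy              # upper left
    for (j = 1; j <= 4; j++)
      for (i = 1; i <= n; i++) {            # nearest input point to corner j
        d = (cx[j] - x[i])^2 + (cy[j] - y[i])^2
        if (i == 1 || d < best[j]) { best[j] = d; px[j] = x[i]; py[j] = y[i] }
      }
    px[5] = px[1]; py[5] = py[1]
    for (j = 1; j <= 5; j++)
      printf "%.6f %.6f %s %d\n", px[j], py[j], label, j
    n = 0
  }
  BEGIN { FS = "[ \t,]+" }                  # assumption: blanks and/or commas separate fields
  { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }   # trim so no empty edge fields appear
  /^>/ { flush_group(); label = $2 " " $3 " " (substr($4, 2) + 0); next }
  NF == 2 { n++; x[n] = $1 + 0; y[n] = $2 + 0 }
  END { flush_group() }
  '
}

# Usage with mixed separators in one group:
printf '>,T99N,R36W,S36,0017\n0 0\n0 , 10\n10, 10\n10 0\n5 5\n' | corners
```

Matching on a run of blanks and commas means "0 0", "0, 0" and "0 , 0" all parse the same way, at the cost of silently ignoring stray lines such as "etc." or blank lines.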
 