Parse some additional data from one file into another file

Status
Not open for further replies.

amili

Programmer
Nov 9, 2010
18
CA
Hi,

I need help adding data from file 1 to file 2, as follows:

_file1_

No 250-305 Id 1
No 144-146 Id 1
No 189-190 Id 2
No 1904-1909 Id 2
No 1962-1968 Id 2
No 221-222 Id 7
No 248-251 Id 8
No 257-258 Id 8

_file2_

ID 1
START S216
DONE D125
@@


The output should look like this:

ID 1
START S216
DONE D125
No 250-305
No 144-146
@@
ID 2
START S325
DONE D340
No 189-190
No 1904-1909
No 1962-1968
@@

and so on, up to ID 8. I need help doing this with awk. Thanks.


 
This assumes that both files contain the same ids in the same order. Change the file name in both getline calls to the name of your first file, and run the script against the second file.
Code:
{
  print
  j++
  if (j==1) id2 = $NF
  if (j==3)
  {
    j=0
    if (flg==0)
    {
      flg=1
      if ((getline < "file1") < 1) exit
      a=$0
      id1=$NF
    }
    while (id1==id2)
    {
      print a
      if ((getline < "file1") < 1) exit
      a=$0
      id1=$NF
    }
  }
}
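
A note on the `getline < "file1"` calls above, as a minimal sketch rather than part of the posted script: this form of getline reads the next line from the named file, replaces $0 and the fields with it, and returns 1 on success, 0 at end of file and -1 on error. That is why the script prints the current line and saves its id before calling it. The `flg` variable only ensures that the very first line of file1 is read once before the comparison loop starts. The file name lookup.txt and the function name read_lookup are made up for this illustration.

```shell
# Sample lookup file (the name lookup.txt is made up for this sketch).
printf 'No 250-305 Id 1\nNo 189-190 Id 2\n' > lookup.txt

read_lookup() {
  awk 'BEGIN {
    # (getline < file) returns 1 when a line was read, 0 at end of file,
    # and -1 on error, so "> 0" is the safe loop test.
    # Each successful call sets $0, the fields and NF from that line.
    while ((getline < "lookup.txt") > 0)
      print "id=" $NF, "range=" $2
    close("lookup.txt")
  }'
}

read_lookup
```

With the two sample lines this should print `id=1 range=250-305` and then `id=2 range=189-190`.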

CaKiwi
 
Hi CaKiwi,

Thanks for your prompt response.

The ids in my first file are not in the same order as in my second file. I did try your code but nothing happened, probably because of the order of the ids? I'm still trying to get it working.
 
Hi CaKiwi,

I tried it with a small set of actual data with the ids in the same order, and it seems to work only for the first id and ignore the rest, like this:

ID 1
START S216
DONE D125
No 250-305
No 144-146
@@
ID 2
START S325
DONE D340
@@
ID 7
START S587
DONE D600
@@

If possible, could you please briefly explain your code? I don't really understand it, especially flg and getline. I tried reading some websites but it is still not that clear. :( Thanks.
 
This might be a better approach. It assumes that the lines for each id are together in each file but does not rely on order. Run with

awk -f file.awk file2 file1

Code:
{
  if (FNR==NR)
  {
    if (k==0)
    {
      id = $NF
      l1[id]=$0
    }
    else if (k==1)
      l2[id]=$0
    else
      l3[id]=$0
    k++
    if (k==3) k=0
    next
  }
  id = $NF
  if (id!=svid)
  {
    print l1[id]
    print l2[id]
    print l3[id]
    svid=id
  }
  print
}
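
The `FNR==NR` test is what separates the two input files in the script above. A minimal sketch of the idiom, using two throwaway files (first.txt, second.txt) and a function name (show_files) invented for this illustration:

```shell
# Two small input files (names are made up for this sketch).
printf 'a\nb\n' > first.txt
printf 'c\nd\ne\n' > second.txt

show_files() {
  awk '{
    # NR counts lines across all inputs; FNR restarts at 1 for each file.
    # The two are equal only while the first file is being read.
    if (FNR == NR) print "first: " $0
    else           print "second: " $0
  }' first.txt second.txt
}

show_files
```

One caveat with this idiom: if the first file is empty, FNR equals NR for the second file as well, so the test no longer tells the files apart.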

CaKiwi
 
I tried the second approach. Some of the existing data in file 2 is gone, such as the @@ symbol and the ID, START and DONE lines for certain ids. I also have some info (comments) in file 2 before ID 1, and these comments are gone too.
 
Sorry, I thought the @@ lines were just something you put in to show there was more data. Try this.
Code:
{
  if (FNR==NR)
  {
    if (k==0)
    {
      id = $NF
      l1[id]=$0
    }
    else if (k==1)
      l2[id]=$0
    else if (k==2)   # changed
      l3[id]=$0
    k++
    if (k==4) k=0   # changed
    next
  }
  id = $NF
  if (id!=svid)
  {
    if (flg) print "@@"   # changed
    flg=1                 # changed
    print l1[id]
    print l2[id]
    print l3[id]
    svid=id
  }
  print
}
END {print "@@"}   # changed
I'll let you add code to handle the comments at the beginning of the second file.

CaKiwi
 
Hi,

I've been trying to make it work but it still fails. The output is a bit messed up, but I am still working on this.
 
Post sample input files, expected output and actual output and I (or someone) will help you further.

CaKiwi
 
Samples of the actual input for the 2 files and the expected output are attached. I put them in a file because posting them directly here would be a bit crowded, even though I already simplified them. Thanks.
 
 http://www.mediafire.com/?q3u0ijwa9ik2o6k
Maybe this time. This assumes a maximum of 12 items per id in file1. Run with
awk -f file.awk file1 file2
Code:
/;;/{print;next}
{
  if (FNR==NR)
  {
    id = $NF
    if (id!=svid)
    {
      k=0
      svid=id
    }
    sub(/ *Id_.*$/,"")
    if (k==0)
      l1[id]=$0
    else if (k==1)
      l2[id]=$0
    else if (k==2)
      l3[id]=$0
    else if (k==3)
      l4[id]=$0
    else if (k==4)
      l5[id]=$0
    else if (k==5)
      l6[id]=$0
    else if (k==6)
      l7[id]=$0
    else if (k==7)
      l8[id]=$0
    else if (k==8)
      l9[id]=$0
    else if (k==9)
      l10[id]=$0
    else if (k==10)
      l11[id]=$0
    else if (k==11)
      l12[id]=$0
    k++
    ln[id]=k
    next
  }
  if ($0 ~ /^ID/) id = $NF
  if ($0 ~ /^@@/)
  {
    if (ln[id]>0) print l1[id]
    if (ln[id]>1) print l2[id]
    if (ln[id]>2) print l3[id]
    if (ln[id]>3) print l4[id]
    if (ln[id]>4) print l5[id]
    if (ln[id]>5) print l6[id]
    if (ln[id]>6) print l7[id]
    if (ln[id]>7) print l8[id]
    if (ln[id]>8) print l9[id]
    if (ln[id]>9) print l10[id]
    if (ln[id]>10) print l11[id]
    if (ln[id]>11) print l12[id]
  }
  print
}

CaKiwi
 
Sorry, I forgot awk has 2 dimensional arrays. This is simpler.
Code:
/;;/{print;next}
{
  if (FNR==NR)
  {
    id = $NF
    if (id!=svid)
    {
      k=0
      svid=id
    }
    sub(/ *Id_.*$/,"")
    ll[id,k]=$0
    k++
    ln[id]=k
    next
  }
  if ($0 ~ /^ID/) id = $NF
  if ($0 ~ /^@@/)
  {
    for (j=0;j<ln[id];j++) print ll[id,j]
  }
  print
}
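
For reference, here is a self-contained sketch of the same idea run against the sample data from the first post. It is an adaptation, not the script above: it assumes the tag in file1 is written `Id N` (as in the first post) rather than `Id_...`, it leaves out the `;;` comment handling, and it keeps a per-id counter (cnt, a name chosen here) so the lines for an id do not have to be adjacent in file1. The START/DONE values for ID 2 are taken from the expected output in the first post. Note that `extra[id, n]` is a single string key that awk builds by joining the subscripts with SUBSEP.

```shell
# file1: the extra "No" lines, each tagged with an id (data from the first post).
cat > file1 <<'EOF'
No 250-305 Id 1
No 144-146 Id 1
No 189-190 Id 2
No 1904-1909 Id 2
No 1962-1968 Id 2
EOF

# file2: one record per id, closed by "@@" (values from the expected output).
cat > file2 <<'EOF'
ID 1
START S216
DONE D125
@@
ID 2
START S325
DONE D340
@@
EOF

merge_files() {
  awk '
    FNR == NR {                 # first file on the command line: file1
      id = $NF
      sub(/ *Id .*$/, "")       # drop the trailing " Id N" tag
      n = cnt[id]++             # per-id counter, so ids need not be grouped
      extra[id, n] = $0         # (id, n) is one string key joined by SUBSEP
      next
    }
    /^ID/ { id = $NF }          # remember which record we are in
    /^@@/ {                     # just before the terminator, emit stored lines
      for (j = 0; j < cnt[id]; j++) print extra[id, j]
    }
    { print }
  ' file1 file2
}

merge_files
```

On this sample the output should match the expected output in the first post: each record's `No` lines appear between its DONE line and its `@@` line.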

CaKiwi
 

I ran the latest code and it seems to work for certain data only, but I can see that it almost gets to the expected output. I will continue working on this and see how it goes. Thanks so much CaKiwi for your big help :)
 