
Reading the third file and sub 3

Status
Not open for further replies.

frangac

Technical User
Feb 8, 2004
163
ZA
Hi All,

Below is the code I am working on, but I am stuck trying to read the third file.
What I would like to do is take the value marked below (this value------->) and substitute it into the third file.
Can anyone assist me with this issue?

awk 'FNR==NR {
a[$3,$4,$5,$6]=$2
next
}
{
idx=$1 SUBSEP $5 SUBSEP $11 SUBSEP $12
if ( idx in a) {
$2 = a[idx]
this value-------> print $2
}
}' database $1.temp

Many Thanks
Chris
 
Not 100% sure I understand what you want, I'm guessing you just want to send all of the output (from print $2) to a third file? Why not just use:

Code:
awk  'FNR==NR {
                      a[$3,$4,$5,$6]=$2
                      next
                      }
                      {
                      idx=$1 SUBSEP $5 SUBSEP $11 SUBSEP $12
                      if ( idx in a) {
                      $2 = a[idx] 
                              print $2
                   }
                   }' database $1.temp > thirdfile

Annihilannic.
 
Hi Annihilannic,

Thanks for your response.

What I am trying to achieve is the following:

When the result is found (the result ===>), read the third file and substitute the result into it, e.g.



After getting the common line the result is "019044"
File2= $1.temp
269 20060124100409 0 381 0827733949 E Gd S -1 -1 S4+HAB JNLHAB

Now in the third file
Search for "/^128 A/||/^248 A/||/^129 A/{a=NR}", go to the next line (opt=1), which contains 269, and replace it
with "019044"; then do the same thing at each following match, etc.


awk -v File=$1 -v opt=1 -v Num=6 -v Num2=$3 -v Num3=D 'FNR==NR {
a[$3,$4,$5,$6]=$2
next
}
{
idx=$1 SUBSEP $5 SUBSEP $11 SUBSEP $12
if ( idx in a) {
the result ===> $2 = a[idx]# OFS $1
while ( getline < File > 0 )
/^128 A/||/^248 A/||/^129 A/{a=NR}
a&&NR==(a+opt){
{$1=slen=length($2);$2=$2;$3="A"}
}
# print idx,$2
}
}' database $1.temp


Hope that makes sense
Many Thanks
Chris
 
It's difficult to debug your code without sample data, however I found the layout very confusing and have changed it a little.

Code:
awk -v File=$1 -v opt=1 -v Num=6 -v Num2=$3 -v Num3=D  '
FNR==NR {
        a[$3,$4,$5,$6]=$2
        next
}
{
        idx=$1 SUBSEP $5 SUBSEP $11 SUBSEP $12
        if ( idx in a) {
                #the result ===>
                $2 = a[idx]# OFS $1
                while ( getline < File ) {
                        if ($0 ~ /^(128|248|129) A/) { a=NR }
                        if (a&&NR==(a+opt)) {
                                $1=slen=length($2)
                                $2=$2
                                $3="A"
                        }
                }
                close(File)
                # print idx,$2
        }
}' database $1.temp

I don't think you can use the implicit /re/ { commands } syntax in a while (getline) loop, you have to actually put ifs in (correct me if I'm wrong awkophiles). Also if you are continually re-reading the same file you have to close it each time to make it read from the beginning the next time. No need for the > 0 after the getline, it is implicit.

Hopefully that will get you heading in the right direction, otherwise please post some sample data from both files and an example of the command-line parameters (i.e. the values of $1 and $3 supplied to the wrapper script).
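
The point about close() can be checked in isolation. The following is only a minimal sketch, assuming a throwaway three-line file created with mktemp (not one of the files from this thread): each pass counts the lines it can read, and close() is what lets the second pass start again from the first line.

```shell
# Throwaway demo file; its name and contents are made up for this illustration.
demo=$(mktemp)
printf 'one\ntwo\nthree\n' > "$demo"

reread=$(awk -v File="$demo" 'BEGIN {
    for (pass = 1; pass <= 2; pass++) {
        n = 0
        # "> 0" stops the loop at end of file (0) and on error (-1)
        while ((getline line < File) > 0) n++
        print "pass " pass ": " n " lines"
        close(File)   # without this, pass 2 would read 0 lines
    }
}')
echo "$reread"
rm -f "$demo"
```

Both passes should report 3 lines. If the close(File) call is removed, the second pass finds the file already at end of file.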

Annihilannic.
 
"No need for the > 0 after the getline"
WRONG: getline returns -1 on error!

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
Well, I didn't know that! I always assumed that it would evaluate to 'false' for any return value that was not positive. Thanks for the correction.
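
The difference is easy to see in a small test. This is a minimal sketch, assuming a path (/nonexistent/no_such_file.txt, hypothetical) that does not exist on the machine: getline returns -1, and since awk treats any non-zero number as true, a bare while (getline < File) would loop forever on an unreadable file.

```shell
rc=$(awk 'BEGIN {
    # The path below is hypothetical and assumed not to exist.
    rc = (getline line < "/nonexistent/no_such_file.txt")
    print rc
    if (rc)     print "treated as true without > 0"
    if (rc > 0) print "never printed"
}')
echo "$rc"
```

Comparing the return value with 0 is what makes the loop stop both at end of file and on error.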

Annihilannic.
 
Hi Annihilannic, PHV,

Thanks a lot for your response.

Annihilannic
I receive the following error message:

+ awk -v File=$1 -v opt=1 -v Num=6 -v Num2= -v Num3=D
FNR==NR {
a[$3,$4,$5,$6]=$2
next
}
{idx=$1 SUBSEP $5 SUBSEP $11 SUBSEP $12
if ( idx in a) {
$2 = a[idx]# OFS $1
while ( getline < File ) {
if ($0 ~ /^(128|248|129) A/) { a=NR }
if (a&&NR==(a+opt)) {
$1=slen=length($2)
$2=$2
$3="A"
}
}
close(File)
}
} database File_1.temp
awk: Cannot read the value of a. It is an array name.
The input line number is 1. The file is $2.
The source line number is 1.


Many Thanks
Chris
 
You should probably use a different variable name later on (e.g. b=NR and if (b&&NR==(b+opt))).
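
A minimal sketch of that renaming, assuming three lines in the style of the third file and a made-up array key "k": a stays an array holding the looked-up value, while the scalar b remembers the line number of the marker.

```shell
# Three lines in the style of the third file, written to a throwaway file.
f3=$(mktemp)
printf '128 A\n3 269 A\n1 2 A\n' > "$f3"

out=$(awk -v opt=1 '
BEGIN { a["k"] = "019044" }          # a is only ever used as an array ("k" is a made-up key)
/^(128|248|129) A/ { b = NR }        # b, not a, remembers the marker line
b && NR == b + opt {                 # the line opt lines after the marker
    $1 = length(a["k"]); $2 = a["k"]; $3 = "A"
}
{ print }' "$f3")
echo "$out"
rm -f "$f3"
```

With the sample lines above, the second line becomes "6 019044 A" and the other two lines pass through unchanged.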

Annihilannic.
 
Hi Annihilannic, PHV,

Again, thanks for your assistance.

It seems that it is not doing the substitution. I added a print, and the output is the same as the original file. It finds that specific line and does nothing.

while ( getline < File ) {
if ($0 ~ /^(128|248|129) A/) { a=NR }
if (a&&NR==(a+opt)) {
$1=slen=length($2)
$2=$2
$3="A"
}
print $0
}
close(File)
}
} database File_1.temp > $1_Changed

What am I doing wrong?

Thanks
Chris
 
Not really sure what you're trying to do here... why not just use the sub() function?

$1=slen=length($2)
$2=$2
$3="A"


Annihilannic.
 
Hi Annihilannic,

Thanks

"Why not just use the sub() function?" Previously it was PHV who explained to me that by using $1=slen=length($2);$2=$2;$3="A" you are actually using the sub function. My main reason for doing this is that the one file
File2= $1.temp
269 20060124100409 0 381 0827733949 E Gd S -1 -1 S4+HAB JNLHAB
is in text, as is the file database, which I am referencing to extract a[idx]. I then use file3 to search for the pattern $0 ~ /^128/ or /^129/, read the next line (opt=1), which looks like "3 269 A", and substitute $1 with the length of a[idx] and $2 with a[idx], followed by the constant "A". The result will look like "6 019044 A", and then the whole file is saved with its changes.
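
As a side note, assigning to fields does not call sub(); awk simply rebuilds $0 from the fields, joined by OFS. Both routes can produce the same line, though. A minimal sketch using the sample line "3 269 A" and the value 019044 from this thread; the variable name val and the regular expression are made up for the comparison:

```shell
line='3 269 A'

# Field assignment: awk rebuilds $0 from the modified fields.
by_fields=$(echo "$line" | awk 'BEGIN { val = "019044" }
{ $1 = length(val); $2 = val; $3 = "A"; print }')

# sub(): edits the text of $0 directly with a regular expression.
by_sub=$(echo "$line" | awk 'BEGIN { val = "019044" }
{ sub(/^[0-9]+ [0-9]+/, length(val) " " val); print }')

echo "$by_fields"
echo "$by_sub"
```

Field assignment is the simpler choice here because the target values sit in fixed field positions.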

FILE 3
======
<AT 1.1 LV 1.1 NT 3.4.0.0 >O
b
J
2 Q
3 CDR b
J
214 C
S7 A
1 G D
1 D
0 D
0 D
0 D
0 D
0 D
0 D
0 D
128 A
3 269 A
1 2 A
1 6 A
1 9 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 D
0 D
0 D
0 A
0 M
1138089849 0 D etc......


Hope that helps

Thanks
Chris
 

Hi Again

Here are my three files.

First file
==========
269 20060124100409 0 381 0827733949 E Gd S -1 -1 S4+HAB JNLHAB


database
=========
xxxxxxxxxxxxxxx 019044 269 0827733949 S4+HAB JNLHAB


FILE 3
======
<AT 1.1 LV 1.1 NT 3.4.0.0 >O
b
J
2 Q
3 CDR b
J
214 C
S7 A
1 G D
1 D
0 D
0 D
0 D
0 D
0 D
0 D
0 D
128 A
3 269 A
1 2 A
1 6 A
1 9 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 D
0 D
0 D
0 A
0 M
1138089849 0 D etc......


and expected result

FILE 3
======
<AT 1.1 LV 1.1 NT 3.4.0.0 >O
b
J
2 Q
3 CDR b
J
214 C
S7 A
1 G D
1 D
0 D
0 D
0 D
0 D
0 D
0 D
0 D
128 A
6 019044 A
1 2 A
1 6 A
1 9 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 A
0 D
0 D
0 D
0 A
0 M
1138089849 0 D etc......


Hope this will help

Thanks
Chris

 
That makes it a bit clearer, try this perhaps:

Code:
FNR==NR {
        a[$3,$4,$5,$6]=$2
        next
}
{
        idx=$1 SUBSEP $5 SUBSEP $11 SUBSEP $12
        if ( idx in a) {
                #the result ===>
                $2 = a[idx]# OFS $1
                while ( getline < File > 0 ) {
                        print
                        if ($0 ~ /^(128|248|129) A/) {
                                # skip over 'opt' lines
                                for (i=opt; i>1 && getline < File > 0; i--) { print }
                                if (getline < File > 0) {
                                        # change value
                                        $1=length(a[idx])
                                        $2=a[idx]
                                        $3="A"
                                        print
                                }
                        }
                }
                close(File)
        }
}'

Annihilannic.
 
Hi Annihilannic,

Thanks. This is getting better. The result is correct, but I have noticed that because File_1.temp
has 12 entries, it substitutes each number 12 times in File 3. In other words, it does the substitution for the first 12 numbers and then repeats itself another 12 times until the last number. I have put in a print statement to show you what I mean.

019049 20060124100409 0 381 12345 E Gd S -1 -1 S4+HAB JNLHAB
019049 20060124100515 0 72 029100 E Gd S -1 -1 S4+HAB JRBHAB
019049 20060124100741 0 274 0005703728 E Gd S -1 -1 S4+HAB JNLHAB
019049 20060124100815 0 63 0004000051 E Gd S -1 -1 S4+HAB JRBHAB
019049 20060124100832 0 168 0002871340 E Gd S -1 -1 S4+HAB JNLHAB
019049 20060124101219 0 122 0000701000 E Gd S -1 -1 S4+HAB JRBHAB
019049 20060124101506 0 18 0000522096 E Gd S -1 -1 S4+HAB JGMHAB
019049 20060124101924 0 67 0000806222 E Gd S -1 -1 S4+HAB JRBHAB
019049 20060124102041 0 112 0000001000 E Gd S -1 -1 S4+HAB JRBHAB
019049 20060124102119 0 349 0000000113 E Gd S -1 -1 S4+HAB JDFHAB
019049 20060124102248 0 41 0000001174 E Gd S -1 -1 S4+HAB JNLHAB
0190260 20060124104613 0 55 0000000399 E Gd S -1 -1 S4+HAB JNLHAB
0190260 20060124104707 0 24 0000078408 E Gd S -1 -1 S4+HAB JDFHAB
0190260 20060124105416 0 76 0114216773 E Gd S -1 -1 S4+HAB JGMHAB
0190260 20060124100409 0 381 12345 E Gd S -1 -1 S4+HAB JNLHAB
0190260 20060124100515 0 72 029100 E Gd S -1 -1 S4+HAB JRBHAB
0190260 20060124100741 0 274 0005703728 E Gd S -1 -1 S4+HAB JNLHAB


How can I prevent this from happening? You deserve a star.

Many Thanks Once Again
Chris

 
Hi Annihilannic,PHV

Please assist. This has me going around in circles.

Many Thanks
Chris
 
I couldn't really make sense of the output to be honest... what is the significance of the numbers in bold? You said "it sub 12 times of each number", however you have highlighted only 6 lines in bold? And 019049 appears 11, not 12 times...

Is there too much data in File_1.temp and database to post here?

Where exactly did you put in that print statement?

Annihilannic.
 
Hi Annihilannic,

Thanks

File_1.temp and database are not big files. File_1 is the same as File. The only difference is that I have extracted the necessary info as normal text with a tool, and File is as shown above. Therefore, if File_1 or File has 11 entries, the while loop will substitute the 11 entries, and when the while loop picks up the next a[idx] it will repeat the substitution on the same numbers, and so on. What I am expecting is that when the first a[idx] is found, it reads File, searches for the pattern, jumps one line and then substitutes. It should then wait for the next a[idx], continue the pattern search in File, substitute, and so on.

019044 1st find (between file_1 database) for the first sub
019044
0190260
019049
019044
0190260
019044
0190260
0190260
019044
0190260
0190260
0190260
019044
0190260
0190260
019044
0190260
0190260
019044
0190260
0190260
0190260
0190260
0190260
0190260
0190260
0190260

If this does not make sense I will have to post the files.

Many Thanks
Chris
 
Hi Annihilannic,

Is the explanation clear enough, or should I try to post all the files?


Many Thanks
Chris
 
Code:
# The first file on the command line has 6 fields.
# Reading first file?
ARGV[1]==FILENAME {
  a[$3,$4,$5,$6] = $2
  next
}

# The 2nd file on the command line has 12 fields.
ARGV[2]==FILENAME {
  idx = $1 SUBSEP $5 SUBSEP $11 SUBSEP $12
  if ( idx in a)
    list[ ++count ] = a[ idx ]
  next
}

# We're reading the 3rd file.

FNR == 1  { count = 0 }

/^(128|248|129) A/  {  target = FNR + opt }

FNR == target {
  replacement = list[ ++count ]
  $1 = length( replacement )
  $2 = replacement
}

8   # any non-zero constant pattern is true, so every line is printed
 
Hi Futurlet,

Welcome back and thanks. I get the following error message:

awk: Cannot find or open file reading.
The source line number is 22.
./.IAN_S4HAB_main2[30]: FNR: not found
./.IAN_S4HAB_main2[30]: syntax error at line 32 : `(' unexpected
Exit 2

What am I doing wrong?

Many Thanks
Chris
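
A likely cause, though it cannot be confirmed without seeing the wrapper script: if the program was pasted between single quotes, the apostrophe in the comment "We're reading the 3rd file." ends the quoted string. The shell would then pass "reading" to awk as a file name and try to run the remaining lines itself, which fits "Cannot find or open file reading" and "FNR: not found". One way around it is to keep the awk program in its own file and run it with awk -f. The sketch below assumes hypothetical file names (replace.awk, File3) and uses the sample lines posted earlier in the thread, with the third file cut down to three lines.

```shell
# Hypothetical file names in a scratch directory; data lines come from the posted samples.
work=$(mktemp -d)
echo 'xxxxxxxxxxxxxxx 019044 269 0827733949 S4+HAB JNLHAB' > "$work/database"
echo '269 20060124100409 0 381 0827733949 E Gd S -1 -1 S4+HAB JNLHAB' > "$work/File_1.temp"
printf '128 A\n3 269 A\n1 2 A\n' > "$work/File3"

# Quoted heredoc: the apostrophe in the comment is harmless in a separate file.
cat > "$work/replace.awk" <<'EOF'
ARGV[1]==FILENAME { a[$3,$4,$5,$6] = $2; next }
ARGV[2]==FILENAME {
  idx = $1 SUBSEP $5 SUBSEP $11 SUBSEP $12
  if (idx in a) list[++count] = a[idx]
  next
}
# We're reading the 3rd file.
FNR == 1 { count = 0 }
/^(128|248|129) A/ { target = FNR + opt }
FNR == target {
  replacement = list[++count]
  $1 = length(replacement)
  $2 = replacement
}
8
EOF

result=$(awk -v opt=1 -f "$work/replace.awk" "$work/database" "$work/File_1.temp" "$work/File3")
echo "$result"
rm -rf "$work"
```

If the program has to stay inline in the shell script, rewording the comment so that it contains no apostrophe (for example "We are reading the 3rd file") should have the same effect.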
 