Hey all,
I have been struggling with a problem for quite a while; any hints that liberate me from this fight will be highly appreciated.
First, there is a file "pool". Its first column is the value that the keys will be matched against:
WorldComInc WorldCom, Inc.
Second, there is a file "key_trace". Its first three columns are the keys used for the match:
WORLDCOMINCGA WORLDCOMINC WORLDCOM 11088 80 20000831 WCOM WORLDCOM INC GA NEW 98157D10
WORLDCOMINCGA WORLDCOMINC WORLDCOM 11042 61 19980831 WCOM WORLDCOM INC GA 98157D10
Here is the idea:
First, use "WORLDCOMINCGA" to match "WorldComInc"; if it matches, done.
If not matched (this is the case in this example), use "WORLDCOMINC" to match "WorldComInc"; if it matches, done.
If not matched, finally use "WORLDCOM" to match "WorldComInc"; if it matches, done.
If all three matching attempts fail, print the line from file "pool" followed by empty filler fields in place of the "key_trace" columns.
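The steps above could be sketched roughly as follows. This is only a sketch of the intended logic, not working production code: it assumes both files are tab-separated, it reads "match" as "the key is a substring of the pool name, ignoring case", and the names rows, key, nf, found and the extra "NoSuchCo" pool line are made up for illustration.

```shell
# Sample data from the post; tab-separated fields are an assumption.
tmp=$(mktemp -d)
printf 'WorldComInc\tWorldCom, Inc.\n'  > "$tmp/pool"
printf 'NoSuchCo\tNo Such Co.\n'       >> "$tmp/pool"    # made-up line with no match
printf 'WORLDCOMINCGA\tWORLDCOMINC\tWORLDCOM\t11088\t80\t20000831\tWCOM\tWORLDCOM INC GA NEW\t98157D10\n'  > "$tmp/key_trace"
printf 'WORLDCOMINCGA\tWORLDCOMINC\tWORLDCOM\t11042\t61\t19980831\tWCOM\tWORLDCOM INC GA\t98157D10\n'     >> "$tmp/key_trace"

result=$(awk -F '\t' '
FILENAME == ARGV[1] {                 # first file: key_trace
    rows[++n] = $0                    # keep every line, even when the keys repeat
    for (j = 1; j <= 3; j++) key[n, j] = toupper($j)
    nf = NF                           # field count, used for the filler
    next
}
{                                     # second file: pool
    name = toupper($1); found = 0
    # Try key 1 against all rows, then key 2, then key 3;
    # stop after the first level that produces a match.
    for (level = 1; level <= 3 && !found; level++)
        for (i = 1; i <= n; i++)
            if (key[i, level] != "" && index(name, key[i, level])) {
                print $0 "\t" rows[i]
                found = 1
            }
    if (!found) {                     # no key matched: pad with empty fields
        printf "%s", $0
        for (i = 1; i <= nf; i++) printf "\t"
        printf "\n"
    }
}' "$tmp/key_trace" "$tmp/pool")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample data, the first key fails and the second key matches, so both "key_trace" lines are joined to the WorldComInc line, and the made-up NoSuchCo line is printed with empty filler fields.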
This is the problematic code (NOTE: it is de facto a one-liner, expanded here for easy reading):
Code:
awk -F "\t" 'FILENAME == ARGV[1] { keys[$1,$2,$3] = $0; next }
{
    for (key in keys) {
        split(key, con, "\034")
        if (index(toupper($1), toupper(con[1]))) { print $0, "\t", keys[key]; x++ }
    }
    if (x == 0) {
        if (index(toupper($1), toupper(con[2]))) { print $0, "\t", keys[key]; y++ }
    }
    if (x == 0 && y == 0) {
        if (index(toupper($1), toupper(con[3]))) { print $0, "\t", keys[key]; z++ }
    }
    if (x == 0 && y == 0 && z == 0) {
        printf "%s", $0
        for (i = 1; i <= 10; i++) { printf "%s", (i < 10 ? "\t" : "\n") }   # 10 = NF of file key_trace + 1
    }
    x = 0; y = 0; z = 0
}' key_trace pool > match_companyheader_man
The expected result is two lines:
WorldComInc WorldCom, Inc. WORLDCOMINCGA WORLDCOMINC WORLDCOM 11088 80 20000831 WCOM WORLDCOM INC GA NEW 98157D10
WorldComInc WorldCom, Inc. WORLDCOMINCGA WORLDCOMINC WORLDCOM 11042 61 19980831 WCOM WORLDCOM INC GA 98157D10
Problems:
1. Only one line is printed:
WorldComInc WorldCom, Inc. WORLDCOMINCGA WORLDCOMINC WORLDCOM 11042 61 19980831 WCOM WORLDCOM INC GA 98157D10
2. If file "key_trace" is very large, the second and third keys never seem to be used; in other words, the second and third "if" never seem to be executed.
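A reduced test case that may be relevant to problem 1, assuming the lines in "key_trace" can share the same first three fields (as the two sample lines do): an awk array holds one value per subscript, so a second assignment with the same subscript replaces the first. The input lines here are made up for illustration.

```shell
# Two made-up input lines with identical first three fields.
result=$(printf 'A\tB\tC\tfirst\nA\tB\tC\tsecond\n' |
awk -F '\t' '
{ keys[$1,$2,$3] = $0 }      # same subscript twice: the second line overwrites the first
END {
    for (k in keys) { count++; last = keys[k] }
    print count, last        # number of stored entries, then the surviving line
}')

printf '%s\n' "$result"
```

Only one entry survives, and it is the later line, which fits the single 11042 line in the output above.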
Please let me know why the code above does not do what I intend.
Thank you in advance,
Will