Match by key: same code for files in the "same format", one works, one fails

Status
Not open for further replies.

will27

Technical User
Jun 13, 2007
23
US
Dear all,
I use the following code to join two tab-delimited files and print selected columns. In both files the key is in the 4th column. The code works for the following two files, which I'll call "wf1" and "wf2".
First the code:

awk -F"\t" '{while((getline<"'wf2'")>0) f2[$4]=$4;OFS="\t"} {if (f2[$4]) print f2[$4],$1,$4}' wf1

# wf1
meta weight data country desc
gold 1 1986 USA American Eagle
gold 1 1908 Austria-Hungary Franz Josef 100 Korona
silver 10 1981 USA ingot
gold 1 1984 Switzerland ingot
gold 1 1979 RSA Krugerrand
# wf2
meta weight data country desc
GOLD 1 1979 RSA Krugerrand
GOLD 1 1908 Austria-Hungary Franz Josef 100 Korona
SILVER 10 1981 USA ingot
GOLD 1 1984 Switzerland ingot
GOLD 1 1988 Canada Maple Leaf
PLATI 100 2000 PRC PANDA-MATCH
# the output
USA gold USA
Austria-Hungary gold Austria-Hungary
USA silver USA
Switzerland gold Switzerland
RSA gold RSA


However, it doesn't work for the following two files, which I'll call "ff1" and "ff2".

Code (for these two tab-delimited files the key is the first column in both, so $4 is changed to $1; I want all columns from both files, hence "f2[$1]=$0" and "print f2[$1],$0"):

awk -F"\t" '{while((getline<"'ff2'")>0) f2[$1]=$0;OFS="\t"} {if (f2[$1]) print f2[$1],$0}' ff1

# ff1
CAS NA 3
IBM 6 5
ATX 7 NA 6
XXX 8 NA
BLC NA
ACN 9 10 7
AGE 10 NA
SMC 11 12 8
CAS 12 13 9

# ff2
AGE 13 0
BLC 13 1
CAS 7 0
SMC 6 1
ATX 6 1

# output
ATX 6 1 ATX 6 1

The match fails and the output is weird. This has been driving me nuts; could you help me out?
Thank you

Warm Regards
Will



 
Is this what you wanted?
awk -F'\t' 'NR==FNR{f2[$1]=$0;next}$1 in f2{print f2[$1]"\t"$0}' ff2 ff1

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
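
A side note on why the two-file NR==FNR form is safer than the getline loop in the original post: a plain getline < file (with no variable) overwrites $0, so once the loop finishes, the first record of the main input has been replaced by the last line of the lookup file. The sketch below is only an illustration under assumptions: lookup.txt, main.txt and the scratch directory are made-up names, not files from this thread.

```shell
# Hypothetical demo files in a scratch directory (not the poster's data).
dir=$(mktemp -d)
printf 'A\t1\nB\t2\n' > "$dir/lookup.txt"
printf 'X\t9\nA\t8\n' > "$dir/main.txt"

# Plain getline overwrites $0: the first main record ends up as "B\t2".
clobbered=$(awk -F'\t' -v lk="$dir/lookup.txt" '
  { while ((getline < lk) > 0) seen[$1] = $0 }
  { print $1 }' "$dir/main.txt")

# getline into a variable ("line") leaves $0 of the main input untouched.
intact=$(awk -F'\t' -v lk="$dir/lookup.txt" '
  { while ((getline line < lk) > 0) { split(line, f, "\t"); seen[f[1]] = line } }
  { print $1 }' "$dir/main.txt")

printf 'plain getline prints:\n%s\n' "$clobbered"
printf 'getline into a variable prints:\n%s\n' "$intact"
rm -r "$dir"
```

With the plain form the first key printed is B (from the lookup file) rather than X. The same effect would account for the first line of the lookup file's tail showing up in the output in place of the first record of ff1.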
 
Thanks for your quick response, PHV.
However, it doesn't work. I also tried some variations of your code and got nowhere.
Maybe I should simply try Unix "join", though I would rather not have to sort the files.

will
 
"it won't work"
Here is the output with your sample:[tt]
CAS 7 0 CAS NA 3
ATX 6 1 ATX 7 NA 6
BLC 13 1 BLC NA
AGE 13 0 AGE 10 NA
SMC 6 1 SMC 11 12 8
CAS 7 0 CAS 12 13 9
[/tt]So, what's wrong ?
 
Hi, PHV
I cut and pasted the code; the problem is that there is no output at all!
When I apply the same code to "wf1" and "wf2", the output is as follows:

meta weight data country desc meta weight data country desc

Only the header line, nothing else.
I do think your code is right. I checked every possible cause within reach of my limited knowledge, including whether the FS is a tab, and still get no output.
Can you suggest any likely reason for this problem?

Thank you and regards
will
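
One possible cause worth ruling out (an assumption; nothing in this thread confirms it) is that the files do not actually contain tab characters. Under -F'\t' a line without tabs is a single field, so $1 is the whole line and keys match only when entire lines are identical. A minimal sketch for checking this, using made-up file names tabs.txt and spaces.txt:

```shell
# Hypothetical sample files: one with real tabs, one with spaces only.
dir=$(mktemp -d)
printf 'ATX\t6\t1\n' > "$dir/tabs.txt"
printf 'ATX 6 1\n'   > "$dir/spaces.txt"

# Count the fields awk sees on each line when the separator is a tab.
nf_tabs=$(awk -F'\t' '{ print NF }' "$dir/tabs.txt")
nf_spaces=$(awk -F'\t' '{ print NF }' "$dir/spaces.txt")
echo "tabs.txt: $nf_tabs field(s), spaces.txt: $nf_spaces field(s)"

# od -c shows tabs as \t and stray carriage returns as \r.
od -c "$dir/tabs.txt"
rm -r "$dir"
```

Running the same two checks on ff1 and ff2 would show whether the separators are what the -F option expects.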
 
"only title, nothing else"
That is expected, due to the case difference (gold vs. GOLD).
What about this?
awk 'NR==FNR{f2[toupper($1)]=$0;next}toupper($1) in f2{print f2[toupper($1)]"\t"$0}' wf2 wf1

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
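
The case issue can be reproduced in isolation. This is a sketch assuming tab-separated copies of one sample row each; the .txt file names and the scratch directory are made up for the demo. Without normalising, gold and GOLD are different subscripts; with toupper applied when storing and when looking up, they match.

```shell
# Hypothetical one-row copies of the sample data, with real tabs.
dir=$(mktemp -d)
printf 'GOLD\t1\t1979\tRSA\tKrugerrand\n' > "$dir/wf2.txt"
printf 'gold\t1\t1979\tRSA\tKrugerrand\n' > "$dir/wf1.txt"

# Exact-match join on column 1: "gold" is not "GOLD", so nothing prints.
exact=$(awk -F'\t' 'NR==FNR { f2[$1] = $0; next }
  $1 in f2 { print f2[$1] "\t" $0 }' "$dir/wf2.txt" "$dir/wf1.txt")

# Same join with the key upper-cased on both sides.
folded=$(awk -F'\t' 'NR==FNR { f2[toupper($1)] = $0; next }
  toupper($1) in f2 { print f2[toupper($1)] "\t" $0 }' "$dir/wf2.txt" "$dir/wf1.txt")

printf 'exact:  [%s]\n' "$exact"
printf 'folded: [%s]\n' "$folded"
rm -r "$dir"
```

Note that keying on the first column of the full wf1/wf2 samples is many-to-one: each later GOLD row in wf2 overwrites the earlier one in f2.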
 
Hi, PH
This time it works. Thanks for all your help.

One last question, which may have something to do with my own code.
What is the default way an array subscripts itself? I printed out all the subscripts, and they look the same as the key in every respect (only the unique elements of the key, of course).
So how can the matching possibly malfunction without transforming them with toupper or tolower, as in this case?

regards
will
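
On the subscript question: awk array subscripts are strings and are compared exactly, so gold and GOLD are distinct keys even though they look alike at a glance. A related pitfall in the original code is testing if (f2[k]) instead of k in f2: the value test is false when the stored value is empty or 0, and merely referencing f2[k] creates the element. The sketch below uses made-up keys and a hypothetical helper yn() purely for illustration.

```shell
result=$(awk '
# Hypothetical helper: turn a condition into "yes" or "no" for printing.
function yn(cond) { return cond ? "yes" : "no" }
BEGIN {
  f2["GOLD"] = 0    # the stored value happens to be 0

  # Subscript lookup is an exact, case-sensitive string comparison.
  print "gold in f2: " yn(("gold" in f2))
  print "GOLD in f2: " yn(("GOLD" in f2))

  # Testing the value instead of membership: 0 counts as false.
  print "value test on GOLD: " yn(f2["GOLD"])

  # Referencing a missing subscript silently creates an empty element.
  if (f2["SILVER"]) print "not reached"
  print "SILVER in f2 after reference: " yn(("SILVER" in f2))
}')
printf '%s\n' "$result"
```

This is why PHV's versions use the "in" operator for the membership test and normalise the case of the key before storing and before looking it up.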
 