Help with data lookup table

Status
Not open for further replies.

learningawk

Technical User
Oct 15, 2002
36
US
Hi,
I'm new to awk and programming. Is there an easy way to compare two files, where the first is a lookup table of values and the second contains similar values to match against it, and then grab two values from the lookup file and store them in file 2 for further processing?

My first data look up file is in the following format:

XXXX YYYY AA BB 1 1
XXXX YYYY AA BB 1 2
XXXX YYYY AA BB 1 3
XXXX YYYY AA BB 1 4
XXXX YYYY AA BB 1 5
XXXX YYYY DD CC 5 1
XXXX YYYY DD CC 5 2
XXXX YYYY DD CC 5 3
XXXX YYYY DD CC 5 4
XXXX YYYY DD CC 5 5
XXXX YYYY EE FF 4 1
XXXX YYYY EE FF 4 2
XXXX YYYY EE FF 4 3
XXXX YYYY EE FF 4 4
XXXX YYYY EE FF 4 5
XXXX YYYY EE FF 4 6

It consists of 3 groups of data for a specific entity.

My 2nd file looks like:

zzzzz,zzzzz,zzzzz,(NUMEROUS FIELDS),AA,BB,1,VAR1,VAR2,VAR3,VAR4
zzzzz,zzzzz,zzzzz,(NUMEROUS FIELDS),DD,CC,5,VAR1,VAR2,VAR3,VAR4
zzzzz,zzzzz,zzzzz,(NUMEROUS FIELDS),EE,FF,4,VAR1,VAR2,VAR3,VAR4

I would like to compare file 2 to file 1 and, once a match is found between the file 2 fields AA,BB,1 and the corresponding values in file 1, retrieve XXXX and YYYY from file 1 and store them in file 2. These values will then be used for further processing.

I would also like the compare process to check whether the lookup table has more than 5 points per group (as in the last records of file 1) and, if so, return some sort of alert that this is an irregular match.

Thank you for helping on my problem.
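
The "more than 5 points per group" check described above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: it assumes the group is identified by fields 3 to 5 of the lookup file, and the variable names `max`, `count` and `alerts`, the alert wording, and the shortened sample data are all made up for the example.

```sh
# Sample lookup rows in the file 1 layout; the EE FF 4 group has six points.
alerts=$(printf '%s\n' \
    'XXXX YYYY AA BB 1 1' 'XXXX YYYY AA BB 1 2' 'XXXX YYYY AA BB 1 3' \
    'XXXX YYYY AA BB 1 4' 'XXXX YYYY AA BB 1 5' \
    'XXXX YYYY EE FF 4 1' 'XXXX YYYY EE FF 4 2' 'XXXX YYYY EE FF 4 3' \
    'XXXX YYYY EE FF 4 4' 'XXXX YYYY EE FF 4 5' 'XXXX YYYY EE FF 4 6' |
awk -v max=5 '
{ count[$3 "," $4 "," $5]++ }    # one counter per group (assumed: fields 3-5)
END {
    # Report every group that has more points than the allowed maximum
    for (g in count)
        if (count[g] > max)
            print "ALERT: group " g " has " count[g] " points (max " max ")"
}')
echo "$alerts"
```

With a real lookup file, the same awk program can read the file directly instead of the piped sample rows.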
 
The only problem I see with Vlad's script is that it locates the matching fields by counting back from the end of the record, whereas in a later post you said they are the fixed fields $10, $11, $12 and $13. If you change

idx=$(NF-7) $(NF-6) $(NF-5) $(NF-4)

to

idx=$10 $11 $12 $13

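it should work. Another problem with his approach is that concatenating the fields without a separator can produce false matches depending on your input data, e.g. AA,BB,25,2 would match AA,BB,2,52, because both concatenate to AABB252. Here's my adaptation of his script using multi-dimensional arrays, which join the subscripts with SUBSEP so the keys stay distinct.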

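BEGIN {
    FS1 = " "    # file1 (lookup table) is whitespace-separated
    FS2 = ","    # file2 (main input) is comma-separated

    # Lookup file name; override with -v file1=NAME on the command line
    if (!file1) file1 = "file1"

    # Load the lookup table: key on fields 3-6, store fields 1 and 2
    FS = FS1
    while ((getline < file1) > 0) {
        a[$3,$4,$5,$6] = $1 FS2 $2
    }
    close(file1)
    FS = FS2
}

{
    # Append the stored values when fields 10-13 match a lookup key
    if (($10,$11,$12,$13) in a) print $0 FS2 a[$10,$11,$12,$13]
}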
CaKiwi
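
To illustrate the false-match point, here is a minimal sketch that feeds the two example records through both key styles. It is a simplified demonstration rather than part of the posted script: the four fields sit in $1 to $4 instead of $10 to $13, and the array names `seen_concat` and `seen_sub` and the shell variable `demo` are hypothetical.

```sh
# Compare the two key styles on the pair of records from the post above.
demo=$(printf 'AA,BB,25,2\nAA,BB,2,52\n' | awk -F, '
{
    concat = $1 $2 $3 $4          # plain concatenation, no separator
    if (concat in seen_concat) print "concatenated key collides: " concat
    seen_concat[concat] = 1

    # Comma-separated subscripts are joined with SUBSEP, so they stay distinct
    if (($1,$2,$3,$4) in seen_sub) print "SUBSEP key collides"
    seen_sub[$1,$2,$3,$4] = 1
}
END { print NR " records read" }')
echo "$demo"
```

Only the concatenated key reports a collision (both records become AABB252); the SUBSEP-joined subscripts keep the two records apart.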
 
CaKiwi's script worked using GAWK

Thanks to everyone who helped on this and sorry I didn't post the exact data file in the beginning.

 
CaKiwi,
thanks for the help - I've been busy the whole day!

Appreciated it!

vlad vlad
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+
 