Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Compare two fields and create a third

Status
Not open for further replies.

mrimagepueblo

Programmer
Dec 15, 2003
52
US
I have two files that I would like to lookup id's for and add their corresponding values to the end of a master file. I can do this in Microsoft Access, but don't really want to.

The office file has a structure of Office ID|Office Name|Address|City|... etc.
The agent file has a structure of Office ID|Agent ID|First Name|Last Name|Address|City|... etc.

The office ids are unique.
The agent ids are not unique numbers

I got a script from Annihilannic a long time ago but it doesn't work with this particular situation.

Masterfile.txt (these field positions are not absolute)
xxx|xxx|xxx|123|xxx|xxx|567|xxx|xxx|
xxx|xxx|xxx|999|xxx|xxx|567|xxx|xxx|

Sample data for office (office.txt)
Office ID|Office Name|Office Address|...
123|Starbucks|123 Anywhere St|....
999|McDonalds|456 Main st|...

Sample data for agent (agent.txt)
Office ID|Agent ID|First Name|Last Name|...
123|567|Tom|Sawyer|...
999|567|Huck|Finn|...

here is what I have been using to match agent id's and appending the actual agent name to the end of the file.

awk -F'|' '
FILENAME==ARGV[1] { firstname[$1]=$2 }
FILENAME==ARGV[2] { print $0 firstname[$14] "|" }
' agent.txt masterfile.txt > masterfile_temp.txt

That statement works perfect when the agent ids are unique but put the first agent id it comes across in the file, ignoring any other matches

So my new output would be.
xxx|xxx|xxx|xxx|Starbucks|Tom|Sawyer|
xxx|xxx|xxx|xxx|McDonalds|Huck|Finn|

Any help would be greatly appreciated.




 
I presume the agent IDs are unique for each office IDs, in which case you could use both of the keys to your arrays, e.g.

Code:
 FILENAME==ARGV[1] { firstname[$1,$2] = $3 }

I'm a bit confused though because the field numbers in the sample code you posted don't correspond with the sample data.

Annihilannic
[small]tgmlify - code syntax highlighting for your tek-tips posts[/small]
 
you are correct, the agent id's are unique for each office id.

the firstname[$14] is the actual position in the masterfile of the agent id. sorry not like in the example
this is what I came up with but the field is empty in mastefile_temp.txt
basically I'm just changing the array built up based on your example

# insert First Name
awk -F'|' '
FILENAME==ARGV[1] { firstname[$1,$2] = $3 }
FILENAME==ARGV[2] { print $0 firstname[$14] "|" }
' agent.txt mastefile.txt > masterfile_temp.txt
 
Since you are now creating an array with multiple indices, you also need to look up the array using multiple indices. I'm assuming masterfield has the office id in there somewhere as well, let's say field $13 for example, in which case you need to use print $0 firstname[$13,$14].

Annihilannic
[small]tgmlify - code syntax highlighting for your tek-tips posts[/small]
 
that got it thank you. I really wish I could wrap my head around arrays for things like this. Do you have a url or a book you could recommend, preferably a dummies version. thank you very much

here is what I did

# insert First Name
awk -F'|' '
FILENAME==ARGV[1] { firstname[$1,$2] = $3 }
FILENAME==ARGV[2] { print $0 firstname[$14,$15] "|" }
' agent.txt masterfile.txt > masterfile_temp.txt
mv -f masterfile_temp.txt masterfile.txt
#
# insert Last Name
awk -F'|' '
FILENAME==ARGV[1] { lastname[$1,$2] = $4 }
FILENAME==ARGV[2] { print $0 lastname[$14,$15] "|" }
' agent.txt mastefile.txt > masterfile_temp.txt
mv -f masterfile_temp.txt masterfile.txt

 
Might as well do in one step:

Code:
# insert First Name
awk -F'|' '
FILENAME==ARGV[1] { firstname[$1,$2] = $3; lastname[$1,$2] = $4 }
FILENAME==ARGV[2] { print $0 firstname[$14,$15] "|" lastname[$14,$15] "|" }
' agent.txt masterfile.txt > masterfile_temp.txt
mv -f masterfile_temp.txt masterfile.txt

The gawk documentation is excellent, here is the part describing arrays.

Annihilannic
[small]tgmlify - code syntax highlighting for your tek-tips posts[/small]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top