awk array to add text to a file 1

Status
Not open for further replies.

mrimagepueblo

Programmer
Dec 15, 2003
I see another post that is almost what I'm looking for but thought I'd ask about my specific circumstance. Maybe between your input here and the other post I can figure out what I need.

I'd like to get away from using Microsoft Access to do these queries.
I know how to do this in a #!/bin/sh script using the sed command, but the contents of filea.txt change, so that would mean hard-coding the variables for the substitution each time. I'd like it to be dynamic, since both filea.txt and fileb.txt change daily.

The hard-coded way:
perl -pi -e 's/\|1234\|/|1234|Larry|/g' fileb.txt

I'd rather append the change to the end of the file instead of replacing the data.

Employee Numbers and Names
filea.txt (the ID in field 3 is unique, with no duplicates; however, the contents of filea.txt can change from time to time)

xxxx|xxxx|1234|Larry|xxxx|
xxxx|xxxx|3456|Curly|xxxx|
xxxx|xxxx|7890|Moe|xxxx|

fileb.txt (employees time cards)
xxxx|xxxxx|xxxx|7890|xxxx|
xxxx|xxxxx|xxxx|3456|xxxx|
xxxx|xxxxx|xxxx|7890|xxxx|
xxxx|xxxxx|xxxx|1234|xxxx|

xxxx just represents miscellaneous data, can be of varying length and type.

I would like the result to end up either in fileb.txt or in a new file, filec.txt; it doesn't matter which.

xxxx|xxxxx|xxxx|7890|xxxx|Moe|
xxxx|xxxxx|xxxx|3456|xxxx|Curly|
xxxx|xxxxx|xxxx|7890|xxxx|Moe|
xxxx|xxxxx|xxxx|1234|xxxx|Larry|

If someone can get me started on creating/reading in an array for filea.txt, and then give me some code to work with for either appending to fileb.txt or creating a new filec.txt, I would greatly appreciate it.
I have several files/situations where this would occur, so the field placements will be different, i.e. in another filea.txt the ID could be in field 5, in another fileb.txt the code could be in field 10, etc.
 
Something like this:

Code:
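awk -F'|' '
        # filea.txt: store each name (field 4) under its employee ID (field 3)
        FILENAME==ARGV[1] { employeename[$3]=$4 }
        # fileb.txt: append the name for the ID in field 4 to the line
        FILENAME==ARGV[2] { print $0 employeename[$4] "|" }
' filea.txt fileb.txt > filec.txt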

While processing the first argument it builds up an array of employee names indexed by their employee IDs, and then while processing the second argument it looks up each ID in that array and prints the matching name appended to the input line.
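
Since the field positions differ from one pair of files to the next, one possible variation is to pass them in with -v instead of editing the script each time. This is only a sketch: the variable names idcol, namecol and keycol, the temporary directory, the extra 9999 record and the UNKNOWN placeholder for IDs with no match are all assumptions made for illustration, not part of the original files.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)

# Sample lookup file: ID in field 3, name in field 4.
cat > "$workdir/filea.txt" <<'EOF'
xxxx|xxxx|1234|Larry|xxxx|
xxxx|xxxx|3456|Curly|xxxx|
xxxx|xxxx|7890|Moe|xxxx|
EOF

# Sample time cards: ID in field 4. The 9999 record is invented
# here to show what happens when an ID has no match.
cat > "$workdir/fileb.txt" <<'EOF'
xxxx|xxxxx|xxxx|7890|xxxx|
xxxx|xxxxx|xxxx|3456|xxxx|
xxxx|xxxxx|xxxx|7890|xxxx|
xxxx|xxxxx|xxxx|1234|xxxx|
xxxx|xxxxx|xxxx|9999|xxxx|
EOF

# idcol/namecol describe the first file, keycol the second (hypothetical names).
awk -F'|' -v idcol=3 -v namecol=4 -v keycol=4 '
        # First file: remember the name for each ID, then skip to the next line.
        FILENAME==ARGV[1] { name[$idcol]=$namecol; next }
        # Second file: append the name, or UNKNOWN when the ID was never seen.
        { print $0 (($keycol in name) ? name[$keycol] : "UNKNOWN") "|" }
' "$workdir/filea.txt" "$workdir/fileb.txt" > "$workdir/filec.txt"

cat "$workdir/filec.txt"
```

For a file pair where the ID sits in field 5 of the first file and field 10 of the second, only the -v values would need to change (idcol=5, keycol=10).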

Annihilannic.
 
Thank you so much, it's exactly the snippet I was looking for and I think I can even follow the logic. Not nearly as troublesome/cumbersome as I thought it would be.

I have a follow-up question about the durability of the array. In a couple of circumstances fileb.txt could be rather large: 20,000 records with a total file size of 20 MB. filea.txt would never be over 600-700 KB with 20,000 records.

I'll be running this script on a Pentium 4 3.2 GHz with 2 GB of RAM. Do you think my server would handle it without crashing?

If you think the array would crash the system, I could always cut out the needed fields and then paste the results back in at the end of the records. I imagine I could cut out 3,000 or 4,000 records at a time.

Thanks again.
 
It very much depends on the version of awk you are using (there are many), but I'd just try it and see.
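
One way to try it without touching the real data might be to generate throwaway files of roughly the sizes mentioned and run the same script against them. Note that only the first file ends up in the array; the second file is read one line at a time, so its size should matter much less. The sketch below is an assumption-laden test harness: the record layout, the name1..name20000 values and the 1000-character padding are invented to approximate 20,000 records and about 20 MB.

```shell
# Scratch directory for the synthetic test files.
workdir=$(mktemp -d)

# 20,000 lookup records: ID in field 3, made-up name in field 4.
awk 'BEGIN {
        for (i = 1; i <= 20000; i++)
                printf "xxxx|xxxx|%d|name%d|xxxx|\n", i, i
}' > "$workdir/filea.txt"

# 20,000 time cards of about 1 KB each: ID in field 4, padding in field 5.
awk 'BEGIN {
        pad = sprintf("%1000s", "")     # 1000 spaces...
        gsub(/ /, "x", pad)             # ...turned into filler characters
        for (i = 20000; i >= 1; i--)
                printf "xxxx|xxxxx|xxxx|%d|%s|\n", i, pad
}' > "$workdir/fileb.txt"

# Same script as above, run against the synthetic files.
awk -F'|' '
        FILENAME==ARGV[1] { employeename[$3]=$4 }
        FILENAME==ARGV[2] { print $0 employeename[$4] "|" }
' "$workdir/filea.txt" "$workdir/fileb.txt" > "$workdir/filec.txt"

# Report input sizes in bytes and the number of joined lines.
wc -c "$workdir/filea.txt" "$workdir/fileb.txt"
wc -l "$workdir/filec.txt"
```

If the line count of filec.txt matches that of fileb.txt and every line ends with a name, the array coped with that volume on the awk in use.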

If you did happen to run into an awk limitation, you could very easily convert the script to Perl, which claims to be limitless (within available system resources, that is, and you have plenty).

Annihilannic.
 
GNU Awk 3.1.3

I'll try progressively larger sets, 3,000 records at a time, and report back.

Thanks again.
 