Join two files (sort of)

Status
Not open for further replies.

mwesticle

Programmer
Nov 19, 2003
51
US
I have an easy to semi-difficult problem here, and I was wondering if any of the good people at Tek-Tips could help me out... Here goes:

I have two files (File A and File B), each containing records that are 100 bytes long. Each file has a 12-digit "key" in positions 1-12. File A is sorted on this 12-digit key. File A looks like this:

100000000000AAAAAAAAAAAAAAAAA....and so on to position 100
200000000000AAAAAAAAAAAAAAAAA....and so on to position 100
300000000000AAAAAAAAAAAAAAAAA....and so on to position 100

File B is NOT sorted on this key. File B looks like this:

200000000000BBBBBBBBBBBBBBBBB....and so on to position 100
100000000000BBBBBBBBBBBBBBBBB....and so on to position 100
111111111111BBBBBBBBBBBBBBBBB....and so on to position 100

What I want to do is join the two files based on this 12-digit key, and group by this 12-digit key, throwing out all records from file B that don't have a match in file A.

So, the resulting output file should look like this:

100000000000ABBSNDNNDJFJFFJFJ....and so on to position 100
200000000000lhjbdfllfblhdflhf....and so on to position 100
200000000000KJFSKJSFkfjsakjbn....and so on to position 100
200000000000KJFSKJSFkfjsakjbn....and so on to position 100
300000000000mfd3434MFMFN2323n....and so on to position 100

So, I want to keep the original order from File A, and group each record from File A together with every record on File B that matches its key. I want to keep all records from File A even if they have no match in File B. But I want to throw out all records in File B that don't have a match in File A.

Anyone out there know how I can achieve this? Any help would be greatly appreciated! Thanks!
 
Brute force method:
awk '{print;system("grep \"^"substr($0,1,12)"\" fileB")}' fileA

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
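The brute-force one-liner above starts one grep per File A record, which can get slow on large files. As a rough sketch of an alternative (not from the original reply), the File B records can be collected in an awk array keyed on positions 1-12 and then printed after each File A record. This assumes a POSIX-style awk (nawk, gawk or similar) and that File B fits in memory; the sample records are the ones from the thread, shortened to the key plus a few bytes, and the workdir and matches names are only illustrative.

```shell
# Sample data from the thread, shortened (real records are 100 bytes long;
# the record length does not matter for the join itself).
workdir=$(mktemp -d)
printf '%s\n' 100000000000AAAA 200000000000AAAA 300000000000AAAA > "$workdir/fileA"
printf '%s\n' 200000000000BBBB 100000000000BBBB 111111111111BBBB > "$workdir/fileB"

# While reading the first file named on the command line (fileB), store each
# record under its 12-character key. While reading fileA, print each record
# followed by any stored fileB records that share its key.
awk '
FILENAME == ARGV[1] {
    key = substr($0, 1, 12)
    matches[key] = matches[key] $0 "\n"
    next
}
{
    print
    key = substr($0, 1, 12)
    if (key in matches) printf "%s", matches[key]
}
' "$workdir/fileB" "$workdir/fileA" > "$workdir/fileC"

cat "$workdir/fileC"
```

File B records whose key never appears in File A (111111111111 here) are simply never printed, and File A records without a match are still written, in their original order.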
 
Thanks for the suggestion, but this just seems to spit out File A. What I mean is, the resulting file (File C) is EXACTLY the same (byte-for-byte) as File A. I'm using this command:

awk '{print;system("grep \"^"substr($0,1,12)"\" fileB")}' fileA > fileC

What I'm looking for here is an output file that contains all records from File A, and also contains all of the matches on File B, grouped by the original order of File A. Does that make sense? I'm not sure, maybe I'm doing something wrong. Any other ideas?
 
mwesticle, works for me:
> cat fileA
100000000000AAAAAAAAAAAAAAAAA....and so on to position 100
200000000000AAAAAAAAAAAAAAAAA....and so on to position 100
300000000000AAAAAAAAAAAAAAAAA....and so on to position 100
> cat fileB
200000000000BBBBBBBBBBBBBBBBB....and so on to position 100
100000000000BBBBBBBBBBBBBBBBB....and so on to position 100
111111111111BBBBBBBBBBBBBBBBB....and so on to position 100
> awk '{print;system("grep \"^"substr($0,1,12)"\" fileB")}' fileA > fileC
> cat fileC
100000000000AAAAAAAAAAAAAAAAA....and so on to position 100
100000000000BBBBBBBBBBBBBBBBB....and so on to position 100
200000000000AAAAAAAAAAAAAAAAA....and so on to position 100
200000000000BBBBBBBBBBBBBBBBB....and so on to position 100
300000000000AAAAAAAAAAAAAAAAA....and so on to position 100
>
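If fileC still comes out identical to fileA, one thing worth checking is whether grep finds any matching keys at all, for example because the real File B has a different name or path than the fileB used in the command, or because the keys differ from what is expected. That is only a guess at the cause. A small sketch of such a check, using the same shortened sample data (the workdir and matched names are illustrative):

```shell
# Shortened sample data from the thread.
workdir=$(mktemp -d)
printf '%s\n' 100000000000AAAA 200000000000AAAA 300000000000AAAA > "$workdir/fileA"
printf '%s\n' 200000000000BBBB 100000000000BBBB 111111111111BBBB > "$workdir/fileB"

# Count how many fileA records have a key (positions 1-12) that occurs at the
# start of at least one fileB line. The keys are plain digits, so they are
# safe to use directly in the grep pattern.
matched=0
while IFS= read -r line; do
    key=$(printf '%s\n' "$line" | cut -c1-12)
    if grep -q "^$key" "$workdir/fileB"; then
        matched=$((matched + 1))
    fi
done < "$workdir/fileA"

echo "fileA keys with a match in fileB: $matched"
```

A count of 0 against the real files would suggest the problem lies in the data or the file name rather than in the awk command.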

Hope This Helps, PH.
 