compare more than one file in AWK

unixb1234 · Dec 21, 2008

Hello everyone, I posted a similar question last night under a different handle, but due to some problems I cannot log on
to that account, nor can I see the reply sent to my question. Hence, I created a new handle and am posting the question again.

I appreciate any help with this. Thanks again for the reply, though I am unable to see it.

source file: Each line needs to be treated as a single field, e.g. "12 34" (with the space in between), and this needs to
be compared against target files 1 and 2, using the field that starts at position 3 and ends at position 10 (counting positions from
position 1 at the start of the line)
12 34
56 78
99 88

target_file 1:
0112 34 this is 01 record
0312 34 this is 03 record
0156 78 this is 01 record
0356 78 this is 03 record

target_file 2:
0199 88 this is 01 record
0399 88 this is 03 record
0111 22 this is 01 record
0333 44 this is 03 record

Here I should get the output file as:
=====================================

0112 34 this is 01 record - these first four lines are from target1
0312 34 this is 03 record
0156 78 this is 01 record
0356 78 this is 03 record
0199 88 this is 01 record - this is from target2
0399 88 this is 03 record - this is from target2

We need the output data from the target files only for those records matching the values in the source file.

Also, can the final output be redirected to a file?

Thanks very much again,
Bhanu

moonring · Dec 21, 2008

Assuming the files' format is as shown above, you can work from something like this:

Code:

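awk  '
         # source file: store each whole line as a key
         FILENAME==ARGV[1] { a[$0] }
         # target files: rebuild "word1 word2" from the 7 characters at position 3 and look it up
         FILENAME==ARGV[2] { split(substr($0,3,7),b); if ( (b[1] FS b[2]) in a ) print }
         FILENAME==ARGV[3] { split(substr($0,3,7),b); if ( (b[1] FS b[2]) in a ) print }
     '   source_file target_file1 target_file2 > output_file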
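
The same idea extends to any number of target files without adding one rule per file. What follows is only a minimal sketch, assuming every key in the source file is exactly 5 characters wide (as in the sample data) and that the source file is not empty; the array name keys is illustrative, and the sample files are recreated in the current directory so the commands can be run as they stand.

```shell
# Recreate the sample files from the first post (scratch files in the current directory).
printf '%s\n' '12 34' '56 78' '99 88' > source_file
printf '%s\n' '0112 34 this is 01 record' '0312 34 this is 03 record' \
              '0156 78 this is 01 record' '0356 78 this is 03 record' > target_file1
printf '%s\n' '0199 88 this is 01 record' '0399 88 this is 03 record' \
              '0111 22 this is 01 record' '0333 44 this is 03 record' > target_file2

# NR==FNR holds only while the first file is being read, so one rule
# handles the source file and a second rule handles every target after it.
awk '
    NR == FNR { keys[$0]; next }      # source file: remember each whole line as a key
    (substr($0, 3, 5) in keys)        # targets: print lines whose columns 3-7 match a key
' source_file target_file1 target_file2 > output_file

cat output_file
```

Listing more targets on the command line needs no change to the script. If the source file could ever be empty, NR==FNR would also hold for the first target, so the FILENAME==ARGV[1] test is the safer choice in that case.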

unixb1234 · Dec 21, 2008

Thanks very much, Moonring. It works just great. Based on your awk script, I was able to test with more than 2 target files too and it still works (I have many files to compare the source file against).

My question is: the actual data looks like the sample below. I simplified it above for easy reference, and when I tried the script on the actual data format it did not work, so I thought of pasting the actual data format and the awk script I modified too. Can you please let me know what I am doing wrong? I appreciate your time very much. I was thinking of other options that were very time consuming, like loading the data into a table, comparing, and then converting back to a flat file, but this is so much better. Thanks again.

***********************************************************

Here we need to compare the 40-character field in positions 1 to 40 of the source file with the 40-character field that starts at position 3 of each target file.

source.txt
=========
AA INP 11111111

target1.txt
===========
01AA INP 11111111 UBBBBBBBBBBBBBBBBBBBBBBBBBB50 211083569 name12 name12 EF19410112214389912A

03AA INP 11111111 200710ZZZZZZZZZZZZZZZZZZZ11Z 4168 25052 78659 2724 42650 3051 36201 30503 8856 8853 3723
04 E

target2.txt
============
01AA INP 11111111 UBBBBBBBBBBBBBBBBBBBBBBBBBB50 211083569 name12 name12 EF19410112214389912A

03AA INP 11111111 200710ZZZZZZZZZZZZZZZZZZZ11Z 4168 25052 78659 2724 42650 3051 36201 30503 8856 8853 3723
04 E
01AA INP 22222222 UBBBBBBBBBBBBBBBBBBBBBBBBBB50 211083569 name12 name12 EF19410112214389912A

03AA INP 22222222 200710ZZZZZZZZZZZZZZZZZZZ11Z 4168 25052 78659 2724 42650 3051 36201 30503 8856 8853 3723
04 E

The awk script which I modified from your original awk script is below:
awk '
FILENAME==ARGV[1] { a[$0] }
FILENAME==ARGV[2] { split(substr($0,3,39),b); if ( b[1] FS b[2] in a ) print }
FILENAME==ARGV[3] { split(substr($0,3,39),b); if ( b[1] FS b[2] in a ) print }
' s1.txt t1.txt t2.txt > output_file

and the output file is zero bytes. I do not know what mistake I made.

Sorry for the trouble again

Thanks so very much in advance.

Regards,
Bhanu

unixb1234 · Dec 21, 2008

Hello again, Moonring. I forgot to include the output for the actual data format above; this should be the output format. Also, target2.txt does not contain the record "AA INP 11111111", sorry for the confusion.

the output needs to be:

01AA INP 11111111 UBBBBBBBBBBBBBBBBBBBBBBBBBB50 211083569 name12 name12 EF19410112214389912A

03AA INP 11111111 200710ZZZZZZZZZZZZZZZZZZZ11Z 4168 25052 78659 2724 42650 3051 36201 30503 8856 8853 3723
04 E

since the source file contains "AA INP 11111111" and target1.txt has "AA INP 11111111" starting at position 3 for a length of 40 characters.

Thanks so much once again

Regards,
Bhanu

moonring · Dec 21, 2008

The following code works for the given samples; modify as needed.

Code:

awk  '
         FILENAME==ARGV[1] { a[substr($0,1,40)] }
         FILENAME==ARGV[2] { if( substr($0,3,40) in a ) print }
         FILENAME==ARGV[3] { if( substr($0,3,40) in a ) print }
     '  s1.txt t1.txt t2.txt > output_file
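
The modified script with split() printed nothing because only the first two words of the slice (b[1] FS b[2], i.e. "AA INP") were looked up, while the stored key was the whole source line "AA INP 11111111". Comparing fixed-width slices avoids that. For many target files, and for source lines that may lack trailing padding, the comparison could be sketched as follows. This assumes the 40-character field in the targets is space-padded; the helper name key40 and the padded sample records are hypothetical, built only to make the example runnable.

```shell
# Hypothetical fixed-width sample: the 40-character key is space-padded in the
# targets, while the source line carries no trailing padding.
printf '%s\n' 'AA INP 11111111' > s1.txt
printf '01%-40s%s\n' 'AA INP 11111111' 'UBBBBBBBBBBBBBBBBBBBBBBBBBB50' >  t1.txt
printf '03%-40s%s\n' 'AA INP 11111111' '200710ZZZZZZZZZZZZZZZZZZZ11Z'  >> t1.txt
printf '01%-40s%s\n' 'AA INP 22222222' 'UBBBBBBBBBBBBBBBBBBBBBBBBBB50' >  t2.txt

awk '
    # Cut or pad a string to exactly 40 characters so both sides compare equal.
    function key40(s) { return sprintf("%-40s", substr(s, 1, 40)) }

    NR == FNR { keys[key40($0)]; next }   # first file: the source keys
    (key40(substr($0, 3)) in keys)        # every later file: compare columns 3-42
' s1.txt t1.txt t2.txt > output_file

cat output_file
```

With this sample, only the two t1.txt records are written to output_file. Any further target files can simply be appended to the command line.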

unixb1234 · Dec 21, 2008

Hello again, Moonring, I very much appreciate your message. Thanks so much; I am able to use this awk script for my files. Thanks again, and thanks to this website also for all this help.
Regards,
Bhanu
