compare more than one file in AWK

Status
Not open for further replies.

unixb1234

Programmer
Dec 21, 2008
8
US
Hello everyone, I posted a similar question last night under a different handle, but due to some problems I cannot log on
to that account, nor can I see the reply sent to my question. Hence, I created a new handle and am posting the question again.

I appreciate any help regarding this. Thanks again for the reply, though I am unable to see it.


source file: Each value in this column needs to be treated as a single field, "12 34" (with the space in between), and it needs to
be compared against target files 1 and 2, using the field that starts at position 3 and ends at position 10 (counting positions from
the start of the line, position 1)
12 34
56 78
99 88


target_file 1:
0112 34 this is 01 record
0312 34 this is 03 record
0156 78 this is 01 record
0356 78 this is 03 record

target_file 2:
0199 88 this is 01 record
0399 88 this is 03 record
0111 22 this is 01 record
0333 44 this is 03 record


Here I should get the output file as:
=====================================

0112 34 this is 01 record - these first four lines are from target1
0312 34 this is 03 record
0156 78 this is 01 record
0356 78 this is 03 record
0199 88 this is 01 record - this is from target2
0399 88 this is 03 record - this is from target2


We need the output data from the target files only for those lines matching the columns in the source file.

Also, can the final output be redirected to a file?

Thanks very much again,
Bhanu
 
Assuming the files' format is as shown above, you can work with something like this:

Code:
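awk  '
         # source file: store every line as an array key
         FILENAME==ARGV[1] { a[$0] }
         # target files: rebuild the two-word key found at position 3 and print matching lines
         FILENAME==ARGV[2] { split(substr($0,3,7),b); if ( b[1] FS b[2]  in a ) print }
         FILENAME==ARGV[3] { split(substr($0,3,7),b); if ( b[1] FS b[2]  in a ) print }
     '   source_file target_file1 target_file2 > output_file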
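
For anyone with more than two target files, here is a minimal sketch of the same idea that does not need one rule per file. It recreates the sample data from the first post in a scratch directory, and it relies on the common NR==FNR test (true only while awk reads the first file on the command line) instead of checking ARGV for each file. The array names keys and part are illustrative and are not from the script above.

```shell
#!/bin/sh
# Recreate the sample files from the first post in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' '12 34' '56 78' '99 88' > source_file
printf '%s\n' '0112 34 this is 01 record' '0312 34 this is 03 record' \
              '0156 78 this is 01 record' '0356 78 this is 03 record' > target_file1
printf '%s\n' '0199 88 this is 01 record' '0399 88 this is 03 record' \
              '0111 22 this is 01 record' '0333 44 this is 03 record' > target_file2

awk '
    # First file on the command line (the source): remember each line as a key.
    NR == FNR { keys[$0]; next }
    # Every later file (any number of targets): rebuild the two-word key
    # from the characters starting at position 3 and print on a match.
    {
        split(substr($0, 3, 7), part)
        if ((part[1] FS part[2]) in keys) print
    }
' source_file target_file1 target_file2 > output_file

cat output_file
```

Any number of target files can be listed after source_file. One caveat: if the source file is empty, NR==FNR is also true for the first target file, so that case would need a separate check.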
 
Thanks very much, Moonring. It's great, it works just great. Based on your awk script, I was able to test with more than 2 target files too and it still works (since I have so many files to compare the source file against).

My question is: the actual data looks like the sample below. (I simplified it above for easy reference, but when I tried the script on the actual data format it did not work, so I thought I would paste the actual data format and the awk script I modified too.) Can you please let me know what I am doing wrong? I appreciate your time very much. I was considering other options that were so time-consuming, like loading the files into a table, comparing them and then converting back to a flat file, but this is so great. Thanks again :)


***********************************************************

Here we need to compare the 40-character field at positions 1 to 40 in the source file with the 40-character field that starts at position 3 in the target files.


source.txt
=========
AA INP 11111111



target1.txt
===========
01AA INP 11111111 UBBBBBBBBBBBBBBBBBBBBBBBBBB50 211083569 name12 name12 EF19410112214389912A

03AA INP 11111111 200710ZZZZZZZZZZZZZZZZZZZ11Z 4168 25052 78659 2724 42650 3051 36201 30503 8856 8853 3723
04 E




target2.txt
============
01AA INP 11111111 UBBBBBBBBBBBBBBBBBBBBBBBBBB50 211083569 name12 name12 EF19410112214389912A

03AA INP 11111111 200710ZZZZZZZZZZZZZZZZZZZ11Z 4168 25052 78659 2724 42650 3051 36201 30503 8856 8853 3723
04 E
01AA INP 22222222 UBBBBBBBBBBBBBBBBBBBBBBBBBB50 211083569 name12 name12 EF19410112214389912A

03AA INP 22222222 200710ZZZZZZZZZZZZZZZZZZZ11Z 4168 25052 78659 2724 42650 3051 36201 30503 8856 8853 3723
04 E




The awk script which I modified from your original awk script is below:
awk '
FILENAME==ARGV[1] { a[$0] }
FILENAME==ARGV[2] { split(substr($0,3,39),b); if ( b[1] FS b[2] in a ) print }
FILENAME==ARGV[3] { split(substr($0,3,39),b); if ( b[1] FS b[2] in a ) print }
' s1.txt t1.txt t2.txt > output_file

and the output file is zero bytes. I do not know what mistake I made.

Sorry for the trouble again :)

Thanks so very much, in advance.

Regards,
Bhanu
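
A likely reason for the empty output, sketched on one line of the sample data: split() breaks the substring on whitespace, so b[1] FS b[2] rebuilds only the first two words of it. That was enough for a key like "12 34", but the key here has three words, so the rebuilt string can never equal a full line stored from the source file. The snippet below only prints the two candidate keys for comparison; the 15-character width is just the length of the sample key as displayed, not the real 40-character field width.

```shell
#!/bin/sh
# One target line from the sample data, as displayed in the thread.
line='01AA INP 11111111 UBBBBBBBBBBBBBBBBBBBBBBBBBB50 211083569 name12 name12 EF19410112214389912A'

keys=$(printf '%s\n' "$line" | awk '{
    # What the modified script builds: only the first two words.
    split(substr($0, 3, 39), b)
    printf "two-word key:   [%s]\n", b[1] FS b[2]
    # The text that would have to equal the source line.
    printf "first 15 chars: [%s]\n", substr($0, 3, 15)
}')
printf '%s\n' "$keys"
# two-word key:   [AA INP]
# first 15 chars: [AA INP 11111111]
```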
 
Hello again, Moonring, I forgot to include the output for the actual data format above; this should be the output format. Also, target2.txt does not contain this record "AA INP 11111111". Sorry for the confusion.

the output needs to be:

01AA INP 11111111 UBBBBBBBBBBBBBBBBBBBBBBBBBB50 211083569 name12 name12 EF19410112214389912A

03AA INP 11111111 200710ZZZZZZZZZZZZZZZZZZZ11Z 4168 25052 78659 2724 42650 3051 36201 30503 8856 8853 3723
04 E



since the source file contains AA INP 11111111 and target1.txt has "AA INP 11111111" starting at the 3rd position, in a field 40 characters long.

Thanks so much once again :)
Regards,
Bhanu
 
The following code works for the given samples; modify as needed.

Code:
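awk  '
         # source file: the first 40 characters of each line become the key
         FILENAME==ARGV[1] { a[substr($0,1,40)] }
         # target files: compare the 40 characters starting at position 3
         FILENAME==ARGV[2] { if( substr($0,3,40) in a ) print }
         FILENAME==ARGV[3] { if( substr($0,3,40) in a ) print }
     '  s1.txt t1.txt t2.txt > output_file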
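
One thing to watch with an exact 40-character comparison: it only matches when the source lines are padded with blanks to the same width as the field in the target files. If an editor or a transfer step has stripped trailing blanks from the source file, one hedged variant is to pad both sides to 40 characters before comparing. The sketch below uses made-up sample data (the forum display does not show the real padding), skips blank source lines, and handles any number of target files with NR==FNR. Treat it as an illustration under those assumptions rather than a drop-in replacement.

```shell
#!/bin/sh
# Hypothetical sample data: the source key is NOT padded, the target field is.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'AA INP 11111111' '' > s1.txt
printf '01%-40s%s\n' 'AA INP 11111111' 'UBBBB50 first record' > t1.txt
printf '%s\n' '04 E' '' >> t1.txt
printf '01%-40s%s\n' 'AA INP 22222222' 'UBBBB50 other record' > t2.txt

awk '
    # Source file: pad (or cut) each non-blank line to exactly 40 characters.
    NR == FNR { if (NF) keys[sprintf("%-40s", substr($0, 1, 40))]; next }
    # Target files: take 40 characters from position 3, padded the same way,
    # and print the line when that field is a known key.
    sprintf("%-40s", substr($0, 3, 40)) in keys
' s1.txt t1.txt t2.txt > output_file

cat output_file
```

With this data only the "first record" line from t1.txt is printed; the short "04 E" line, the blank lines and the 22222222 record are not matched.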
 
Hello again, Moonring, I very much appreciate your message. Thanks so much; I am able to use this awk script for my files. Thanks again, and thanks to this website also for all this help.
Regards,
Bhanu
 