comparing contents of two files (field comparison)

Status
Not open for further replies.

AwkRookie

Programmer
Oct 15, 2002
10
US
I want to write a script which will compare field(s) from two different files. The data in these fields is numeric and I want to identify the row number and data element where there is a match.

I have sketched out some pseudo-code which may help explain my question; my explanations are denoted by **

(1) Use the following:
awk '{print $<field# from file#1>}' file#1>f1

* put field # from file #1 into output file f1

awk '{print $<field# from file#2>}' file#2>f2

* put field # from file #2 into output file f2

awk 'compare_file_fields' f1 f2

* run the script below using files f1 and f2

BEGIN

for (a=1; a<=NF; a++)

x(a)={ print f1 $<field# from file#1>}
y(a)={ print f2 $<field# from file#2>}

* put field elements from file 1 and file 2 into variables x and y
{
for (i=1; i<=NF; i++)
{
for (j=1; j<=NF; j++)
{if (x(j)==y(i))
<print data element from x that matches y and also print the row number>


** check to see if any element from x matches any element from y and print out that element along with the row # in file 2 where the match occurred

}
}
}

What is the best way to accomplish this numeric comparison of field elements between two different files?
 
Just data samples for me please, and a before and after snapshot.
I can't wade through pcode ;)
 
marsd,

Sorry.

Here's my example:

file 1
------
contains fields file1_field1 thru file1_field4 with numeric data (double-precision) as:


file1_field1 file1_field2 file1_field3 file1_field4
16:00 10 20 30
16:10 40 50 60
16:20 12 23 14

file 2
------
contains fields file2_field1 thru file2_field4 with numeric data (double-precision) as:

file2_field1 file2_field2 file2_field3 file2_field4
16:00 10 22 31
16:10 20 50 60
16:20 14 29 13

Now, I want to see whether row 1, field 3 of file #1 matches row 1, field 3 of file #2, OR row 2, field 3 of file #2, and so on through row N, field 3 of file #2 (where N is the number of rows in file #2). Once row 1, field 3 of file #1 has been checked against every field 3 value in file #2, I move on to row 2 and check whether row 2, field 3 of file #1 matches row 1, field 3 of file #2, OR row 2, field 3 of file #2, and so on.

The output should be:

row 1, field 2 match between files #1 and #2 at row number 1, field #2 (both have a value of 10)

row 2, field 3 match between files #1 and #2 at row number 2, field #3 (both have a value of 50)

and so on ...

output to screen would be:

value of 10 found in row 1
value of 50 found in row 2

Also, I would like to store the row numbers and values of the solution in a matrix as follows, with one row for each value in field #1:

16:00 10 1
16:10 50 2
16:20 no match

The size of the output file for this script should correspond to the size of file #2, i.e., for every row in file #2, there should be either a value and row number or the text "no match".

Let me know if this is a sufficient input/output description of what I am trying to do.

AwkRookie
 
Gawk only. I don't work with old awk or nawk.

# Report and return 1 if field FLDNUM of row REC holds the same value in
# the stored file 1 data as FLDVAL (taken from file 2); otherwise return 0.
function compare(array, FLDVAL, REC, FLDNUM,    loc) {
    #print "AT", FLDVAL, "NR=", REC, "FIELDNUMBER=", FLDNUM
    if (REC in array) {
        # array[REC] holds fields 2-4 of file 1, so field m is loc[m - 1]
        split(array[REC], loc)
        if (loc[FLDNUM - 1] == FLDVAL) {
            printf "Matching value row %d, field = %d, fieldval = %d:\n", REC, FLDNUM, FLDVAL
            return 1
        }
    }
    return 0
}


BEGIN {
    # Load fields 2-4 of every row of file 1 (k1.txt), keyed by row number
    x = 1
    while ((getline < "k1.txt") > 0) {
        arr1[x++] = $2 " " $3 " " $4
    }
    close("k1.txt")
}

# Main input is file 2: compare fields 2..NF with the same row of file 1
{
    for (m = 2; m <= NF; m++) {
        #print "Field no:", m, "value=", $m
        compare(arr1, $m, NR, m)
    }
}

Output with your data sample:
Matching value row 1, field = 2, fieldval = 10:
Matching value row 2, field = 3, fieldval = 50:
Matching value row 2, field = 4, fieldval = 60:
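
The output above lists every matching field, but not the one-line-per-row matrix requested earlier in the thread. A minimal sketch of that report follows, assuming the same row-by-row comparison as the script above. The temporary file names and the helper name match_matrix are hypothetical, the data files are assumed to have no header lines, and file #1 is assumed to be non-empty.

```shell
#!/bin/sh
# Sketch only: file names and match_matrix are assumptions, not from the thread.
dir=$(mktemp -d)
printf '16:00 10 20 30\n16:10 40 50 60\n16:20 12 23 14\n' > "$dir/file1"
printf '16:00 10 22 31\n16:10 20 50 60\n16:20 14 29 13\n' > "$dir/file2"

# Usage: match_matrix FILE1 FILE2
match_matrix() {
    awk '
    # First file: keep each whole row, indexed by its row number.
    NR == FNR { row1[FNR] = $0; next }
    # Second file: compare fields 2..NF with the same row of the first file.
    {
        n = split(row1[FNR], f1)
        found = 0
        for (i = 2; i <= NF && i <= n; i++) {
            # Adding 0 forces a numeric comparison of the two fields.
            if ($i + 0 == f1[i] + 0) { found = 1; break }
        }
        if (found) {
            print $1, $i, FNR
        } else {
            print $1, "no match"
        }
    }' "$1" "$2"
}

match_matrix "$dir/file1" "$dir/file2"
```

With the sample rows this prints:

16:00 10 1
16:10 50 2
16:20 no match

Only the first matching field in each row is reported, so the second match in row 2 (the value 60) is not shown.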
 