File Comparison

rogers42 · May 23, 2007

Hi,

I am trying to compare two files so I can identify the common, unique-to-file1, and unique-to-file2 records. Details are as follows:

file1.in
0
1
2
===============
file2.in
A
B
C
=================

The awk code is as follows

BEGIN { file2 = ARGV[2]; }

{
    mac_1 = $0
    print mac_1
    while ( ( getline mac_2 < file2 ) > 0)
    {
        print mac_2;
    }
    print "============"
}

END { close("file2.in") }

============================

Execution:

nawk -f compare_multiple_files.awk file1.in file2.in

===============================
Output:

I expected the inner loop to run in full for every iteration of the outer loop. However, the actual output is different:

0
A
B
C
============
1 <--------- Why did we not iterate through all of the inner loop ???
============
2
============
A
============
B
============
C
============

Any debugging help or a new (efficient) approach will be appreciated.

Thanks in advance.

rogers42

PHV · May 23, 2007

"Why did we not iterate through all of the inner loop"
You did, but on the later passes getline hit EOF straight away because the file was never closed and reopened.

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886

feherke · May 23, 2007

Hi

Between two iterations you have to [tt]close[/tt] the file. It is not enough to do it in the [tt]END[/tt] block.

Feherke.

http://rootshell.be/~feherke/
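
A minimal sketch of the point made in the two replies above. It assumes a POSIX awk is installed as "awk" (the thread uses nawk) and works on a throwaway sample file in a temporary directory. After a getline loop has read a file to the end, a second loop on the same name gets nothing until close() is called on that name.

```shell
# Scratch directory so no real files are touched (an assumption for this sketch).
cd "$(mktemp -d)" || exit 1
printf 'A\nB\nC\n' > file2.in

result=$(awk 'BEGIN {
    file2 = "file2.in"

    # First pass: reads A, B, C and leaves the file at EOF.
    while ((getline line < file2) > 0) n1++

    # Second pass without close(): getline returns 0 at once.
    while ((getline line < file2) > 0) n2++

    # close() by the same name, so the next getline reopens from the top.
    close(file2)
    while ((getline line < file2) > 0) n3++

    print n1 + 0, n2 + 0, n3 + 0
}')
echo "$result"   # prints: 3 0 3
```

The three numbers are the record counts seen by each pass. Only the pass that follows close() sees the file again.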

rogers42 · May 23, 2007

Hi,

I have made the following change to the code, but the output now has three extra iterations:

BEGIN { file2 = ARGV[2]; }

{
    mac_1 = $0
    print mac_1
    while ( ( getline mac_2 < file2 ) > 0)
    {
        print mac_2;
    }
    close("file2.in")
    print "============"
}

Output is as follows:

nawk -f compare_multiple_files.awk file1.in file2.in
0
A
B
C
============
1
A
B
C
============
2
A
B
C
============
A <----- Everything beyond this point is extra. Is it because I have specified two input files at the command line ??? If so, then how can I prevent the utility from processing the two files sequentially ???
A
B
C
============
B
A
B
C
============
C
A
B
C
============
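
A small sketch of what the main loop sees when two file operands are given, assuming a POSIX awk installed as "awk" and the sample data from the first post recreated in a temporary directory. awk reads every file operand in turn. NR keeps counting across files, while FNR restarts at 1 for each file, so the two differ from the first record of the second file onwards.

```shell
cd "$(mktemp -d)" || exit 1
printf '0\n1\n2\n' > file1.in
printf 'A\nB\nC\n' > file2.in

# Print the current file name, the overall record number (NR),
# the per-file record number (FNR) and the record itself.
result=$(awk '{ print FILENAME, NR, FNR, $0 }' file1.in file2.in)
echo "$result"
# file1.in 1 1 0
# file1.in 2 2 1
# file1.in 3 3 2
# file2.in 4 1 A
# file2.in 5 2 B
# file2.in 6 3 C
```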

PHV · May 23, 2007

BEGIN { file2 = ARGV[2]; }
NR!=FNR{exit}
{
mac_1 = $0
...

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
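
The guard in the reply above stops the main loop as soon as the second file begins. A possible alternative, not something proposed in this thread, is to blank out ARGV[2] in the BEGIN block, since awk skips operands whose value is the empty string. The sketch below assumes a POSIX awk installed as "awk" and recreates the sample files in a temporary directory. It also passes the variable file2 to close(), so the name always matches the one used with getline.

```shell
cd "$(mktemp -d)" || exit 1
printf '0\n1\n2\n' > file1.in
printf 'A\nB\nC\n' > file2.in

result=$(awk '
BEGIN {
    file2 = ARGV[2]     # remember the second file name
    ARGV[2] = ""        # empty operands are skipped by the main loop
}
{
    print $0
    while ((getline mac_2 < file2) > 0)
        print "  " mac_2
    close(file2)        # same name as used for getline
    print "============"
}' file1.in file2.in)
echo "$result"
```

With the sample data this prints three blocks, one per record of file1.in, each followed by the indented contents of file2.in.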

rogers42 · May 25, 2007

Worked like a charm.

Thanks

rogers42

rogers42 · May 29, 2007

Hi,

I think I am over the syntax errors but am struggling with the semantics of the utility.

The idea here is to compare the two files and spit out the unique data of the first file.

Input File 1:
=============
1
2
3
4

Input File 2:
=============
A
B
C

The utility looks as follows:

BEGIN {
    file2 = ARGV[2];

    # Cleanup the existing out files
    system("rm common.out");
    system("rm diff.out");
}

# The following will prevent the second file from being processed via outer loop
NR!=FNR{exit}

{
    # Extract MAC address
    mac_1 = $0;

    while ( ( getline line_2 < file2 ) > 0)
    {
        # Get MAC address from second file
        mac_2 = line_2;

        print "Comparing: "
        print "MAC 1: "
        print mac_1;
        print "MAC 2: "
        print mac_2;
        if (mac_1 == mac_2)
        {
            print mac_1 > "common.out"
            print "Common: ";
            print mac_1;
            break;
        }
        else
        {
            print mac_1 > "diff.out"
            print "Diff: ";
            print mac_1;
        }
    } # Inner while loop

    close("file2.in")
    print "============"
}

Output:

0
A
Comparing:
MAC 1:
0
MAC 2:
A
Diff:
0
B
Comparing:
MAC 1:
0
MAC 2:
B
Diff:
0
C
Comparing:
MAC 1:
0
MAC 2:
C
Diff:
0
============
1
============
2
============
3
============
4
============

I would like each diff to be written to the "diff.out" file only once.
Why does the comparison finish at "C" while we still have more digits in the first file?
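
One possible way to get each record written only once, sketched under a few assumptions: a POSIX awk installed as "awk", hypothetical sample data that shares one record ("B"), and a hypothetical flag variable named found. In the script above the else branch sits inside the while loop, so it fires once for every line of the second file that does not match. The sketch postpones the decision until the inner loop has finished. It also calls close(file2) rather than a hard-coded name. If the name given to close() ever differs from the one used with getline, the file stays at EOF and later records are never compared. Whether that is what happened in the run shown cannot be confirmed from the post.

```shell
cd "$(mktemp -d)" || exit 1
# Hypothetical sample data with one shared record ("B").
printf '1\nB\n3\n' > file1.in
printf 'A\nB\nC\n' > file2.in

awk '
BEGIN { file2 = ARGV[2] }
NR != FNR { exit }              # stop when the second file starts
{
    mac_1 = $0
    found = 0
    while ((getline mac_2 < file2) > 0) {
        if (mac_1 == mac_2) { found = 1; break }
    }
    close(file2)                # rewind file2 for the next record

    # Decide once, after the scan of file2 is over.
    if (found) {
        print mac_1 > "common.out"
    } else {
        print mac_1 > "diff.out"
    }
}' file1.in file2.in

cat common.out   # B
cat diff.out     # 1 and 3, one per line
```

Within one run, awk truncates a file the first time it is opened with ">" and appends afterwards, so an output file that gets written never carries over lines from an earlier run.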

Thanks

rogers42
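
For the more efficient approach asked about in the first post, a common awk idiom is to load the first file into an array and then test each record of the second file against it, so each file is read only once. The sketch below is one way to do that, not an answer given in this thread. It assumes a POSIX awk installed as "awk" and hypothetical sample data, and the output names only1.out and only2.out are made up for illustration (common.out is the name used in the thread).

```shell
cd "$(mktemp -d)" || exit 1
# Hypothetical sample data; "B" is the only shared record.
printf '1\nB\n3\n' > file1.in
printf 'A\nB\nC\n' > file2.in

awk '
# First file: remember each distinct record and the order it appeared in.
NR == FNR {
    if (!($0 in in1)) order[++n] = $0
    in1[$0] = 1
    next
}

# Second file: each record is either shared or unique to the second file.
($0 in in1) { print > "common.out"; both[$0] = 1; next }
            { print > "only2.out" }

# Records of the first file that never matched are unique to it.
END {
    for (i = 1; i <= n; i++)
        if (!(order[i] in both)) print order[i] > "only1.out"
}' file1.in file2.in

cat common.out   # B
cat only1.out    # 1 and 3, one per line
cat only2.out    # A and C, one per line
```

One caveat: the NR == FNR test cannot tell the files apart when the first file is empty, so that case would need separate handling.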
