File Comparison

Status
Not open for further replies.

rogers42

Technical User
Mar 30, 2007
64
CA
Hi,

I am trying to compare two files so I can identify the common, unique-to-file1, and unique-to-file2 records. Details are as follows:

file1.in
0
1
2
===============
file2.in
A
B
C
=================

The awk code is as follows

BEGIN { file2 = ARGV[2]; }

{
mac_1 = $0
print mac_1
while ( ( getline mac_2 < file2 ) > 0)
{
print mac_2;
}
print "============"
}

END { close("file2.in")}

============================

Execution:

nawk -f compare_multiple_files.awk file1.in file2.in

===============================
Output:

In my mind, for every iteration of the outer loop, the utility should go through the entire inner loop. However, the reality is different. The output is as follows:

0
A
B
C
============
1 <--------- Why did we not iterate through all of the inner loop ???
============
2
============
A
============
B
============
C
============

Any debugging help or a new (efficient) approach will be appreciated.

Thanks in advance.

rogers42



 
"Why did we not iterate through all of the inner loop"
You did, but you hit EOF because you didn't reopen the file ...

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
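
To illustrate the EOF point above: once getline has read a file to the end, further reads from the same name keep returning 0 until the file is closed with close(), after which the next getline reopens it from the top. A minimal sketch, assuming a POSIX shell and an awk on the PATH (plain awk is used here instead of nawk, and the scratch directory and the file2 variable passed with -v are choices made for this example, not part of the original script):

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d)
printf 'A\nB\nC\n' > "$tmp/file2.in"

# Read file2.in to EOF twice without closing it, then once more after close().
# Each pass prints how many lines getline returned.
counts=$(awk -v file2="$tmp/file2.in" 'BEGIN {
    n = 0; while ((getline line < file2) > 0) n++
    printf "%d ", n            # first pass: reads every line
    n = 0; while ((getline line < file2) > 0) n++
    printf "%d ", n            # second pass: still at EOF, reads nothing
    close(file2)               # close with the same string used to open
    n = 0; while ((getline line < file2) > 0) n++
    printf "%d\n", n           # after close(): reads from the top again
}')
echo "$counts"
rm -rf "$tmp"
```

This should print 3 0 3. Note that close() only matches the exact string that was used to open the file, so closing the variable (close(file2)) is safer than repeating a literal file name.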
 
Hi,

I have made the following changes to the code, but the output has three extra iterations:

BEGIN { file2 = ARGV[2]; }

{
mac_1 = $0
print mac_1
while ( ( getline mac_2 < file2 ) > 0)
{
print mac_2;
}
close("file2.in")
print "============"
}

Output is as follows:

nawk -f compare_multiple_files.awk file1.in file2.in
0
A
B
C
============
1
A
B
C
============
2
A
B
C
============
A <----- Everything beyond this point is extra. Is it because I have specified two input files on the command line? If so, how can I prevent the utility from processing the two files sequentially?
A
B
C
============
B
A
B
C
============
C
A
B
C
============


 
BEGIN { file2 = ARGV[2]; }
NR!=FNR{exit}
{
mac_1 = $0
...

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
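
The guard works because NR counts records across all input files while FNR restarts at 1 for each file, so the two are equal only while the first file is being read. A small sketch, assuming a POSIX shell and plain awk, with the sample data from the first post written to a scratch directory (the variable names all and first_only are made up for this example):

```shell
tmp=$(mktemp -d)
printf '0\n1\n2\n' > "$tmp/file1.in"
printf 'A\nB\nC\n' > "$tmp/file2.in"

# Without the guard: awk reads both operands as main input, one after the other.
# Each output line is "record NR FNR".
all=$(awk '{ print $0, NR, FNR }' "$tmp/file1.in" "$tmp/file2.in")
echo "$all"

# With the guard: NR and FNR first differ on line 1 of the second file,
# so exit fires there and only the first file is processed.
first_only=$(awk 'NR != FNR { exit } { print $0 }' "$tmp/file1.in" "$tmp/file2.in")
echo "$first_only"
rm -rf "$tmp"
```

The first command lists all six records, with FNR dropping back to 1 at A; the second prints only 0, 1 and 2. The guard relies on the first file having at least one record, since with an empty first file NR and FNR would stay equal throughout the second.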
 
Hi,

I think I am over the syntax errors but am struggling with the semantics of the utility.

The idea here is to compare the two files and spit out the unique data of the first file.

Input File 1:
=============
1
2
3
4

Input File 2:
=============
A
B
C

Utility looks as follows

BEGIN {
file2 = ARGV[2];

# Cleanup the existing out files
system("rm common.out");
system ("rm diff.out");
}

# The following will prevent the second file from being processed via outer loop
NR!=FNR{exit}

{
# Extract MAC address
mac_1 = $0;

while ( ( getline line_2 < file2 ) > 0)
{

# Get MAC address from second file
mac_2 = line_2;

print "Comparing: "
print "MAC 1: "
print mac_1;
print "MAC 2: "
print mac_2;
if (mac_1 == mac_2)
{
print mac_1 > "common.out"
print "Common: ";
print mac_1;
break;
}

else
{
print mac_1 > "diff.out"
print "Diff: ";
print mac_1;
}

} # Inner while loop

close("file2.in")
print "============"
}

Output:

0
A
Comparing:
MAC 1:
0
MAC 2:
A
Diff:
0
B
Comparing:
MAC 1:
0
MAC 2:
B
Diff:
0
C
Comparing:
MAC 1:
0
MAC 2:
C
Diff:
0
============
1
============
2
============
3
============
4
============

I would like each diff to be written to the "diff.out" file only once.
Why does the comparison finish at "C" while there are still more digits in the first file?

Thanks

rogers42
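
One possible way to get each record classified exactly once is to stop re-reading the second file for every record: load it into an array once, then test each line of the first file against that array. Because the decision is made after a complete lookup rather than inside the inner loop, a line is reported a single time. The posted output stops comparing after the first record, which is the same symptom as the earlier EOF problem; one possible cause, assuming the run that produced it did not close the file under the exact name it was opened with, is that the file was never reopened. The sketch below is an alternative to the nested loop, not the thread's own solution. It assumes a POSIX shell and plain awk, passes the second file with -v so it is never treated as main input, prints labels instead of writing common.out and diff.out, and uses hypothetical sample data with some overlap:

```shell
tmp=$(mktemp -d)
# Hypothetical sample data with some overlap (the files in the thread share no lines).
printf '1\n2\n3\n4\n' > "$tmp/file1.in"
printf '3\n4\n5\n' > "$tmp/file2.in"

result=$(awk -v file2="$tmp/file2.in" '
BEGIN {
    # Read the second file once and remember every line as an array key.
    while ((getline line < file2) > 0) in_file2[line] = 1
    close(file2)
}
{
    # Main input is the first file only; each line is classified exactly once.
    if ($0 in in_file2) {
        print "common: " $0
        matched[$0] = 1
    } else {
        print "only in file1: " $0
    }
}
END {
    # Whatever was never matched exists only in the second file.
    for (line in in_file2)
        if (!(line in matched)) print "only in file2: " line
}' "$tmp/file1.in")
echo "$result"
rm -rf "$tmp"
```

With this data, 1 and 2 are reported as only in file1, 3 and 4 as common, and 5 as only in file2. The order of a for (line in array) loop is unspecified in awk, so with several file2-only lines their order may vary. The second file is held in memory, which is usually acceptable for lists of this kind.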

 