understanding check between two files

Status
Not open for further replies.

LAdProg2005

Programmer
Feb 20, 2006
US
I am using a simple script to check the differences between two files, as below:

Code:
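#!/usr/bin/perl
use strict;
use warnings;

# Read both files (paths given on the command line) into arrays of lines
open my $fh_a, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
open my $fh_b, '<', $ARGV[1] or die "Cannot open $ARGV[1]: $!\n";
chomp( my @a = <$fh_a> );
chomp( my @b = <$fh_b> );

# Make hash of B
my %b = map { $_ => 1 } @b;

# Everything in A not in B
my @res = grep { !exists $b{$_} } @a;
print join( "\n", @res ), "\n";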

File 1:
apple|1
banana|1
kiwi|3
File 2:
banana|2
kiwi|3

When I run it, the result is (the notes after the dash are mine):
apple|1 - removed in file 2
banana|1 - changed in file 2

Looking at the output alone, you can't tell whether a line was updated or deleted.

How do I differentiate between a record that was deleted and one that was changed? I can tell with the data above because it is only a couple of lines, but with many lines it would not be practical to eyeball.

thanks,
LAd
 
If this is any unix-flavored system (including OS X) you'll probably have a `diff` command... you can just use that.

Code:
diff -bu file1.txt file2.txt

It will tell you the differences between the files. The general usage is:

Code:
diff [options] <original file> <new file>

So if you were comparing two versions of a source file, where file 1 is the original/old version and file 2 is the newer version, it would show + marks on lines that were added and - marks on lines that were deleted. A changed line shows up as a - line followed by a + line.

Kirsle.net | My personal homepage
Code:
perl -e '$|=$i=1;print" oo\n<|>\n_|_";x:sleep$|;print"\b",$i++%2?"/":"_";goto x;'
 
Well, the process will be automated, since the files need to be checked often, so the comparison has to live in a script.

 
You can automate it via cron, or just make your Perl script run `diff` with backticks and capture its output.

Code:
my $diff_output = `diff -bu file1 file2`;
print $diff_output;
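
If the script needs to act on the differences rather than just print them, the captured text can be examined line by line. The following is only a rough sketch: it assumes unified output (`diff -u`), records of the form `key|value` with the key in the first column, and a hypothetical helper named `classify_diff` that is not part of any library. A key that appears on both a - line and a + line is treated as updated. The sample text is hand-written in the shape `diff -u` prints, so the sketch runs without calling `diff`.

```perl
use strict;
use warnings;

# Hypothetical helper: turn `diff -u` text into a hash of key => status.
# Assumes one record per line, shaped like "key|value".
sub classify_diff {
    my ($diff_output) = @_;
    my ( %removed, %added, %status );

    for my $line ( split /\n/, $diff_output ) {
        # Skip the "--- file", "+++ file" and "@@ ... @@" header lines.
        next if $line =~ /^(?:--- |\+\+\+ |@@ )/;

        # Only lines marked "-" or "+" matter; context lines are skipped.
        my ( $mark, $record ) = $line =~ /^([-+])(.*)/ or next;
        my ($key) = split /\|/, $record;
        next unless defined $key;

        if ( $mark eq '-' ) { $removed{$key} = 1 }
        else                { $added{$key}   = 1 }
    }

    # Removed and added again means the value changed.
    for my $key ( keys %removed ) {
        $status{$key} = $added{$key} ? 'updated' : 'deleted';
    }
    for my $key ( keys %added ) {
        $status{$key} = 'inserted' unless $removed{$key};
    }
    return \%status;
}

# Hand-written sample in unified-diff shape (real headers also carry timestamps).
my $sample = <<'END';
--- file1
+++ file2
@@ -1,3 +1,2 @@
-apple|1
-banana|1
+banana|2
 kiwi|3
END

my $status = classify_diff($sample);
print "$_ was $status->{$_}\n" for sort keys %$status;
```

With real files the argument would be the captured backtick output, e.g. classify_diff($diff_output). A record whose own text starts with "-- " or "++ " would be mistaken for a header line, so this is not robust for arbitrary data.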

 
OK, so with that logic, how do I process the output so that I can say:

if a line was removed, do one thing
if a line is new, do a second thing
if a line was updated, do a third thing
 
You should better specify your goal and the structure of your files. Do you want to execute a command for every record that's not the same in both? Or do you want to create a third file with everything that's in both? Or is one of the two a master file, and the second one should be modified with the values from the master? Or...
If both files are collections of key-value pairs as in your example, one could create a single hash whose values are arrays holding the two values, one from each file; with this you can do what you want (but please be clear on what you want!).
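
A minimal sketch of that single-hash idea, under some assumptions: the sample lines are taken from the question, every line contains a `|`, and `merge_records` is a made-up name for illustration. Each key maps to a two-element array; a slot stays undef when the key is missing from that file.

```perl
use strict;
use warnings;

# Sample data from the thread; in practice these lines would be read from the two files.
my @file_1 = ( 'apple|1', 'banana|1', 'kiwi|3' );
my @file_2 = ( 'banana|2', 'kiwi|3' );

# Build one hash: key => [ value in file 1, value in file 2 ].
sub merge_records {
    my ( $lines_1, $lines_2 ) = @_;
    my %pair;
    my $slot = 0;    # 0 while reading file 1, 1 while reading file 2
    for my $lines ( $lines_1, $lines_2 ) {
        for my $line (@$lines) {
            my ( $key, $val ) = split /\|/, $line, 2;
            $pair{$key} ||= [ undef, undef ];
            $pair{$key}[$slot] = $val;
        }
        $slot++;
    }
    return \%pair;
}

my $pair = merge_records( \@file_1, \@file_2 );
for my $key ( sort keys %$pair ) {
    my ( $old, $new ) = @{ $pair->{$key} };
    printf "%s: file 1 = %s, file 2 = %s\n", $key,
      defined $old ? $old : '(missing)',
      defined $new ? $new : '(missing)';
}
```

Once both values sit side by side, deciding between deleted, inserted, updated and unchanged is a matter of checking which slot is undef and whether the two values are equal.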
Note BTW that the existence of a key in a hash should not be tested with `defined`, but with `exists`.
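
The difference only shows when a key is present but its value is undef. A small illustration with made-up keys (in the original script every value is 1, which is why `defined` happened to work there):

```perl
use strict;
use warnings;

# A key can be present while its value is undef.
my %b = ( banana => 2, orphan => undef );

for my $key (qw(banana orphan missing)) {
    printf "%-8s exists=%d defined=%d\n", $key,
      exists $b{$key}  ? 1 : 0,
      defined $b{$key} ? 1 : 0;
}
```

Here `orphan` is reported as existing but not defined, while `missing` is neither.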

Franco
: Online engineering calculations
: Magnetic brakes for fun rides
: Air bearing pads
 
"You should better specify your goal and the structure of your files."
My goal is to compare two files and find the differences between their records. File 1 is the master and file 2 is used to update the records. File 2 indicates that a record was updated (the record needs to be updated from the master), removed (the record is no longer needed, so it is not in file 2), or added (a new record that doesn't exist in the master file).

"(but please be clear on what you want!)."
OK, let me try to explain again.
Basically, I have two files of records. File 1 is older and file 2 is newer/updated. (A record doesn't have to have exactly two columns; it could have three or four, with one column as the unique key. In this case it is the first column.)

File 1:
apple|1
banana|1
kiwi|3

File 2:
banana|2
kiwi|3
strawberry|1

File 3 (or a local variable that can be accessed later):
-> create a third file, or logic, that stores this info
banana|2 was updated
kiwi|3 was ignored no changes to it
strawberry|1 was inserted
apple|1 was deleted
 
If you have GNU diff available, you can use format specifiers like this (note that a changed record is reported as one deleted line plus one inserted line):

Code:
diff --old-line-format="%l was deleted%c'\012'" --new-line-format="%l was inserted%c'\012'" --unchanged-line-format="%l was ignored no changes to it%c'\012'" file1 file2

# or like this, if you prefer the format
diff --old-line-format="%l was deleted
" --new-line-format="%l was inserted
" --unchanged-line-format="%l was ignored no changes to it
" file1 file2

Annihilannic.
 
I'm not sure why you do not want to use the diff command. Anyway, here's a crude way of doing what you want.
Code:
#!/usr/bin/perl

use strict;
use warnings;

my $file_1 =<<file_a;
apple|1
banana|1
kiwi|3
file_a

my $file_2 =<<file_b;
banana|2
kiwi|3
strawberry|1
file_b

# Load the old file into a hash keyed on the first column.
my %record_1;
open( my $fh_1, '<', \$file_1 ) or die "Error: Cannot open file 1: $!\n";
while (<$fh_1>) {
    chomp;
    my ($key, $val) = split( /\|/ );
    $record_1{$key} = $val;
}
close( $fh_1 );

my %record_2;
open( my $fh_2, '<', \$file_2 ) or die "Error: Cannot open file 2: $!\n";
while (my $line = <$fh_2>) {
    chomp $line;
    my ($key, $val) = split( /\|/, $line );

    $record_2{$key} = $val;

    # Key only in file 2: keep it in %record_2 and report it as inserted below.
    next unless exists $record_1{$key};

    # Key in both files: compare the values as strings and report.
    if ( $record_1{$key} eq $val ) {
        print "$line has not changed.\n";
    }
    else {
        print "$line was updated.\n";
    }

    # Drop the key from both hashes so only insertions and deletions remain.
    delete $record_1{$key};
    delete $record_2{$key};
}
close( $fh_2 );

# Keys left in %record_2 exist only in file 2.
for my $i (sort keys %record_2) {
    print "$i|$record_2{$i} was inserted.\n";
}

# Keys left in %record_1 exist only in file 1.
for my $i (sort keys %record_1) {
    print "$i|$record_1{$i} was deleted.\n";
}
 
Well...it seems to me that your master is file2, as you are updating file1 with the information contained in file2.
Anyway the logic I would use is as proposed by bichonfrise74 (and by me above), except that it can be simplified as follows:
Code:
use strict;
my$file_1 =<<file_a;
apple|1
banana|1
kiwi|3
file_a

my$file_2 =<<file_b;
banana|2
kiwi|3
strawberry|1
file_b

# %result maps key => [value, status]
my(%result,$key,$val,$file);

# Start by assuming every record of the new file was inserted
open($file,'<',\$file_2)or die"Error: Cannot open file 2: $!\n";
while(<$file>){
  chomp;
  ($key,$val)=split(/\|/);
  $result{$key}=[$val,'was inserted'];
}
close($file);

# Then correct the status of each record using the old file
open($file,'<',\$file_1)or die"Error: Cannot open file 1: $!\n";
while(<$file>){
  chomp;
  ($key,$val)=split(/\|/);
  if(exists$result{$key}){
    # string comparison, so non-numeric values are handled too
    if($val eq $result{$key}[0]){
      $result{$key}[1]='was ignored no changes to it';
    }else{
      $result{$key}[1]='was updated';
    }
  }else{
    $result{$key}=[$val,'was deleted'];
  }
}
close($file);
for(sort keys%result){
  print"$_|@{$result{$_}}\n";
}
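
Since the status ends up in the second slot of each array, it can also drive the "do one thing / second thing / third thing" processing asked about earlier. A rough sketch using a dispatch table: the handler bodies and the `@log` array are placeholders invented for illustration, and `%result` is filled in by hand here with the same shape the code above produces.

```perl
use strict;
use warnings;

# Hypothetical handlers; replace the bodies with the real work for each case.
my @log;
my %action = (
    'was inserted' => sub { push @log, "INSERT $_[0]|$_[1]" },
    'was updated'  => sub { push @log, "UPDATE $_[0]|$_[1]" },
    'was deleted'  => sub { push @log, "DELETE $_[0]|$_[1]" },
);

# Same shape as %result above: key => [ value, status ].
my %result = (
    apple      => [ 1, 'was deleted' ],
    banana     => [ 2, 'was updated' ],
    kiwi       => [ 3, 'was ignored no changes to it' ],
    strawberry => [ 1, 'was inserted' ],
);

for my $key ( sort keys %result ) {
    my ( $val, $status ) = @{ $result{$key} };
    my $handler = $action{$status} or next;    # no handler: nothing to do
    $handler->( $key, $val );
}
print "$_\n" for @log;
```

Unchanged records have no entry in the table, so they are simply skipped.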

 
Thank you both. It works very well. Thanks for being patient with me.

Just one quick question: the logic wouldn't work if the data looked like the samples below, would it? What would I have to change to take that into account? I am thinking ($key,$val)=split(/\|/); would need another variable, but would the comparison logic have to be updated as well?

my$file_1 =<<file_a;
apple|1|s
banana|1|s
kiwi|3|s
file_a

my$file_2 =<<file_b;
banana|2|s
kiwi|3|b
strawberry|1|s
file_b

Thanks.
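
For data like the above, one possible approach is to give split a limit of 2, so that everything after the first column stays together as one string; the comparison then only needs `eq` instead of `==`, and no extra variable per column. This is a sketch only: it assumes the first column is the unique key and every line has at least one more column, and `load_records` and `compare_records` are names made up for the example.

```perl
use strict;
use warnings;

# Split each record into the key (first column) and the rest of the line.
# The limit of 2 keeps all remaining columns, however many, in one string.
sub load_records {
    my ($text) = @_;
    my %record;
    for my $line ( split /\n/, $text ) {
        next unless length $line;    # ignore blank lines
        my ( $key, $rest ) = split /\|/, $line, 2;
        $record{$key} = $rest;
    }
    return \%record;
}

# Compare old and new records and return key => status.
sub compare_records {
    my ( $old, $new ) = @_;
    my %status;
    for my $key ( keys %$new ) {
        $status{$key} =
            !exists $old->{$key}         ? 'was inserted'
          : $old->{$key} eq $new->{$key} ? 'was ignored no changes to it'
          :                                'was updated';
    }
    for my $key ( keys %$old ) {
        $status{$key} = 'was deleted' unless exists $new->{$key};
    }
    return \%status;
}

my $old = load_records("apple|1|s\nbanana|1|s\nkiwi|3|s\n");
my $new = load_records("banana|2|s\nkiwi|3|b\nstrawberry|1|s\n");
my $status = compare_records( $old, $new );

for my $key ( sort keys %$status ) {
    my $rest = exists $new->{$key} ? $new->{$key} : $old->{$key};
    print "$key|$rest $status->{$key}\n";
}
```

With this, kiwi is reported as updated because its third column changed from s to b, even though the second column is the same.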
 
Also, thanks to all who made suggestions. The best way to learn is to look at multiple different ways of doing the same thing. It's just that I am new, and it takes me longer to figure out which way is better.

thanks again!!
 