comparing 2 files

pearlofperls · Dec 17, 2009

I have 2 files and I want to compare them.
Currently I cat the files, awk print $1, $2, and do: if file1 = file2 then fail, else exit 0.
What I want is to compare the values line by line, using column 1 as the reference key, and then:
values not matching is a success
values matching is a failure

Like a VLOOKUP in Excel.

file 1

scott 1.4
suzy 3.5
josh 5.3
amanda 4.2
luke 2.1

file 2

scott 4.1
suzy 3.5
josh 5.3
amanda 2.4
luke 1.2

Annihilannic · Dec 17, 2009

I would use awk for this too, but since you posted in the Perl forum:

Code:

#!/usr/bin/perl -w
use strict;

# Open both input files; report the reason if either cannot be read.
open my $fh1, '<', 'file1' or die "Cannot open file1: $!";
open my $fh2, '<', 'file2' or die "Cannot open file2: $!";

# Map each name (column 1) in file1 to its value (column 2).
my %file1;

while (<$fh1>) {
        my @a = split;
        $file1{$a[0]} = $a[1];
}

# Look up each name from file2 and compare the values numerically.
while (<$fh2>) {
        my @a = split;
        if ($file1{$a[0]} == $a[1]) {
                print "$a[0] is a success\n";
        } else {
                print "$a[0] is a failure\n";
        }
}

This does no error checking for missing values or anything like that...
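
For the missing-value case, here is a minimal sketch of one way it could be handled, written as a shell function around awk rather than in Perl. The function name compare_with_missing and the wording of the "missing" messages are assumptions made for illustration and are not from the thread. It keeps the same success/failure wording as the code above.

```shell
#!/bin/sh
# compare_with_missing is a hypothetical helper name, not from the thread.
# Usage: compare_with_missing file1 file2
compare_with_missing() {
    awk '
        # First file: remember the value for each name, ignoring short lines.
        FNR == NR { if (NF >= 2) ref[$1] = $2; next }
        # Second file: ignore blank or short lines.
        NF < 2 { next }
        # The name never appeared in the first file.
        !($1 in ref) { print $1, "is missing from the first file"; next }
        # The name is in both files: compare the values.
        { seen[$1] = 1; print $1, "is a", ($2 == ref[$1] ? "success" : "failure") }
        # Finally report names that only the first file contains.
        END { for (k in ref) if (!(k in seen)) print k, "is missing from the second file" }
    ' "$1" "$2"
}
```

Called as compare_with_missing file1 file2. Note that the FNR == NR test assumes the first file is not empty; with an empty first file, the second file would be read as if it were the first.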

Annihilannic.

pearlofperls · Dec 30, 2009

thx

curious how this would be done in awk and the pros/cons?

feherke · Dec 31, 2009

Hi

pearlofperls said:
curious how this would be done in awk and the pros/cons?

This is a rewrite of Annihilannic's code:

Code:

#!/usr/bin/awk -f

BEGIN {

  # Load file1: name (column 1) -> value (column 2).
  # "> 0" stops the loop on end of file and on a read error (getline returns -1).
  while ((getline s < "file1") > 0) {
    split(s,a)
    file1[a[1]] = a[2]
  }

  # Look up each name from file2 and compare the values.
  while ((getline s < "file2") > 0) {
    split(s,a)
    if (file1[a[1]] == a[2]) {
      print a[1],"is a success"
    } else {
      print a[1],"is a failure"
    }
  }

}

However, we usually prefer to use awk from the command line, like this:

Code:

awk 'FNR==NR{f[$1]=$2;next}{print $1,"is a",$2==f[$1]?"success":"failure"}' file1 file2
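
The one-liner works because FNR (the line number within the current file) equals NR (the overall line number) only while the first file is being read. The original question asked for the opposite convention, where a matching value is a failure, and for an exit status. A minimal sketch of that variant follows; the function name values_differ and the "matches" message are assumptions made for illustration, not something from the thread.

```shell
#!/bin/sh
# values_differ is a hypothetical helper name, not from the thread.
# Exit status 0 if no name has the same value in both files, 1 otherwise.
values_differ() {
    awk '
        # FNR == NR holds only while the first file is being read.
        FNR == NR { ref[$1] = $2; next }
        # Second file: a matching value counts as a failure here.
        ($1 in ref) && $2 == ref[$1] { print $1, "matches"; bad = 1 }
        # Turn the flag into the exit status.
        END { exit (bad ? 1 : 0) }
    ' "$1" "$2"
}
```

With the two sample files from the question, values_differ file1 file2 prints "suzy matches" and "josh matches" and returns 1, so it can be used directly in a shell if statement.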

By the way, in my case the same is true for perl:

Code:

perl -nae 'chomp;if($#ARGV==0){$f{$F[0]}=$F[1]}else{print$F[0]," is a ",$f{$F[0]}eq$F[1]?"success":"failure","\n"}' file1 file2

Regarding the pros/cons: as you can see, there are no big differences. perl is probably faster in most cases, and perl code is more portable (the various awk implementations are not fully compatible with each other). I would say that simple tools are more suitable for simple tasks.

Feherke.

http://free.rootshell.be/~feherke/
