Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Help comparing two files 2

Status
Not open for further replies.

Captainrave

Technical User
Nov 16, 2007
97
0
0
GB
Hi all,

I ran into a huge problem with my work today and think a Perl script will fix it and hopefully save me weeks of tedious work (and needless to say get the boss off my back!). Basically I am stuck comparing two files.

The first has two columns that look like:
3254 5041
8277 9047
9052 10056
10053 10427
10431 11270
11271 12011

The second has many columns that look like:
AAAAAA 443 526 511 516 513.5
TTTTT 629 1648 669 673 671
AAAAA 629 1648 694 698 696
AAAAA 629 1648 863 867 865
AAAAA 629 1648 894 898 896

If the number in the 1st column of the 1st file matches a number in the 2nd column of the 2nd file I want to save that entire line of the 2nd file (I assume to a new "outfile). So something like:

Code:
#!C:/Perl/bin/perl.exe -w

#Opening the CDS location file
print "Please type the filename of the CDS_LOCATION.xls file:";
$File1 = <STDIN>;
chomp $File1;

#Opening Comparison file
print "please type the filename of the comparison.csv file:";
$File2 = <STDIN>;
chomp $File2;

print "please type the filename to save the results to (.csv format !!important!!):";
$outfile = <STDIN>;
chomp $outfile;

open(CDSFILE,$File1);
open(COMPARISONFILE, $File2);
open(OUTFILE,">$outfile");

###########################
#      Output             #
###########################

while(<CDSFILE>){
  chomp;
  If CDSFile column 1 = COMPARISONFILE column 2 
  print that line of COMPARISONFILE to OUTFILE;
}

exit;

Problem is it has been a long while since I last wrote Perl script. Any help would be very much appreciated!!!
 
Perl:
use strict;
use warnings;

my %keys;

open (FILE1, "file1.txt") or die "Can't open FILE1";

while (<FILE1>) {
   my ($key, undef) = split;
   $keys{$key}++;
}

close(FILE1);

open (FILE2, "file2.txt") or die "Can't open FILE2";

while (<FILE2>) {
   my (undef, $key, undef) = split;
   print $_ if exists $keys{$key};
}

close (FILE2);
Not tested...

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
Thanks for the reply. I have been trying to implement that, but so far with little success. I will keep trying.
 
OK, I have changed my approach slightly (the mistake is bigger than I thought!). Firstly, both files are in CSV format.

The first now has three columns that look like:
3254 5041 -
8277 9047 -
9052 10056 +
10053 10427 -
10431 11270 +
11271 12011 +

The second has many columns that look like:
AAAAAA 443 526 511 516 513.5
TTTTT 629 1648 669 673 671
AAAAA 629 1648 694 698 696
AAAAA 629 1648 863 867 865
AAAAA 629 1648 894 898 896

What I am trying to do is this:

IF 3rd column in the 1st file is negative COMPARE the number in the 1st column of the 1st file to the number in the 2nd column of the second file.

IF all the above conditions are true I want it to go to column J in the second file (on the line it matched) and perform a calculation. I am still trying to work out the calculation!

ELSE, If there is no match on a line in the second file, I don't want the calculation performed, but do want the line kept and output!

So confusing! How would this even look in Perl? I really appreciate any help!
 
I don't think it makes much difference, but the calculation is just 100 minus the number in column J!
 
In the first file, is the first column always a unique value (3254 for example) or can it be repeated within the first column in the first file?

IF 3rd column in the 1st file is negative COMPARE the number in the 1st column of the 1st file to the number in the 2nd column of the second file.

Does the number in column one of the first file get compared to all the 2nd columns in the second file? Or just the corresponding line in the second file? Like line number 10 of file one is compared to line number 10 of second file, or what? And what do you mean by compare? Compare how? That they are equal in value or something different?

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
In the first file the number is always unique. In the second file the number can be repeated on multiple lines.

Yes, the number in the first file needs to get compared to all the lines in the second file.

Compare as in equal values.

When I say "do the calculation". I mean perform the calculation on column J for that line, and replace with the new value in J. If that makes sense?
 
Untested code. Change the split() delimter if the delimeter is not spaces:

Code:
use warnings;
use strict;
my %HASH1;

open (my $FILE1, "file1.txt") or die "Can't open FILE1: $!";

while (<$FILE1>) {
   chomp; 
   my ($key, $neg) = (split(/\s+/))[0,2];
   $HASH1{$key} = 1 if ($neg eq '-');
}

close $FILE1;

open (my $FILE2, "file2.txt") or die "Can't open FILE2: $!";

while (<$FILE2>) {
   chomp; 
   my @data = split(/\s+/);
   if ($HASH1{$data[1]} == 1) {
      # do something with column j of @data
   }
}

close $FILE2;

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Thanks Kevin. I am now having trouble with the minor technicalities of this script i.e. getting the line and calculation into the new file (I am calling it File 3). Not sure I have got the splits right either!

Code:
use warnings;
use strict;
my %HASH1;

open(OUTPUT,"+>File 3.csv");
chomp $FILE3;

open (my $FILE1, "File 1.xls") or die "Can't open FILE1: $!";

while (<$FILE1>) {
   chomp;
   my ($key, $neg) = (split(/\s+/))[0,2];
   $HASH1{$key} = 1 if ($neg eq '-');
}

close $FILE1;

open (my $FILE2, "File 2.csv") or die "Can't open FILE2: $!";

while (<$FILE2>) {
   chomp;
   my @data = split(/\,/);
   if ($HASH1{$data[1]} == 1) {
      calculate(100-$data[9])	
# put into column J and somehow replace the old value?
	  print OUTFILE $line       
# i.e. print line, but with the new column J calculation
   }
}else{
  print FILE3 $line; 
# If no match between line in file 1 and 2, print line unchanged
}

close $FILE2;
 
Revised version
Perl:
use warnings;
use strict;
my %keys;

open (FILE1, "file1.txt") or die "Can't open FILE1: $!";

while (<FILE1>) {
   chomp; 
   my ($keys, $neg) = (split(/\s+/))[0,2];
   $keys{$key}++ if ($neg eq '-');
}

close FILE1;

open (FILE2, "file2.txt") or die "Can't open FILE2: $!";
open (FILE3, ">file3.txt") or die "Can't open FILE3: $!";

while (<FILE2>) {
   chomp; 
   my @data = split(/\s+/);
   $data[9] = 100 - $data[9] if (exists $keys{$data[1]});
   print FILE3 join(' ', @data), "\n";
}

close FILE2;
close FILE3;
BTW, I don't want to seem ungrateful for the star and all, but have you ever considered writing any of the code yourself? This is supposed to be a help forum, not a free coding service...

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
I have modified it slightly to better suit my needs. I definitely understand how it works. However, I get the following error: "Global symbol "$key" requires explicit package name at ComplementDeletionV2.pl l
ine 10." What does that mean?

Code:
use warnings;
use strict;
my %keys;

open (FILE1, "File1.xls") or die "Can't open FILE1: $!";

while (<FILE1>) {
   chomp;
   my ($keys, $neg) = (split(/\s+/))[0,2];
   $keys{$key}++ if ($neg eq '-');
}

close FILE1;

open (FILE2, "File2.csv") or die "Can't open FILE2: $!";
open (FILE3, ">File3.csv") or die "Can't open FILE3: $!";

while (<FILE2>) {
   chomp;
   my @data = split(/\,/);
   $data[9] = 100 - $data[9] if (exists $keys{$data[1]});
   print FILE3 join(' ', @data), "\n";
}

close FILE2;
close FILE3;

I really need to script more. I have forgotten so much! I just don't need Perl much these days.
 
My bad
Code:
my ($key[red]s[/red], $neg) = (split(/\s+/))[0,2];
should be
Code:
my ($key, $neg) = (split(/\s+/))[0,2];
I just wrote it, I never said I'd tested it...[wink]

if the files are CSV, you will need to adjust all the split and join statements accordingly.

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
Have tested and it works like a charm! Sorry, I should have noticed that! Many thanks to everyone who helped :)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top