arrays and variables with pattern matching

Status
Not open for further replies.

stort

Programmer
Feb 17, 2006
10
GB
Right, here's the problem:

I have one file (input), which contains the keywords to be matched, and solution.java is the file I want to check for those keywords.

So far I have written a program that takes the first line (i.e. the first keyword) of input and loops through solution.java to see whether it occurs. The second line of input (i.e. the second keyword) is then checked against every line of solution.java, and so on for all lines of input.

So my program prints "yes the", $keyword, " exists";
for every line of solution.java in which the keyword is found.

This is great, but a counter would be far more useful.
If I knew how many lines there were in input, it would be easy to create $counter1, $counter2, etc., one for each line. The problem is that the number of lines in input can vary, so I need a way to create a separate counter for each line of input.

I hope I haven't lost anybody; any help would be gratefully received.

Below is my current loop. HTH
#####################
# lines is solution.java
# lines2 is input
for ($a=0; $a<$#lines+1; $a++){
    for ($b=0; $b<$#lines2+1; $b++){
        if ($lines[$a] =~ /$lines2[$b]/) {
            print $lines[$a];
            print "\nyes ", $lines2[$b], " exists\n\n";
        }
    }
}
#####################

cheers Stort
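
One way to get a separate counter per keyword without knowing in advance how many there are might be a hash keyed by the keyword itself. The following is only a minimal sketch of the loop above with such a hash added. The contents of `@lines` and `@lines2` are made-up sample data, and the `chomp` and `\Q...\E` quoting are assumptions about what is wanted, not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two files in the post.
my @lines  = ("private int[] grades;\n", "private double avg\n", "int total = grades.length;\n");
my @lines2 = ("grades\n", "avg\n", "missing\n");

# Strip trailing newlines so they do not become part of the pattern.
chomp(@lines2);

# One counter per keyword, however many keywords there happen to be.
my %count = map { $_ => 0 } @lines2;

for my $line (@lines) {
    for my $keyword (@lines2) {
        # \Q...\E quotes any regex metacharacters in the keyword.
        # This counts matching lines, not individual occurrences.
        $count{$keyword}++ if $line =~ /\Q$keyword\E/;
    }
}

print "$_: $count{$_}\n" for @lines2;
```

With the sample data this prints `grades: 2`, `avg: 1` and `missing: 0`. Because the hash is keyed by the keyword, the number of lines in input never needs to be known.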
 
I could hardly follow that.
But have a look at Tie::File on CPAN; I think it could help you.

Corwin
 
something like this?

Code:
#!/usr/bin/perl

undef $/;
$_ = <DATA>;
$/ = "\n";

@keywords = qw( cat dog chicken );

foreach $keyword (@keywords) {
  s/\b($keyword)\b/$found{$1}++/eg;
}

while (($key, $value) = each %found) {
  print "$key => $value\n";
}

__DATA__
blah blah blah
blah cat blah blah
blah blah cat blah dog blah
blah blah blah cat blah
blah blah

outputs:-

dog => 1
cat => 3

Kind Regards
Duncan
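
The substitution in the code above counts as a side effect of `s///e`, so it also rewrites the text in `$_`. If the text should stay untouched, a possible variant is to count matches directly. This is a sketch only: it uses the same sample text and keywords, and the variable `$text` and the `\Q...\E` quoting are additions for illustration.

```perl
use strict;
use warnings;

# Same sample text as in the post above, held in a string instead of DATA.
my $text = <<'END';
blah blah blah
blah cat blah blah
blah blah cat blah dog blah
blah blah blah cat blah
blah blah
END

my @keywords = qw( cat dog chicken );
my %found;

for my $keyword (@keywords) {
    # A /g match in list context returns every match; assigning that list
    # to () and then to a scalar yields the number of matches.
    my $n = () = $text =~ /\b\Q$keyword\E\b/g;
    $found{$keyword} = $n;
}

print "$_ => $found{$_}\n" for sort keys %found;
```

Here keywords that never occur are reported with a count of 0 (`chicken => 0`), and the keys are sorted so the output order is stable.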
 
post a few lines of your solution.java file for further help
 
solution.java will vary; it is a control file that changes every time the program is run.

So line one of input (there is only a single keyword on each line) is read in and used to search for all occurrences of that word in solution.java,

then line two of input, then line three of input, and so on.

I hope that helps.
 
Well, it still would have been useful to see a few lines of the solution.java file to get an idea of the structure and formatting of the lines. Without knowing that, I will assume each line of solution.java consists of space-delimited words and that you are not looking for any wildcard matching:

Code:
my @lines2 = qw(cat dog fish horse pig);
my %keywords = map {$_ => 0} @lines2;#hashify the keywords

open(JAVA,'<solution.java') or die "$!";
while(my $line = <JAVA>) {
   chomp($line);
   foreach my $word (split(/\s+/,$line)) {
      $keywords{$word}++;
   }
}
close(JAVA);
print "Keyword\tCount\n";
print "---------------\n";
print "$_\t$keywords{$_}\n" for (sort {$keywords{$a} <=> $keywords{$b}} keys %keywords);
 
the first few lines of solution.java look like this:
####################################

public static final int FAIL = 0;
public static final int PASS = 1;
public static final int DISTINCTION = 2;
public static final int[] CATEGORIES = {50,80,100};
/**
* Integer array for storing input numbers
*/
private int[] grades;
/**
* Integer array for storing category number
*/
private int[] cats = new int[3];
/**
* score average
*/
private double avg
####################################

Your solution seems to work fine, but I'm trying to understand it so I can adapt it. Could you please comment your code? Also, I am only really interested in the number of times the values in the lines2 array occur, not every string. But if I know what's going on, I can hopefully pick up the rest myself.

cheers
 
Code:
#Just some fake keywords
my @lines2 = qw(cat dog fish horse pig);
#here we build a hash where the keywords are
#the keys and give them an initial value of
#zero (0) for each word
my %keywords = map {$_ => 0} @lines2;#hashify the keywords

#open and read the java file
open(JAVA,'<solution.java') or die "$!";
while(my $line = <JAVA>) {
   #remove the newline if any
   chomp($line);
   #here each line is split into
   #separate "words" using whitespace as the
   #split pattern
   foreach my $word (split(/\s+/,$line)) {
      #this could maybe be done better but it increments
      #the value of each word by one
      $keywords{$word}++;
   }
}
#close file and print results sorted lowest to highest by frequency
close(JAVA);
print "Keyword\tCount\n";
print "---------------\n";
print "$_\t$keywords{$_}\n" for (sort {$keywords{$a} <=> $keywords{$b}} keys %keywords);

might want to change this line:

$keywords{$word}++;

to:

$keywords{$word}++ if (exists $keywords{$word});

This way you should only count words that were in the original array of keywords. If you only want the total number of matches instead of the total per keyword:

Code:
my @lines2 = qw(cat dog fish horse pig);
my %keywords = map {$_ => 0} @lines2;#hashify the keywords
my $total = 0;
open(JAVA,'<solution.java') or die "$!";
while(my $line = <JAVA>) {
   chomp($line);
   foreach my $word (split(/\s+/,$line)) {
      $total++ if (exists $keywords{$word});
   }
}
close(JAVA);
print "$total total matches.";
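
Splitting on whitespace leaves punctuation attached to the words (for instance `grades;` in the solution.java sample posted above), so a keyword such as `grades` would not be counted by an exact hash lookup. One possible alternative is to match the keywords with word boundaries. The sketch below is not the code from this thread: the keyword list is hypothetical, and the Java lines are a few copied from the sample and held in an array instead of being read from the file.

```perl
use strict;
use warnings;

# A few lines taken from the solution.java sample posted in this thread.
my @java = (
    'public static final int FAIL = 0;',
    'public static final int[] CATEGORIES = {50,80,100};',
    'private int[] grades;',
    'private int[] cats = new int[3];',
    'private double avg',
);

# Hypothetical keywords; in the real program these would come from input.
my @lines2   = qw(grades cats int avg);
my %keywords = map { $_ => 0 } @lines2;

# Build a single alternation so each line only has to be scanned once.
my $alternation = join '|', map { quotemeta($_) } @lines2;
my $pattern     = qr/\b($alternation)\b/;

for my $line (@java) {
    # Each successful /g match in scalar context advances through the line.
    $keywords{$1}++ while $line =~ /$pattern/g;
}

print "$_\t$keywords{$_}\n" for @lines2;
```

With these sample lines the counts are grades 1, cats 1, int 5 and avg 1. The match is case-sensitive, so `CATEGORIES` does not count towards `cats`.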
 