input from multiple text files

Status
Not open for further replies.

7Star

Technical User
Nov 17, 2005
13
SE
Hi,

I am trying to write a Perl program that counts words in text files and in parsed HTML files. For now I have only started on text files. If I specify multiple text files as arguments, the program counts the words from all the files and puts them in one hash. The problem is that I also want statistics for each text file separately when the -s option is given. For example, if I run my program with: ./program -s test.txt test2.txt

Then I want to count all the words in test.txt and test2.txt separately. I'm quite new to Perl, so any help would be much appreciated.

Here is my code:
Code:
#!/usr/local/bin/perl
use warnings;
use Getopt::Std;
use LWP::Simple;
use HTML::Parser;

getopts('is');

my @url = ();
my $pattern="http://";
foreach(@ARGV) {
	if($_=~ m/$pattern/) {
		push(@url,$_);
	}
	else {
		push(@textfile,$_);
	}
}

if(defined $opt_i){
	foreach (@textfile) {
		open(FILE,$_);
		@text=<FILE>;
		foreach (@text) {
			@word = split(/ /,$_);				 
		}
	foreach $word (@word) {
		 $stat{$word}++;
	}	
	}
	foreach $stat(%stat){
	print "$stat\n"; }
}
elsif(defined $opt_s) {
	# TODO: make statistics for each text file separately
}
 
Try this:

Code:
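my %wordCntPerFile;
if(defined $opt_i)
{
  foreach my $file (@textfile)
  {
    open(FILE,$file) or die "Cannot open $file: $!";
    my $count = 0;
    while (my $line = <FILE>)
    {
      # split ' ' breaks the line on any run of whitespace
      my @word = split(' ',$line);
      # add up every line, not just the last one read
      $count += scalar(@word);
    }
    $wordCntPerFile{$file} = $count;
    close(FILE);
  }
}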

I haven't tested it, but it should give you some idea.
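
If you want the word frequencies per file as well as the combined totals, one option is a hash of hashes keyed by file name. This is only a rough sketch: the helper name count_words and the variable names are made up for illustration, and it assumes "words" are simply whitespace-separated tokens.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original program): returns two hash
# references, the per-file word frequencies and the combined frequencies.
sub count_words {
    my @files = @_;
    my (%per_file, %total);
    foreach my $file (@files) {
        open(my $fh, '<', $file) or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            # split ' ' breaks on runs of whitespace and skips leading whitespace
            foreach my $word (split ' ', $line) {
                $per_file{$file}{$word}++;   # statistics for this file only
                $total{$word}++;             # statistics across all files
            }
        }
        close($fh);
    }
    return (\%per_file, \%total);
}

# Print the per-file statistics for the files named on the command line
my ($per_file, $total) = count_words(@ARGV);
foreach my $file (sort keys %$per_file) {
    print "$file:\n";
    foreach my $word (sort keys %{ $per_file->{$file} }) {
        print "  $word $per_file->{$file}{$word}\n";
    }
}
```

With -s you would print from the per-file hash as above, and otherwise from the combined one, so the files only have to be read once.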
 
Thanks cyan01, I will test this later. And to KevinADC: no, it isn't. The program is actually going to count text strings rather than real words; I have some sequences mixed with numbers and characters that I want to count.
 
OK, it does sound like classwork but if you say it's not that's good enough for me. :)
 
Just in case you want to take tabs etc. into consideration:
Code:
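if (@ARGV && $ARGV[0] eq "-s") {
   for(my $i=1;$i<=$#ARGV;$i++) {
         my $count = 0 ;   # start from zero for each file
         open(FH,$ARGV[$i]) or die "Cannot open $ARGV[$i]: $!" ;
          while(my $content = <FH>) {
                   # split ' ' breaks on any run of whitespace (space, tab, newline, form feed)
                   my @wcnt = split(' ',$content) ;
                   $count += scalar(@wcnt) ;
          }
          print "Words in $ARGV[$i] = $count\n" ;
          close(FH)    ;
   }
}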
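
To show why the split pattern matters for the count, here is a small sketch comparing three ways of splitting the same line. The sample line and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample line with leading spaces, a tab and a double space
my $line = "  foo\tbar  baz\n";

# A single literal space: leading and doubled spaces produce empty fields,
# and the tab does not separate foo from bar
my @by_space = split(/ /, $line);

# Any single whitespace character: tabs now separate words,
# but empty fields are still counted
my @by_char = split(/\s/, $line);

# The special ' ' form: splits on runs of whitespace and
# skips leading whitespace, so only real tokens remain
my @by_ws = split(' ', $line);

print "split / /  gives ", scalar(@by_space), " fields\n";
print "split /\\s/ gives ", scalar(@by_char), " fields\n";
print "split ' '  gives ", scalar(@by_ws), " fields\n";
```

Only the last form counts exactly the three tokens in the line, which is why it is usually the safer choice when counting words.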



--------------------------------------------------------------------------
I never set a goal because u never know whats going to happen tommorow.
 