perl - word frequency coding help

Status
Not open for further replies.

mkosloff (Programmer, US), Feb 20, 2003
I am working on making a simple program that reads a file and does these things:

Counts the number of lines that have "ing" in them
Counts the number of lines that begin with a capital letter
Counts the number of lines that begin with a space
Counts the number of lines that end with a period
Counts the number of lines that have words that begin with b and end with s
Counts the number of lines that have "tt" somewhere in them

I know there should be a simple coding method to this, but cannot find it in my books. If anyone can help me, I'd really appreciate it! Here's what I have so far:

init();

dPrint();

exit (0);

sub init()
{
open(TT, $ARGV[0]) or die "Can't open $ARGV[0]. \n";

@lines = <TT>;
close(TT);
}



sub dPrint()
{
my @words = split /\s+/, join("",@lines);

my %wc = ();
foreach my $word (@words)
{
if ($word =~ /ing/)
{
$wc{$word}++;
}
}

print "----\tWord \tCount ----\n";

foreach my $word (sort {$a cmp $b} keys %wc)
{
print "\t$word\t$wc{$word}\n";
}
}

 
Something like this:

while(<>){
    # lines that have "ing" in them
    $ing++ if /ing/;

    # lines that begin with a capital letter
    $start_cap++ if /^[A-Z]/;

    # lines that begin with a space
    $space++ if /^ /;

    # lines that end with a period
    $end_period++ if /\.$/;

    # lines that have words that begin with b and end with s
    # errm :) dunno, left as an exercise for the student

    # lines that have "tt" somewhere in them
    $tt++ if /tt/;
}
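
The case left as an exercise can probably be handled with word-boundary anchors. This is a minimal sketch, not the poster's code: it assumes a "word" is a run of \w characters and that the match should be case-sensitive (lowercase b and s only). The sub name has_b_s_word and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line contains a word that
# begins with "b" and ends with "s" (for example "books" or "bus").
sub has_b_s_word {
    my ($line) = @_;
    # \b is a word boundary, \w* is zero or more word characters
    return $line =~ /\bb\w*s\b/ ? 1 : 0;
}

# Count matching lines in a few sample lines (made up for illustration).
my $b_s = 0;
for my $line ("the books are here\n", "a bus stop\n", "absent minds\n", "Big bass\n") {
    $b_s++ if has_b_s_word($line);
}
print "b...s lines = $b_s\n";
```

Inside the while(<>) loop the same pattern could be used as $b_s++ if /\bb\w*s\b/; adding the /i modifier would also accept words starting with a capital B.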

Mike

Want to get great answers to your Tek-Tips questions? Have a look at faq219-2884

It's like this; even samurai have teddy bears, and even teddy bears get drunk.
 
Thanks! I am working on it now, but am having problems opening my file... grrr
 
If you want to open a file explicitly rather than read standard input (or the files named on the command line, which is what <> reads first), try this instead of just while(<>):

open(F,'file.txt') || die;
while(<F>)
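
A slightly fuller sketch of the same loop, assuming a lexical filehandle and the three-argument form of open rather than the bareword F. The temporary file exists only so the example runs on its own, and count_lines is a hypothetical name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: count the lines in a named file.
sub count_lines {
    my ($path) = @_;
    # Three-argument open with a lexical filehandle; $! explains any failure.
    open(my $in, '<', $path) or die "Cannot open '$path': $!";
    my $n = 0;
    while (my $line = <$in>) {
        $n++;
    }
    close($in);
    return $n;
}

# Create a throwaway input file so the sketch is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "first line\nsecond line\nthird line\n";
close($tmp);

print "lines = ", count_lines($path), "\n";
```

Reading into a named variable ($line) instead of $_ keeps the loop from clobbering $_ in calling code; the pattern tests would then be written as $line =~ /ing/ and so on.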

Mike

 
OK, thanks again! Here's where I'm at, but it's not even compiling and I'm feeling stuck... it reports a problem with the line above my first if statement. Thanks again for the help!


--------------------

dPrint();

exit (0);


sub dPrint()
{
open( F,'tt') || die

while (<F>)
{
my @word = split /\s+/, join("",@lines);

if ($word =~ /ing/)
{
$ing++;
}

if ($word =~ /^[A-Z]/)
{
$cap++;
}

if ($word =~ /^ /)
{
$space++;
}

if ($word =~ /\.$/)
{
$period++;
}

if ($word =~ /tt/)
{
$tts++;
}
}

print "----\tWord \tCount ----\n";

foreach my $word (sort {$a cmp $b} keys 7)
{
print "\ting = $ing\n";
print "\tcap = $cap\n";
print "\tspace = $space\n";
print "\tperiod = $period\n";
print "\ttts = $tts\n";
}
}
 
Two things. First, the line below needs a semicolon at the end. It is also a good idea to die with an error message, like "Cannot open 'tt': $!". The Perl special variable $! holds the error from the most recent failed system call. In numeric context it gives the error number, if you need to check for something specific. More commonly it is used in string context, where it gives an often useful message such as "No such file or directory" or "Operation not permitted".
Code:
open( F,'tt') || die "Cannot open 'tt': $!";
And two, @lines isn't defined anywhere before it gets used. Include
Code:
use strict;
and
Code:
use warnings;
at the top of your scripts, read the error messages they love to produce, and you'll get cleaner, more easily debugged code.

But the lack of semicolon is likely the error. Always look a few lines above and a few lines below the reported error line.
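
Putting the advice in this thread together, one possible shape for the whole script is sketched below. It is an illustration under stated assumptions, not the original poster's final code: each pattern is tested against the current line rather than a split word list, the b...s pattern /\bb\w*s\b/ is an assumed reading of "words that begin with b and end with s", and the sub name count_lines_matching, the hash keys, and the sample text are all hypothetical. An in-memory filehandle stands in for the real file so the example runs by itself.

```perl
use strict;
use warnings;

# Hypothetical helper: tally the six kinds of line read from an open filehandle.
sub count_lines_matching {
    my ($in) = @_;
    my %count = map { $_ => 0 } qw(ing cap space period b_s tt);
    while (my $line = <$in>) {
        chomp $line;
        $count{ing}++    if $line =~ /ing/;         # contains "ing"
        $count{cap}++    if $line =~ /^[A-Z]/;      # begins with a capital letter
        $count{space}++  if $line =~ /^ /;          # begins with a space
        $count{period}++ if $line =~ /\.$/;         # ends with a period
        $count{b_s}++    if $line =~ /\bb\w*s\b/;   # has a word b...s (assumed pattern)
        $count{tt}++     if $line =~ /tt/;          # contains "tt"
    }
    return %count;
}

# Sample text in an in-memory filehandle; a real script would instead use
#   open(my $in, '<', 'tt') or die "Cannot open 'tt': $!";
my $text = "Sitting on the bus.\n a little bit\nbooks and letters\nNothing here\n";
open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";
my %count = count_lines_matching($in);
close($in);

print "$_ = $count{$_}\n" for sort keys %count;
```

With use strict in place, a stray name such as @lines would be reported at compile time instead of silently being treated as an empty array.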

----------------------------------------------------------------------------------
...but I'm just a C man trying to see the light
 