
count number of occurrences of string in file

Status
Not open for further replies.

mgp77

Programmer
Nov 25, 2005
52
CA
Hello,

I realize that I can write a function to do this, but I was wondering if there is an existing Perl function that will let me count the number of occurrences of a particular string in a file. Currently I have the file open, I extract a string from a line, and then I want to check the remainder of the file to see if the string appears again. If it doesn't, I want to do something with the string and then continue with the next line, extract its string, and so on. Ideally I don't want to open the file, extract the string, and then close it and open it again to check whether the string appears more than once; I want to do it all in one step. If I open it twice I will lose my position in the file and won't be able to extract the next string. Any and all help is appreciated.

Thanks
 
There very well could be a module written to do this, or a well-known method. Can you post the code you have been using?
 
Here is what I'm trying, but it doesn't appear to be working too well:

Code:
use strict;
sub checkForMultipleLabels($);

my %labelNames = ();
open(FILE, "<$ARGV[0]") || die("Cannot open file $ARGV[0]");
open(OUT, ">labelsAndColumns.txt") || die("Cannot open output file");
open(OUT2, ">labels.txt") || die("Cannot open output file");

while (<FILE>)
{	
	if ($_ =~ /Column\s*:\s*(.+)\s{2}\(".*"\.".*""(.*)"\)/i)
	{
		my $count = checkForMultipleLabels($1);
		#label names will only be listed once
		#if (exists $labelNames{$1})
		#{
		#	if ($labelNames{$1} ne $2)
		#	{
		#		print OUT2 "The label $1 has the value $labelNames{$1} stored and also found the value $2\n";
		#	}
		#}
		if ($count eq "yes")
		{
			print OUT2 "The label $1 has the value $2 and also found other different values\n"
		}
		else
		{
			print OUT "$2\t$1\n";
			$labelNames{$1} = $2;
		}
	}

}
close FILE;
close OUT;
close OUT2;

#for my $key ( keys %labelNames ) 
#{
#	print OUT2 "$key\n";
#}

sub checkForMultipleLabels($)
{
	my $string = shift;
	my $count = 0;
	open(FILE2, "<$ARGV[0]") || die("Cannot open file $ARGV[0]");
	open(OUT3, ">test.txt") || die("Cannot open file test.txt");
	while(<FILE2>)
	{
		if ($_ =~ /Column\s*:\s*$string\s{2}\(".*"\.".*""(.*)"\)/i)
		{
			print OUT3 "String $string was found in line $_\n";
			$count++;
		}
		if ($count > 1)
		{
			close FILE2;
			close OUT3;
			return "yes";
		}
		
	}
	close OUT3;
	close FILE2;
	return "no";
}
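
An alternative that avoids reopening the file for every label is to read it once and remember, per label, every distinct value seen. The following is only a minimal sketch under assumptions: the sample lines are hypothetical and shaped purely to satisfy the regex from the post (the real file format is not shown in the thread), and the helper name collect_labels is made up for illustration.

```perl
use strict;
use warnings;

# Regex from the post: $1 is the label, $2 is its value.
my $column_re = qr/Column\s*:\s*(.+)\s{2}\(".*"\.".*""(.*)"\)/i;

# Hypothetical helper: scan the lines once and record every
# distinct value seen for each label.
sub collect_labels {
    my @lines = @_;
    my %values_for;    # label => { value => number of times seen }
    for my $line (@lines) {
        next unless $line =~ $column_re;
        my ($label, $value) = ($1, $2);
        $values_for{$label}{$value}++;
    }
    return \%values_for;
}

# Hypothetical sample lines, shaped only to satisfy the regex above.
my @sample = (
    qq{Column : NAME  ("db"."tbl""Customer Name")\n},
    qq{Column : CITY  ("db"."tbl""City")\n},
    qq{Column : NAME  ("db"."tbl""Client Name")\n},
);

my $values_for = collect_labels(@sample);
for my $label (sort keys %$values_for) {
    my @values = sort keys %{ $values_for->{$label} };
    if (@values > 1) {
        # The same label appeared with more than one value.
        print "The label $label has different values: @values\n";
    }
    else {
        print "$values[0]\t$label\n";
    }
}
```

With a real file, the lines could be passed in as collect_labels(<$fh>) after opening $ARGV[0] once. Because the whole file is summarized in one hash, the position in the file never has to be saved or restored.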
 

You could use this.


my $string = "words";
my $var = qx[grep -r '$string' file.txt | wc -l];

Once you have the string, grep for it in the file and do a line count.
That should give you the number of lines in the file that contain the string, and then you can go to the next line and repeat the process.
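
For systems without an external grep, the same count can be sketched in plain Perl. This is an illustration under assumptions, not a drop-in replacement: the sample lines stand in for the contents of file.txt, and the helper names count_occurrences and count_matching_lines are made up. It also shows the difference between counting occurrences and counting matching lines, which is what the grep/wc pipeline reports.

```perl
use strict;
use warnings;

# Count how many times $needle occurs in $text (non-overlapping matches).
sub count_occurrences {
    my ($text, $needle) = @_;
    # \Q...\E quotes regex metacharacters so $needle is matched literally.
    # Assigning the match list to () and then to a scalar yields its size.
    my $count = () = $text =~ /\Q$needle\E/g;
    return $count;
}

# Count how many lines contain $needle at least once.
sub count_matching_lines {
    my ($needle, @lines) = @_;
    return scalar grep { /\Q$needle\E/ } @lines;
}

# Hypothetical sample data standing in for the contents of file.txt.
my @lines = ("some words and more words\n", "no match here\n", "words\n");

my $occurrences = count_occurrences(join('', @lines), "words");
my $line_count  = count_matching_lines("words", @lines);
print "occurrences: $occurrences, matching lines: $line_count\n";
```

Both helpers use only built-in Perl features, so they behave the same on Windows and Unix.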




 
I see. I'm working on Windows, so would that solution still work, considering Windows doesn't have the grep command? Also, my situation is a little more complex: I want to search for the string in the file, but it has to appear in a certain context, and I only want to count it if one specific element in the context is different. I'll attempt to explain. My regex from above is

$_ =~ /Column\s*:\s*(.+)\s{2}\(".*"\.".*""(.*)"\)/i

Now I only want to count occurrences where the string I'm searching for appears where the $1 capture in the regex above would be, and only if the $2 portion of the regex is different from a specific string (e.g. hello). Does that make any sense? Can I somehow use the grep command to accomplish this? All help is appreciated!
 
You can use Perl's grep function instead of the operating system's grep command, but that's not to say it would work for your particular application. Your explanation makes sense, but it's still confusing. Maybe some example lines of data, instead of the regexp, would help clear it up.
 

open(DAT, "data.txt") || die "Cannot open data.txt: $!";

@local = <DAT>;

$var = scalar grep(/Data.*and/, @local);

print $var;

Replace 'Data.*and' with your own string or regular expression.
 
Can you explain to me what is happening in these lines?

@local = <DAT>;

$var = scalar grep(/Data.*and/, @local);

Is that opening the file, assigning each line of the file to an element of the array, then going through the elements one by one to see whether the regex /Data.*and/ matches, and returning the total number of matches? If so, that is getting closer to what I want, but I would want an if clause in there that checks each match to see whether a specific part of the regex, the $2 portion in my post above, has a particular value, and counts only those that are not equal to that value.
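
The kind of conditional count described here can be sketched by giving grep a block instead of a bare regex, so each line is tested against the pattern and its captures. This is a hedged illustration: the sample lines are hypothetical and built only to satisfy the regex from the earlier post, and count_other_values is a made-up helper name.

```perl
use strict;
use warnings;

# Regex from the earlier post: $1 is the label, $2 is its value.
my $column_re = qr/Column\s*:\s*(.+)\s{2}\(".*"\.".*""(.*)"\)/i;

# Count lines whose label ($1) equals $label and whose value ($2)
# is different from $skip_value.
sub count_other_values {
    my ($label, $skip_value, @lines) = @_;
    return scalar grep {
        $_ =~ $column_re && $1 eq $label && $2 ne $skip_value
    } @lines;
}

# Hypothetical sample lines standing in for the contents of the file.
my @local = (
    qq{Column : NAME  ("db"."tbl""hello")\n},
    qq{Column : NAME  ("db"."tbl""goodbye")\n},
    qq{Column : CITY  ("db"."tbl""goodbye")\n},
    qq{Column : NAME  ("db"."tbl""welcome")\n},
);

# Count NAME lines whose value is anything other than "hello".
my $var = count_other_values('NAME', 'hello', @local);
print "$var\n";
```

In scalar context grep returns the number of elements for which the block was true, so no separate counter variable is needed.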
 
Since you are working on Windows, you could try Cygwin, which gives you whatever Unix command you want.
Then it is just a matter of

grep -c 'my string' myfile.txt

Or you can give us a sample of the file and the string you are looking for, and we can try to find a search pattern together.


``The wise man doesn't give the right answers,
he poses the right questions.''
TIMTOWTDI
 
I found a great little application, Windows Grep, that has the functionality you might need. It's a bit more powerful than standard Unix grep in that it has replace, save, etc.
It's not the world's fastest program, but for a few files it's perfectly adequate.


There are a number of grep-for-Windows programs out there, but this one has worked well for me.

Cheers, - Happy searching,
George
 