Thanks to some help from PaulTEG and others, I have finally gotten my extraction program to find the word I am looking for. The remaining problem is that I cannot extract as many of the words that follow it as I want.
Basically, I am matching the word 'located' in a text document and I want to extract up to 10 words after it. I have used the (\w+) capture group, but it only gives me one word after 'located'. When I try adding a second (\w+), it actually takes away the first word. If anyone knows how to capture up to ten words after the matched word, please help. This is my code so far:
Code:
open(INFILE, "<", "utah.txt") or die "The file cannot be found.";
open(OUTFILE, ">>", "utah_locations.txt") or die "The output file cannot be opened.";
$count = 0;
while (<INFILE>) {
    chomp($_);
    # Capture the single word that follows 'located'
    if (/located (\w+)/g) {
        print OUTFILE "located $1\n";
        $count++;
    }
}
print "There were $count matches";
close INFILE;
close OUTFILE;
Now, this pulls and prints the word 'located' and one word after it. I would like to be able to pull up to ten words after it, so that instead of getting:
located at
located in
located around
located near
located beside
I would get:
located at the corner of Marbach and Maple
located in the deli inside the Supermarket on Broadway.
located around the corner at the bakery on Patterson Avenue.
Any help is greatly appreciated.
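For reference, one possible way to get the output described above is to repeat a "whitespace plus word" group with a bounded quantifier instead of adding more (\w+) groups. The following is only a minimal sketch under some assumptions: words are taken to be runs of non-whitespace (\S+) so that trailing punctuation such as "Broadway." is kept, 'located' must start at a word boundary, and the helper name words_after_located is made up for illustration and is not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns 'located'
# followed by up to $max words, or undef when the line has no match.
sub words_after_located {
    my ($line, $max) = @_;
    $max = 10 unless defined $max;

    # (?:\s+\S+) is one "whitespace + word" unit; {1,$max} repeats it
    # greedily, so at least one and at most $max words are captured.
    # \S+ (assumption) keeps punctuation attached to its word.
    if ($line =~ /\b(located(?:\s+\S+){1,$max})/) {
        return $1;
    }
    return;
}

# Example call on one of the sample lines from the post
my $hit = words_after_located("It is located at the corner of Marbach and Maple");
print "$hit\n" if defined $hit;
```

In the loop above, the same pattern could stand in for /located (\w+)/, printing $1 directly rather than "located $1". Because the quantifier is greedy, a line with more than ten words after 'located' is cut off after the tenth, and a line with fewer simply yields what is there.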