
Using the substr() function

Status
Not open for further replies.

forces1

Programmer
Apr 5, 2007
29
NL
Hi all,
I have a search function for a part of the database which is (most of the time) a large amount of text. The search works great, but I only want to show the part of the text where the search words are found. That already works for a one-word search when I use this code:
Code:
my $string = $linkinfo[3];
my $length = 200;
my $fragment = substr $string, (index($string, $search) - 15), $length;
$fragment = "...$fragment...";

foreach my $line (@searchterms) {
    $fragment =~ s/($line)/<b>$1<\/b>/ig;
}
The foreach $line (@searchterms) loop makes the search terms bold. But I also want this to work for a two-word search, like double date. If those two words are not next to each other in the text, nothing shows up. So what I would like is something like this:
Code:
foreach my $line (@searchterms) {
    my $string = $linkinfo[3];
    my $length = 200;
    my $fragment = substr $string, (index($string, $search) - 15), $length;
    $fragment = "...$fragment...";

    $fragment =~ s/($line)/<b>$1<\/b>/ig;
}
so that it searches the text for each $line in @searchterms, and not only for the combination of those two words.

I hope that you can help me with this. You've proven to be the best place for help with these kinds of problems before.
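
A side note on the substr call in the snippets above: index() returns -1 when the search string is not found, and a negative offset makes substr count from the end of the string, so a hit within the first 15 characters (or no hit at all) yields a fragment from the wrong place. Below is a minimal sketch of one way to guard against that. The helper name fragment_around and its parameters are made up for illustration, and lower-casing both sides is an assumption chosen to mirror the /i flag used for the bolding.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): cut a fragment of up to $length
# characters, starting $lead characters before the first hit of $search.
sub fragment_around {
    my ($string, $search, $lead, $length) = @_;

    # index() returns -1 when $search is not found; compare case-insensitively.
    my $pos = index(lc $string, lc $search);
    return '' if $pos < 0;

    # Clamp at 0: a negative offset would make substr count from the end.
    my $start = $pos - $lead;
    $start = 0 if $start < 0;

    return '...' . substr($string, $start, $length) . '...';
}

my $text = 'Hello World, this is me. Today is a wonderful day with a lot of sunshine and no rain.';
print fragment_around($text, 'Hello', 15, 30), "\n";
print fragment_around($text, 'wonderful', 15, 30), "\n";
```

Returning an empty string for a miss is just one choice; the caller could equally skip the record.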
 
Hey,

Here's something I made a while ago. Basically it allows one- or two-word searches (i.e. Hello OR Hello World etc.). If there could be any number of words to search for, the loop has to be adapted. The script then puts items in order of relevance: if an item contains both words it's printed first, if it contains just the first word it's printed second, and if it contains just the second word it's printed last.

Further adaptations can be made. For example, the script doesn't take into account the number of times a word occurs: the item "Hello Hello Hello Hello" would probably be more relevant than "Hello World" because the first word occurs 4 times, yet Hello World is printed before Hello Hello Hello Hello.

Anyway, this might help a bit with adapting your script so it can handle two-word searches too...

Code:
#!/usr/bin/perl
use strict;
use warnings;

my (@both_words, @first_word, @second_word);

my $search_for = "Hello World";

my @lines = ('Hello big World', 'Hello you', 'Bye everyone', 'Bye World', 'Hello World World', 'Hello people');

my @split_search = split(/ /, $search_for);

foreach (@lines) {
    # No /g here: in scalar context /g leaves pos() set on $_, so the next
    # match would start where the previous one ended and could miss a word.
    # \Q...\E quotes any regex metacharacters in the search words.
    if (/\Q$split_search[0]\E/i && /\Q$split_search[1]\E/i) {
        push(@both_words, $_);
    }
    elsif (/\Q$split_search[0]\E/i) {
        push(@first_word, $_);
    }
    elsif (/\Q$split_search[1]\E/i) {
        push(@second_word, $_);
    }
}

# Print in order of relevance: both words, first word only, second word only.
print "Content-type: text/html\n\n";
foreach (@both_words) {
    print "<p>$_</p>\n";
}
foreach (@first_word) {
    print "<p>$_</p>\n";
}
foreach (@second_word) {
    print "<p>$_</p>\n";
}

Chris
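
The two adaptations mentioned in this post (any number of search words, and counting how often each word occurs) could be sketched roughly as follows. This is only an illustration, not part of the original script: the helper name score_line and the ranking rule (number of distinct words found first, total number of hits second) are assumptions.

```perl
use strict;
use warnings;

my @words = split ' ', 'Hello World';
my @lines = ('Hello big World', 'Hello you', 'Bye everyone',
             'Bye World', 'Hello World World', 'Hello people');

# Hypothetical scoring rule: (distinct search words found, total hits).
sub score_line {
    my ($line) = @_;
    my ($distinct, $total) = (0, 0);
    for my $word (@words) {
        # List-context match with /g returns every hit; "= () =" counts them.
        my $hits = () = $line =~ /\Q$word\E/gi;
        $distinct++ if $hits;
        $total += $hits;
    }
    return ($distinct, $total);
}

my %score = map { $_ => [ score_line($_) ] } @lines;

# Keep only lines with at least one hit, best match first.
my @ranked = sort { $score{$b}[0] <=> $score{$a}[0]
                 || $score{$b}[1] <=> $score{$a}[1] }
             grep { $score{$_}[0] > 0 } @lines;

print "$_\n" for @ranked;
```

Lines with equal scores come out in no particular guaranteed order; add a further tie-breaker to the sort if that matters.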
 
Hmm, after reading your post again, I'm not sure my script will be of any use. But it might give you an idea of how to adapt your script. And I'm 99% sure someone else will provide you with the perfect solution :)

Chris
 
Personally, I don't understand the question. Maybe some examples of the data and the desired results would help clarify it.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Ok, maybe it wasn't so clear at all. Even so, thanks chrismassey for your help.
What I mean is when I have a large amount of text in my database, like:
Code:
Hello World, this is me. Today is a wonderful day with a lot of sunshine and no rain. I might go for a little walk to the forest etc etc.
and someone uses my search function to search for wonderful forest, I want this to be the output:
Code:
...is a [b]wonderful[/b] day with ... walk to the [b]forest[/b] etc...
like Google has.

With the code I provided, it will only show this kind of output when wonderful and forest are next to each other, because it searches for the combination of these two words and not for the two words separately. And the latter is what I want.

Thanks for all your help!
 
I think the simplest way to implement this is to use your one word search, and wrap it in a multi-word parser.

Although... your parser will match 'the' in 'theology' and 'absinthe' which is a drawback to using index()
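
To illustrate the drawback: index() reports the first substring hit, while a regex with \b word boundaries only accepts the whole word. The sample sentence below is made up for the demonstration.

```perl
use strict;
use warnings;

my $text = 'A theology student ordered absinthe near the bar.';
my $word = 'the';

# index() finds the first substring hit, here inside "theology".
my $substring_pos = index($text, $word);

# \b anchors the match to word boundaries; \Q...\E quotes metacharacters.
# $-[0] is the start offset of the last successful match.
my $word_pos = $text =~ /\b\Q$word\E\b/i ? $-[0] : -1;

print "substring hit at $substring_pos, whole-word hit at $word_pos\n";
```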
 
It's OK that the parser also matches words that merely contain the search term; that doesn't matter.
But where can I find something like that, a multi-word parser?
 
brigmar was suggesting that you simply use your "single word parser" and call it twice, once for each word. Then you simply add a return value that contains the index of the matched word in the page, and you can use that information to persist order of the matches as well.

In other words, just code it.

- Miller
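
One way to read that suggestion, as a rough sketch: a single-term routine that returns both the match position and the surrounding fragment, called once per term, with the positions used to put the fragments back in document order. The name find_fragment and the 15-character context are assumptions made for illustration; the sample text is the one from the question.

```perl
use strict;
use warnings;

# Hypothetical "single word parser": returns [position, fragment] for the
# first hit of $term in $text, or an empty list when the term does not occur.
sub find_fragment {
    my ($text, $term, $context) = @_;
    return unless $text =~ /\Q$term\E/i;
    my ($from, $to) = ($-[0], $+[0]);          # start and end of the match

    my $start = $from - $context;
    $start = 0 if $start < 0;                  # clamp to the start of the text
    my $fragment = substr($text, $start, ($to - $start) + $context);
    return [ $from, $fragment ];
}

my $text = 'Hello World, this is me. Today is a wonderful day with a lot of '
         . 'sunshine and no rain. I might go for a little walk to the forest etc etc.';
my @searchterms = qw(forest wonderful);

# Call the parser once per term, then restore document order via the position.
my @hits = sort { $a->[0] <=> $b->[0] }
           map  { find_fragment($text, $_, 15) } @searchterms;

my $snippet = '...' . join(' ... ', map { $_->[1] } @hits) . '...';

# Bold all terms in one pass so inserted <b> tags are never re-matched.
my $alternation = join '|', map { quotemeta($_) } @searchterms;
$snippet =~ s/($alternation)/<b>$1<\/b>/ig;

print "$snippet\n";
```

This only looks at the first occurrence of each term, and fragments of terms that sit close together will overlap; merging overlapping ranges would be a further refinement.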
 
Assuming @searchterms has the words you are searching for, maybe something along these lines will work:

Code:
my %terms = map {$_ => $_} @searchterms;
my $string = $linkinfo[3];
my $length = 200;
my $fragment = substr $string, (index($string, $search) - 15), $length;
$fragment = "...$fragment...";
foreach my $line (@searchterms) {
   $fragment =~ s/($line)/<b>$terms{$1}<\/b>/ig;
}

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
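
One thing to watch with the %terms lookup above: because of the /i flag, $1 holds the text exactly as it appears in the fragment, so a match such as "Wonderful" has no key in a hash built from the lower-case search terms. A minimal sketch of a variant that uses $1 directly (the sample fragment is made up for illustration):

```perl
use strict;
use warnings;

my @searchterms = qw(wonderful forest);
my %terms = map { $_ => $_ } @searchterms;

my $fragment = 'Wonderful day, a walk to the forest';

# With /i, $1 would be "Wonderful" here, which is not a key of %terms.
my $found = exists $terms{'Wonderful'} ? 'yes' : 'no';
print "lookup for 'Wonderful' found: $found\n";

# Using $1 directly sidesteps the lookup and keeps the original
# capitalisation; \Q...\E quotes regex metacharacters in the term.
for my $term (@searchterms) {
    $fragment =~ s/(\Q$term\E)/<b>$1<\/b>/ig;
}
print "$fragment\n";
```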
 