anorakgirl
Programmer
I have written a PHP 'search engine' which trawls through my site, pulls out the words (plus keywords, title, etc.) and puts them in a database.
I want to flag certain parts of the page so they are not included in the search, e.g. menus which are repeated on each page. I have marked these in the HTML using comments:
<!--nosearch-->
Some Html here
<!--/nosearch-->
So I have a long string which contains all the text on the page, and I want to remove the bits between those comments. I've tried using a regular expression like this:
Code:
$text = eregi_replace("(<!--nosearch-->)(????)(<!--/nosearch-->)", " ",$text);
But I don't know what to put in the middle bit where the ????s are - it has to pick up everything (including HTML tags and line breaks) up to the closing <!--/nosearch-->.
Any tips?
Thanks!
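For what it's worth, here is a minimal sketch of one way the middle bit could be filled in. It swaps eregi_replace for preg_replace (the ereg functions were removed in PHP 7) and uses a non-greedy .*? so that each match stops at the first closing comment rather than the last one on the page. The function name strip_nosearch and the sample string are made up for illustration, and it assumes the comments are written exactly as above and are never nested.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the original post):
// removes every <!--nosearch--> ... <!--/nosearch--> block from a string.
function strip_nosearch(string $text): string
{
    // .*? is non-greedy: it matches as little as possible, so each match
    //     ends at the FIRST closing comment that follows an opening one.
    // s   lets "." match newlines, so blocks spanning several lines are caught.
    // i   makes the match case-insensitive, like eregi_replace.
    // "#" is used as the delimiter so the "/" in the closing comment
    //     does not need escaping.
    $pattern = '#<!--nosearch-->.*?<!--/nosearch-->#si';

    $result = preg_replace($pattern, ' ', $text);

    // preg_replace returns null on failure (for example when the backtrack
    // limit is hit on a very large page); fall back to the untouched input.
    return $result ?? $text;
}

// Sample input, invented for this example.
$text = "Intro <!--nosearch--><ul><li>Menu</li></ul>\n<!--/nosearch--> Body "
      . "<!--nosearch-->Footer<!--/nosearch--> End";

echo strip_nosearch($text), "\n";
```

Without the ? after .* the pattern would be greedy and would swallow everything from the first <!--nosearch--> to the last <!--/nosearch--> on the page, including the real content in between.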