Tek-Tips is the largest IT community on the Internet today!

Status
Not open for further replies.

jwoods7

Technical User
Feb 19, 2003
36
US
I'm looking for a (hopefully) SIMPLE function that does nothing but search for a string ($string) within the web page at a given URL ($page).

I would like it to return the entire line that term is in, or something to that effect.

At this point I'm more interested in simply being able to find a term within a page.

The function would start like this:

function searchFor($string, $page) {
    // should return the entire line where $string appears on page $page

}

Thanks for help in advance, this is probably so simple it's difficult.
 
There is no built-in PHP function that does this. But you've probably already reached that conclusion.

Building a user-defined function should not be that complicated. Newer versions of PHP have new functions that can make it even easier.

What version of PHP are you running?

Want the best answers? Ask the best questions: TANSTAAFL!
 
I believe it's version 4.2 or close to it. The server is hosted by another company and was recently upgraded.

 
I know that it is in safe mode, but I'm not sure about allow_url_fopen. Is there any way to check?
 
Write a script that consists of:

<?php
phpinfo();
?>

And point your web browser at it. The script will return more than you ever wanted to know about your PHP installation.

I'm interested in the exact version of PHP and the settings for safe_mode and allow_url_fopen.
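If the full phpinfo() page is more than you need, the same three values can be printed directly. This is only a minimal sketch using the standard phpversion() and ini_get() functions; the helper name describeSetting is made up for illustration. Note that safe_mode was removed in PHP 5.4, so on newer installations ini_get() returns false for it.

```php
<?php
// Hypothetical helper (not a PHP built-in): render one ini setting as text.
function describeSetting($name)
{
    $value = ini_get($name);
    if ($value === false) {
        // ini_get() returns false when the option does not exist.
        return $name . ': not available in this PHP build';
    }
    return $name . ': ' . ($value === '' ? '(empty/off)' : $value);
}

echo 'PHP version: ' . phpversion() . "\n";
echo describeSetting('allow_url_fopen') . "\n";
echo describeSetting('safe_mode') . "\n";
```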

 
PHP Version 4.1.2

Allow Url Fopen: local value=1, master value=1

Safe Mode: On, On
 
Good. Then you should be able to use PHP's fopen() function to read the web page served at the URL, just as if you had the HTML source of the page in a file on your filesystem.

I have no idea if the following code will work as-is, but it should give you an idea where to go.

The function takes a string and a URL and returns either FALSE (if the page could not be opened) or an array containing the number of every line of the web page on which the string appeared. If the string does not appear on the page, the function returns an empty array.

The function does not take into account whether the string appears within an HTML tag or not, and will end up searching the HTTP headers as well as the HTML of the page.

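function searchFor ($string, $page)
{
    // With allow_url_fopen on, a URL can be opened like a file name.
    $handle = fopen ($page, 'r');
    if ($handle !== FALSE)
    {
        $retval = array();
        $counter = 1;
        // Before PHP 4.2, fgets() requires the length argument.
        // Lines longer than 4095 bytes are read (and counted) in pieces.
        while (($line = fgets($handle, 4096)) !== FALSE)
        {
            // strstr() takes the haystack first, then the needle.
            if (strstr($line, $string) !== FALSE)
            {
                $retval[] = $counter;
            }
            $counter++;
        }
        fclose($handle);
    }
    else
    {
        $retval = FALSE;
    }

    return $retval;
}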
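The original request was for the matching lines themselves rather than their line numbers. Here is a rough sketch of that variant, assuming the standard file() function, which reads a local path or (with allow_url_fopen on) a URL into an array of lines. The name findLines is hypothetical, and I have only reasoned through this against a local file.

```php
<?php
// Hypothetical variant: returns the matching lines keyed by line number,
// or false if the page cannot be read.
function findLines($string, $page)
{
    // @ suppresses the warning file() raises when the source is unreadable.
    $lines = @file($page);
    if ($lines === false) {
        return false;
    }

    $matches = array();
    foreach ($lines as $index => $line) {
        // strpos(haystack, needle): search the line for the string.
        if (strpos($line, $string) !== false) {
            // file() keeps line endings; strip them from the result.
            $matches[$index + 1] = rtrim($line, "\r\n");
        }
    }
    return $matches;
}
```

Calling findLines('PHP', $url) with the address of your page would then give an array such as line number => line text, or an empty array when nothing matches.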

 
You can use regular expressions to limit your search to the <body>...</body> section of the HTML page.
To do that you need to have the page in a large string, rather than an array.
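Something along these lines might do as a starting point. It is a sketch, not a real HTML parser: the pattern assumes a single, reasonably well-formed body element, and the helper name extractBody is made up. The page is joined into one string with implode() because file() returns an array of lines.

```php
<?php
// Hypothetical helper: return the text between <body ...> and </body>,
// or the whole input when no body element is found.
function extractBody($html)
{
    // i = case-insensitive, s = let "." match newlines as well.
    if (preg_match('/<body[^>]*>(.*?)<\/body>/is', $html, $m)) {
        return $m[1];
    }
    return $html;
}

// For a live page, something like:
//   $html = implode('', file($url));   // $url is a placeholder
$html = "<html><head><title>Demo</title></head>\n"
      . "<body class=\"x\">\nHello\nworld\n</body></html>";
echo extractBody($html);
```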

When you say you want the entire line, what do you assume is the delimiter of such a line? Remember, HTML is not whitespace-sensitive and ignores line breaks. A single line in the browser could contain many line breaks and spaces that you can't see on screen but that are present in the source code.

Please give us an idea how you would define the limits of a line.
 
Hmm...good question.
My goal is to create a script that searches a page for a particular word or term. It would return various information about that term, such as the number of times it appears on the page, whether it appears between the <title></title> tags...
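For what it's worth, the two checks mentioned here could be sketched roughly like this once the page is in a single string. The function name termReport and the shape of the returned array are assumptions for illustration only. The count is case-insensitive and includes matches inside tags and attributes, and stripos() needs PHP 5 or later.

```php
<?php
// Hypothetical helper: basic statistics about a term in an HTML string.
function termReport($term, $html)
{
    $report = array('count' => 0, 'in_title' => false);
    if ($term === '') {
        return $report;   // nothing to search for
    }

    // Case-insensitive count: lowercase both sides, then count substrings.
    $report['count'] = substr_count(strtolower($html), strtolower($term));

    // Look for the term between <title> and </title>.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        $report['in_title'] = stripos($m[1], $term) !== false;
    }
    return $report;
}

$html = '<html><head><title>PHP tips</title></head>'
      . '<body>php and PHP and more php</body></html>';
print_r(termReport('php', $html));
```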

I'm trying to create a mini page reporter that will give suggestions on how to make the page rank higher on search engines.

Since I'm still on the learning curve of php, I look for samples from people, play with them and try to put it all together. Any ideas would be helpful, I'll share the end results when finished.
 