search engine 1

Status
Not open for further replies.

dldl

Programmer
May 3, 2010
40
NZ
Hi,

I am trying to implement a search engine, and I want the most relevant results to sit on top of the others. My code does not always return the right results, though. Could anyone help? Thanks.

Code:
create table keyword_relevance 
( id tinyint not null primary key auto_increment 
, Description varchar(99) 
); 
insert into keyword_relevance (Description) values 
 ('I like to play games') 
,('One game, two games, eh') 
,('There''s no keyword') 
,('The games keyword is present only once') 
,('The games keyword is present only once in a longer sentence') 
,('The games keyword is present only once in a really really really really really really long sentence')
 ,('Games games') 
,('Games beautiful games') 
,('') 
,(null) 
,('Games') 
,('Games     ') 
,('     Games');



////////////////////

my code


Code:
$search_string=$_POST['search_string'];

$sql="select distinct count(*) as occurences, id, Description
from keyword_relevance
   where  (";
      
    
     while(list($key,$val)=each($search_string)){
       if($val<>" " and strlen($val) > 0){
           $sql .=" Description like '%" . $val . "%' or";
       }
     }

    $sql=substr($sql,0,(strLen($sql)-3));//this will eat the last OR

    $sql .= ") group by id order by occurences DESC";
 
Hi

Some details are missing:
  • What kind of database are you using?
  • What is received in $_POST['search_string']?
  • What are you hoping to get as a result?
Please note that you are grouping by id, and id is the primary key, so count(*) will never be greater than 1. For the same reason, the distinct clause makes no difference.

By the way, in reality you are protecting your SQL statement against SQL injection, right?


Feherke.
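
Since count(*) grouped by the primary key is always 1, the occurrence count has to come from somewhere else. Below is a minimal sketch of one option, assuming PHP 7 or newer and that the rows have already been fetched into an array. count_occurrences() is a hypothetical helper, not something from the code above, and it counts plain substring matches, so "games" would also match inside a longer word.

```php
<?php
// Hypothetical helper (not from the original post): counts how often each
// search term occurs in a text, ignoring case.
function count_occurrences(string $text, array $terms): int
{
    $haystack = strtolower($text);
    $total = 0;
    foreach ($terms as $term) {
        $needle = strtolower(trim($term));
        if ($needle === '') {
            continue; // substr_count() rejects an empty needle
        }
        $total += substr_count($haystack, $needle);
    }
    return $total;
}

// Sample rows taken from the keyword_relevance table above.
$rows = [
    'I like to play games',
    'One game, two games, eh',
    "There's no keyword",
    'Games beautiful games',
];

// The search string as received from the form, split into words.
$terms = explode(' ', 'games keyword');

// Sort so that the rows with the most matches come first.
usort($rows, function ($a, $b) use ($terms) {
    return count_occurrences($b, $terms) <=> count_occurrences($a, $terms);
});
```

Sorting in PHP only makes sense for small result sets; for larger tables the ranking would better be done by the database itself.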
 
1) What kind of database are you using?
MySQL
2) What is received in $_POST['search_string']?
games keyword
3) What are you hoping to get as a result?
The rows most relevant to "games keyword" must sit on top of the others.

>>>Please note that you are grouping by id and id is primary key. So count(*) will never ever be greater than 1. For the same reason the presence of distinct clause makes no difference.

That is true, but how can I calculate the occurrences?

>>in reality you are protecting your SQL statement against SQL injection, right ?

So do I need to use a session variable instead of the $_POST variable?
 
I tried full-text search, and it looks good:

SELECT ... MATCH () AGAINST() as score from .... WHERE MATCH () AGAINST () order by score desc;


I will also switch the form to send its data with the GET method.
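
For reference, a full-text query against the keyword_relevance table from the first post could be sketched as follows. This is only an illustration under assumptions: it presumes a FULLTEXT index on Description, PHP 7 or newer, and an open PDO connection passed in as $pdo. The function names fulltext_sql() and fulltext_search() are made up for this example.

```php
<?php
// Assumption: the table has a full-text index, created for example with
//   ALTER TABLE keyword_relevance ADD FULLTEXT (Description);

// Hypothetical helper: returns the query text with two positional placeholders.
function fulltext_sql(): string
{
    return 'SELECT id, Description,
                   MATCH (Description) AGAINST (?) AS score
            FROM keyword_relevance
            WHERE MATCH (Description) AGAINST (?)
            ORDER BY score DESC';
}

// Hypothetical helper: runs the query on an open PDO connection to MySQL.
function fulltext_search(PDO $pdo, string $search): array
{
    $stmt = $pdo->prepare(fulltext_sql());
    // The search string is bound twice, once per placeholder.
    $stmt->execute([$search, $search]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

One caveat: on MyISAM tables, natural-language mode ignores words that occur in half or more of the rows, so a tiny sample table like the one above may return nothing for "games".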
 
Hi

Trying the full-text search is a good idea.
dldl said:
>>in reality you are protecting your SQL statement against SQL injection, right ?

so i need to use the Session varible instead of the $_POST varible?
No, I mean to escape the special characters in the received string. In your earlier code that would look like :
Code:
$sql .= " Description like '%" . mysql_real_escape_string($val) . "%' or";
Of course, this will look different after you change to full-text search, but the protection will still be needed.


Feherke.
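
An alternative to escaping by hand is to keep user input out of the SQL text entirely and bind it as parameters. The sketch below assumes PHP 7 or newer with PDO, which also sidesteps the mysql_* functions (deprecated in PHP 5.5 and removed in PHP 7.0). build_like_query() is a hypothetical helper, not part of the code above.

```php
<?php
// Hypothetical helper: builds the LIKE query with placeholders instead of
// concatenating user input into the SQL string.
function build_like_query(string $search): array
{
    $conditions = [];
    $params = [];
    $terms = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($terms as $term) {
        $conditions[] = 'Description LIKE ?';
        // Escape LIKE wildcards so "%" and "_" in the input match literally.
        $params[] = '%' . addcslashes($term, '\\%_') . '%';
    }
    if ($conditions === []) {
        return [null, []]; // nothing to search for
    }
    // implode() avoids having to cut off a trailing "or".
    $sql = 'SELECT id, Description FROM keyword_relevance WHERE '
         . implode(' OR ', $conditions);
    return [$sql, $params];
}

list($sql, $params) = build_like_query('games 100%');
// With an open PDO connection $pdo (assumed), the query would run as:
//   $stmt = $pdo->prepare($sql);
//   $stmt->execute($params);
```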
 
>>No, I mean to escape the special characters in the received string. In your earlier code that would look like :

Please check my code below. Is it OK?

form.php
Code:
<form action="verify.php" method="GET" >

<textarea class="textarea"  rows="10" cols="70" name="content" onkeydown="limitText(this.form.kwant,this.form.countdown,250);" 
onkeyup="limitText(this.form.kwant,this.form.countdown,250);">
 <?php echo trim(stripslashes($_SESSION['content'])); ?>
</textarea>

<input class="input_2" type="submit" name="post" value="submit" id="button1"   />

</form>

verify.php
Code:
 if(isset($_REQUEST['post'])){

      $kwant=trim(stripslashes($_REQUEST['content']));
      $_SESSION['content']=htmlentities($content);

     if(empty($content)){
                header("location:form.php");
     }else{

         insert_data();
         header("location:index.php");
     }
 }

 function insert_data(){

      include("../condatabase.php");
      $content=trim(($_SESSION['content'])); 
      $content=addslashes($content);

     $q = "INSERT INTO mytable (content) values ('$content')";
     $result=mysql_query($q, $con);
      mysql_close($con);
   }
 
Hi
  • Calling stripslashes() unconditionally is not a good idea. It should be used only if the magic_quotes_gpc php.ini setting is On. As that setting was removed in PHP 5.4.0, your function call may corrupt the received data after updating to 5.4.0 or newer.
  • Do not use addslashes() for escaping data used in SQL statements. In most cases it will work, but for correctness a database-specific function should be used. Even the database-specific mysql_escape_string() may be insufficient, as it does not take the connection's character encoding into account. That is why mysql_real_escape_string() should be used instead.
As it is not clear from that fragment what the $content variable does, it is difficult to say more, but the way you organize the encoding/decoding function calls looks a bit chaotic to me. They are spread over too many places, so following each variable's escaped/unescaped state over the script execution may become difficult.


Feherke.
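
One way to keep the escaping state easy to follow is to undo magic quotes in exactly one place and work with the raw value everywhere else. A minimal sketch, assuming PHP 7 or newer; read_input() is a hypothetical helper name, not something from the posted code.

```php
<?php
// Hypothetical helper: returns the raw, trimmed value of one form field.
function read_input(array $source, string $key): string
{
    $value = $source[$key] ?? '';
    if (!is_string($value)) {
        return ''; // ignore array-style parameters such as content[]=x
    }
    // get_magic_quotes_gpc() is deprecated in PHP 7.4 and gone in PHP 8,
    // so it is only called where it still exists.
    if (function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc()) {
        $value = stripslashes($value);
    }
    return trim($value);
}

// Usage: read once, keep the raw value, escape only at the point of use.
$content = read_input($_REQUEST, 'content');
```

With this in place, SQL escaping would happen only where the query is built and HTML escaping only where the page is printed.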
 
>>As from that fragment is not clear what $content variable does

form.php lets a user post a thread, as we do here.

I changed the code as follows. Please check whether it is OK now.

Form.php
Code:
<form action="verify.php" method="GET" >

<textarea class="textarea"  rows="10" cols="70" name="content" onkeydown="limitText(this.form.kwant,this.form.countdown,250);" 
onkeyup="limitText(this.form.kwant,this.form.countdown,250);">
 <?php echo $_SESSION['content']; ?>
</textarea>

<input class="input_2" type="submit" name="post" value="submit" id="button1"   />

</form>

Verify.php

Code:
 if(isset($_REQUEST['post'])){

   if (get_magic_quotes_gpc()){
      $content=trim(stripslashes($_REQUEST['content']));
      $_SESSION['content']=htmlentities($content);
    }else{
      $content=trim(($_REQUEST['content']));
      $_SESSION['content']=htmlentities($content);
   }
  

     if(empty($content)){
                header("location:form.php");
     }else{

         insert_data();
         header("location:index.php");
     }
 }

 function insert_data(){

      include("../condatabase.php");
      $content=$_SESSION['content']; 
      $content=mysql_real_escape_string($content);

     $q = "INSERT INTO mytable (content) values ('$content')";
     $result=mysql_query($q, $con);
      mysql_close($con);
   }
 
After I changed my code, if I post data like the following, it meets database attack. Why?

"Man slugged $300 by police during dispute with local gym!@#$%^&*()_-+={}[]|\:";'<>,.?/
 
Hi

dldl said:
the form.php for user post a thread as we do here.
Then I assume the use of $kwant in the code posted on Mar 12 15:50 was a typo.
dldl said:
it meets database attack
What do you mean by that ?

By the way, you are HTML-encoding the string before inserting it into the database. Are you sure you want that?

Feherke.
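
To illustrate why encoding before the insert can backfire, here is a small sketch, assuming UTF-8 text and explicit flags so the result does not depend on the PHP version. The sample string and variable names are made up for the example.

```php
<?php
$raw = 'Tom & Jerry <b>';

// What ends up in the database if the text is HTML-encoded before the insert.
$storedEncoded = htmlentities($raw, ENT_QUOTES, 'UTF-8');

// Encoding again at output time double-encodes the stored text.
$shownTwice = htmlspecialchars($storedEncoded, ENT_QUOTES, 'UTF-8');

// Storing the raw text and encoding once at output time avoids that.
$shownOnce = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $shownTwice, "\n"; // Tom &amp;amp; Jerry &amp;lt;b&amp;gt;
echo $shownOnce, "\n";  // Tom &amp; Jerry &lt;b&gt;
```

A browser would display the first line as literal "&amp;" and "&lt;b&gt;" text, while the second line displays the original characters.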
 
1.
>>By the way, you are HTML encoding the string before inserting it into the database. Are you sure you want that ?

Sorry, I should do the encoding when putting the text into the HTML page:

diplay.php
Code:
  function output($str)
  {
      return htmlspecialchars($str);
  }

   ..........
  ..........
   
   echo output($row['content']);


2.

>>>it meets database attack

Sorry, I mean that I cannot submit data like the following (is it treated as an SQL injection attack?) when it contains too many special characters:

"Man slugged $300 by police during dispute with local gym!@#$%^&*()_-+={}[]|\:";'<>,.?/"

3.
>>Then I assume the use of $kwant in the code posted at on Mar 12 15:50 was a typo.

Yes, I just made a typing mistake; it should be $content.

4. By the way, why can I not submit a huge string (about 2000 characters) via the textarea? It looks like it accepts only a small string (about 500 characters), even when I change the textarea as follows:

Code:
<textarea rows="10" cols="70" name="content"  id="content"  class="textarea" >

</textarea>

 
Hi

dldl said:
sorry, i means i can not submit data like the following,sql injection attact, if i got too special characters

"Man slugged $300 by police during dispute with local gym!@#$%^&*()_-+={}[]|\:";'<>,.?/"
So you enter that text in your textarea with the name content and... What do you mean by "can not submit"?
  • Does the browser not send it?
  • Does the browser send it, but altered?
  • Does the PHP script not receive it?
  • Does the PHP script receive it, but altered?
  • Does the MySQL database not store it?
  • Does the MySQL database store it, but altered?
dldl said:
by the way, why i can not submit huge string(about 2000 characters) via textarea, it looks like just accept small string(about 500 characters).
There is no such limitation in HTML, HTTP, or PHP. Maybe some code alters it, or your MySQL field size is set too small. (Note that MySQL silently truncates long strings, at least by default, while other databases usually issue an error.)

Feherke.
 
Code:
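// Single quotes keep the text literal; only \ and ' need escaping inside them.
$string = 'Man slugged $300 by police during dispute with local gym!@#$%^&*()_-+={}[]|\\:";\'<>,.?/';
// mytable, myfield1 and myfield2 are placeholders for your own table and columns.
mysql_query("insert into mytable (myfield1, myfield2) values (null, '" . mysql_real_escape_string($string) . "')");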
 
>>So you enter that text in your textarea with then name content and... What do you mean "can not submit" ?

The IE browser gives me a message saying "HTTP 406 Not Acceptable".
 
Hi

A quick web search for that HTTP response code says that the server could not produce a response matching what the browser declared acceptable in its request headers. If that is what happens here, your PHP script should receive and insert the data correctly in the database, and only the output sent back to the browser is problematic.


Feherke.
 
Haha, the malformed data is no problem now. I changed the form to submit with the POST method, because I could not pass huge data using the GET method.
 