php autolink input 1

Status
Not open for further replies.

jasc2k

Programmer
Nov 2, 2005
113
GB
thread434-1618095

A revived thread about a great text autolinking script by jpadie.
It supports many domains, emails and YouTube links.

Now to the reason for reposting:
A recent craze of links posted on my site is causing problems; Facebook and Twitter links are sometimes similar. Because (I assume) the link never ends with a domain, the <a> tag never gets closed and therefore links all the text after it in the <div> lol

The code:
Code:
	/* Provide HTML for normal links */
	function _linkify($text){
		$text = str_replace('\"', '"', $text);
		if(!preg_match('/^(src|href|data|value)/ims', $text)):
			$protocols = array('http','https','ftp','file','gopher','mailto');
			$_protocols = implode('|',$protocols);
			//$_protocols = '('.implode('|',$protocols).')';
			if(!preg_match('/^('.$_protocols.')/ims', $text)):
				/* Check if the URL is an email */
				if (!$this->isEmail($text)):
					return '<a rel="nofollow" href="http://'.$text.'" target="_blank">'.$text.'</a>';
				else:
					$_text = str_replace('mailto:', '', $text);
					return '<a rel="nofollow" href="mailto:'.$_text.'">'.$_text.'</a>';
				endif;
			else:
				return '<a rel="nofollow" href="'.$text.'" target="_blank">'.$text.'</a>';
			endif;
		else:
			return $text;
		endif;
	}
	
	/**
    * autolink - this little function will
    * automatically hyperlink url's etc.
    */	
	function autolink($text,$noyt=0){
		$protocols = array('http','https','ftp','file','gopher','mailto');
		/* Specify the allowed domain names */
		$gTLDs = array('.info','.com','.edu','.org','.net','.mil');
		/* You may add quite a few others so long as they do not overlap */
		$newTLDs = array('.aero', '.biz', '.coop', '.info', '.museum', '.name', '.pro');
		$ukcc = array('.co.uk','.gov.uk','.ac.uk','.ltd.uk','.me.uk','.mod.uk','.net.uk','.nhs.uk','.nic.uk','.org.uk','.parliament.uk','.plc.uk','.police.uk','.sch.uk','.bl.uk','.icnet.uk','.jet.uk','.nls.uk');
		$others = array('.tv','.eu');
		$domains = array_merge($gTLDs, $newTLDs, $ukcc, $others);
		$_protocols = array_map('preg_quote', $protocols);
		$_domains = array_map('preg_quote', $domains);
		/* First split the non word breaks */
		$pattern = '/((?<!\'|\"|=| )('.implode('|', $_protocols).')[^( |\.)])/imsu';
		$replace = ' \\1';
		$text = preg_replace($pattern, $replace, $text);
		$pattern = '/(?<! |\/|"\'|=)(www\.)/ims';
		$text = preg_replace($pattern, $replace, $text);
		if($noyt == 0){
			/* Now to translate youtube links */
			$pattern = '/\s(http\:\/\/www\.youtube\.com\/watch\?v\=(\w{11}))/imse';
			$text = preg_replace($pattern, "\$this->_youTubeEmbed('\\2')", $text);
		}
		//preg_replace_callback($search, Array($this,"autolink"), $message);
		/* Now we should have clean links, recognise links */
		$pattern = '/([^(\s|\n)]*('.implode ('|', $_domains).')((\?|\/|&|#)[^(\s|\n)]*)?)/imsue';
		$text = preg_replace($pattern, "\$this->_linkify('\\1')", $text);
		return $text;
	}
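
One note on the code above: the /e modifier it relies on was deprecated in PHP 5.5 and removed in PHP 7.0, and the commented-out preg_replace_callback line already points at the replacement. The following is only a minimal sketch of that idea, not the script's actual method: linkifyMatch() is a hypothetical stand-in for _linkify(), the domain list is shortened, and the character class [^\s<] is my own variation that also stops a link at the next tag opener. Inside the class, the callback would presumably be array($this, '_linkify') adapted to take the match array.

```php
<?php
// Hypothetical stand-in for the thread's _linkify() method.
// preg_replace_callback passes the match array, so $m[1] is the whole link.
function linkifyMatch(array $m) {
    $url  = $m[1];
    // Prepend http:// when no protocol was typed.
    $href = preg_match('/^(https?|ftp):/i', $url) ? $url : 'http://'.$url;
    return '<a rel="nofollow" href="'.$href.'" target="_blank">'.$url.'</a>';
}

// Shortened domain list, for illustration only.
$domains = array_map('preg_quote', array('.com', '.org', '.co.uk'));

// Assumption: [^\s<] replaces the original [^(\s|\n)] class so that a
// following "<br" is never swallowed into the link.
$pattern = '/([^\s<]*('.implode('|', $domains).')((\?|\/|&|#)[^\s<]*)?)/iu';

$text   = 'visit www.example.com/about today';
$linked = preg_replace_callback($pattern, 'linkifyMatch', $text);
```

Because the callback receives the raw match, there is no quoting to worry about as there is when a match is interpolated into evaluated code.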

$ukcc = array('.co.uk' - should I add a space into this array? but then it would get linked too?

All thoughts/comments appreciated...

- free mp3 downloads and streaming
 
My apologies, but I meant that a carriage return <br /> after the link causes the problem.

thanks

 
This is the code used to call the function (I am posting it because it may be relevant):

Code:
		/* Formatting for HTML entries */
		$message = htmlspecialchars($message, ENT_NOQUOTES);
		/* Add line breaks with max of 3 in a row */
		//$message = preg_replace('/\r/', '', $message);
		//$message = preg_replace('/\n{4,}/', str_repeat('<br/>', 3), preg_replace('/\r/', '', $message));
		//$message = preg_replace('/\n{4,}/', str_repeat('<br/>', 3), $message);
		$message = strtr($message, array("\r\n" => '<br />', "\r" => '<br />', "\n" => '<br />'));
		/* These must be added to pair_tags function */
		$message = str_replace('&lt;b&gt;','<b>',$message);
		$message = str_replace('&lt;/b&gt;','</b>',$message);
		$message = str_replace('&lt;i&gt;','<i>',$message);
		$message = str_replace('&lt;/i&gt;','</i>',$message);
		$message = str_replace('&lt;u&gt;','<u>',$message);
		$message = str_replace('&lt;/u&gt;','</u>',$message);
		
		/* Make sure these come before pair_tags */
		if(DISABLE_SMILEYS != 1 && $nosmi == 0){ 
			foreach ($smileys as $pattern => $result) {
				$message = str_ireplace($pattern,'<img class="smiley" src="'.IMG_PATH.'smileys/'.$result.'.png" alt="'.$pattern.'">',$message);
			}
		}
		if(DISABLE_LINKING != 1 && $nolink == 0){
			$message = $this->autolink($message,$noyt);
		}
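
A side note on the strtr() line-break conversion: PHP expands \r and \n only inside double-quoted strings; in single quotes they are a literal backslash followed by a letter. This is a minimal sketch of the difference, with made-up sample text:

```php
<?php
// Sample input containing a real Windows line break.
$raw = "line one\r\nline two";

// Single-quoted keys are the four literal characters \ r \ n,
// which do not occur in $raw, so nothing is replaced.
$single = strtr($raw, array('\r\n' => '<br />'));

// Double-quoted keys are real control characters. strtr() tries the
// longest key first, so "\r\n" becomes one <br /> rather than two.
$double = strtr($raw, array("\r\n" => '<br />', "\r" => '<br />', "\n" => '<br />'));
```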

 
can you post examples of text/url/uri that do not get correctly autolinked?
 
you may login to my site using testing (both u and p) and test all you like

lol you can ignore the "3 br's in a row" code, that never quite worked - I can't see the whole issue for the life of me; I can see where it might remove the first < crocodile but not the closing one
thanks

 
In fact, anything encased in < > crocodiles gets removed, which is good for stopping users typing HTML-formatted links, but where does it do that in the code lol

 
OK, my bad. I hope you're still checking this, jpadie. Just further down in my code I call the following function; I have tested it further and believe the issue lies in here:

Code:
	/**
	* pair_tags - this function will close
	* any HTML tags left open (at end of post)
	*/
	function pair_tags($content) {
		/* Allowed tags */
		$tags = Array('b','i','u');
		foreach($tags as $tag) {
			$opentag = substr_count($content, '<'.$tag.'>');
			$closetag = substr_count($content, '</'.$tag.'>');
			$opentag = (empty($opentag) ? 0 : $opentag); 
			$closetag = (empty($closetag) ? 0 : $closetag);
			if(($opentag !== $closetag) && ($opentag-$closetag) > 0) {
				$content .= str_repeat('</'.$tag.'>', ($opentag-$closetag));
			}
		}
		return $content;
	}

< > and their contents are being removed.

thanks

 
How embarrassing. I have tested and tested and tested, and it is not an issue with any code I have posted; maybe I will start a new thread.
I further noticed that after removing all the code there is only one function left, and I can already see this could definitely be the issue:
Code:
/* Prepares data for SQL entry */
	function sqlPrep($string,$except=0,$forms=0) {
		if(get_magic_quotes_gpc()) {
			$string = stripslashes($string);
		}
		if($except == 1){
			if($forms == 0){ $allowed = ALLOW_HTML; } else { $allowed = ALLOW_HTML.ALLOW_FORM; }
			$string = strip_tags($string,$allowed); // or less destructive htmlentities()
		} else {
			$string = strip_tags($string);
		}
		return mysql_real_escape_string($string);
	}
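
For reference, here is a minimal sketch of how strip_tags() treats anything between < and >, which would explain the disappearing text. The sample strings are made up, and the '<b>' allow-list merely stands in for the ALLOW_HTML constant, whose real value is not shown here.

```php
<?php
// Anything that looks like a tag is dropped, whether or not it is real HTML.
$plain = strip_tags('type <anything here> and it vanishes');

// With an allow-list, only the listed tags survive.
$partial = strip_tags('<b>bold</b> and <i>italic</i>', '<b>');
```

As the comment in sqlPrep() already hints, htmlentities() or htmlspecialchars() would escape the brackets instead of deleting what is between them.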

again apologies

 
It feels like I am going around in circles on this one, so I am now back to the autolink function.

input:
carriage return immediately after link

inside sql:
<a rel="nofollow" href=" target="_blank"> carriage return immediately after link

the a link is never closed and the <b r> is wrong

any ideas?

 
at its simplest you could just do this

Code:
$text = str_ireplace('<br', ' <br', $text);

that would force a harmless space before all breaks. in fact there would be little harm in forcing a space before all tag openers.
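
to see why the space helps, here is a minimal sketch using a cut-down version of the link pattern (one TLD only, no /e modifier) and a made-up example.com message. the character class only stops at whitespace, pipes and brackets, so a <br glued to the end of a url gets swallowed into the match.

```php
<?php
// Cut-down version of the thread's link-recognition pattern (single TLD).
$pattern = '/([^(\s|\n)]*(\.com)((\?|\/|&|#)[^(\s|\n)]*)?)/imsu';

// A <br /> directly after the url, as produced by the line-break conversion.
$message = 'see http://www.example.com/page?id=1<br />next line';

// Without the fix the match runs on into the tag opener.
preg_match($pattern, $message, $before);

// Forcing a space before every "<" ends the match where the url ends.
$spaced = str_replace('<', ' <', $message);
preg_match($pattern, $spaced, $after);
```

here $before[1] ends in "<br", while $after[1] is just the url.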

there may be a wholly different approach to this linkification problem now that there is a php port of jQuery's dom selection and navigation routines.
 
ahh simple fixes - I like
Code:
$message = str_replace('<', ' <', $message);

I have applied this to all opening tags - I believe that syntax is fine

php port of jQuery's dom selection
what's this? any links?

Many thanks for your help - I have looked at this so many different times I got bored. This simple fix seems to be working great :)

Thanks

 
something like this perhaps

Code:
pear channel-discover phpquery-pear.appspot.com  
pear install phpquery/phpQuery

the source page for phpQuery is at
i guess you could parse the dom with this class, find the text nodes that match the regex for an url, then wrap it in an anchor tag.

having just spent five minutes trying to build this, I suspect it's easier just to stick with the above code.
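
for what it's worth, the dom idea can be roughed out with php's bundled DOMDocument and DOMXPath classes rather than phpQuery, so this is a swapped-in technique and not the phpQuery api. it is only a sketch under assumptions: linkifyTextNodes() is a hypothetical name, the url regex is deliberately simple (http/https only), and non-ASCII input would need an encoding hint before loadHTML().

```php
<?php
// Hypothetical helper (not from the thread): link bare URLs in text nodes only.
function linkifyTextNodes($html) {
    $doc = new DOMDocument();
    // Wrap the fragment in a known root; @ hides warnings about sloppy markup.
    @$doc->loadHTML('<div id="root">'.$html.'</div>');
    $xpath = new DOMXPath($doc);
    $root  = $xpath->query('//div[@id="root"]')->item(0);

    // Text nodes under the root that are not already inside an anchor.
    $textNodes = iterator_to_array($xpath->query('.//text()[not(ancestor::a)]', $root));
    foreach ($textNodes as $node) {
        // Odd-indexed parts are the captured URLs, even-indexed parts plain text.
        $parts = preg_split('~(https?://[^\s<]+)~i', $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // no URL in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                $a = $doc->createElement('a');
                $a->setAttribute('rel', 'nofollow');
                $a->setAttribute('href', $part);
                $a->appendChild($doc->createTextNode($part));
                $fragment->appendChild($a);
            } elseif ($part !== '') {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialise the root's children back to an HTML string.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html = linkifyTextNodes('see http://www.example.com/page?id=1<br />next line');
```

because the parser has already separated tags from text, a <br /> straight after a url can never end up inside the href.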
 
agreed lol
that looks intense - I could add some jQuery onto this, but I love your code for this
though I don't fully understand all of it, it seems really thorough

thanks

 
over the last year I have become a huge fan of jQuery. I love the ease of access that the Sizzle library provides, and the dom manipulation methods jive with the way I think things ought to be done.

i have always hated dom parsing and manipulation in php. using simplexml etc always seemed overkill for perhaps extracting a single piece of text. i try to avoid it usually and use regex where possible. finding phpQuery was a godsend (no offence meant to any believer or non-believer). I'm willing to accept the library's overhead for the occasional bit of dom parsing that is needed here or there.

 