Turning links in text into truncated hyperlinks

Status
Not open for further replies.

Leozack

MIS
Oct 25, 2002
867
GB
Hi all
I have the following code, which does a reasonable job of finding various styles of links within text (e.g. with a scheme such as http:// in front, or just a bare www. address) and turning them into links.
PHP:
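function linkURLs($text) {
	// Stored text has "/" escaped as "&#47;", so unescape it before matching.
	$text = str_replace("&#47;","/",$text);
	return preg_replace('@(http)?(s)?(://)?(([-\w]+\.)+([^\s]+)+[^,.\s])@', '<a href="http$2://$4">$1$2$3$4</a>', $text);
	// Never reached: it sits after the return above.
	$text = str_replace("/","&#47;",$text); // think this line is redundant now
}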
My problem is that if someone enters a really long link, I'd like to truncate what actually shows on screen: the anchor would still link to the full URL but display only a shortened version of it.
I've tried various things but failed. I certainly don't seem able to tamper with the preg replacement itself, e.g. "substring('$1$2$3$4',0,50)" doesn't work.
Can anyone shed light on what I thought would be a common thing to want to do (lots of people search/post online for how to do it)?

I also need to achieve the same thing with my javascript version ...
JavaScript:
	function LinkURLs(text) {
		text.replace('&#047;','/');
		var exp = /((?:(?:https?|ftp|file):\/\/|www\.|ftp\.)(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#\/%=~_|$?!:,.])*(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[A-Z0-9+&@#\/%=~_|$]))/ig
		return text.replace(exp,"<a href='$1'>$1</a>");
	}

_________________________________
Leozack
Code:
MakeUniverse($infinity,1,42);
 
Why do you need to get into regular expressions? It only warrants a simple string slicing exercise.

Test for http|https|ftp|whatever with if strpos($string, ....)

Get the first 'X' number of characters (starting after '://'), get the last 'X' number of characters (starting at the first space character after '://' minus the length you want), then concatenate the two with:

PHP:
$front. '....' .$rear
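
A minimal sketch of the slicing idea described above, written in JavaScript so it can sit next to the client-side version. The helper names `truncateMiddle` and `displayText`, and the default 20/10 split, are assumptions for illustration and not code from this thread. It only shortens a string that has already been identified as a URL; finding the URL inside surrounding text is a separate step.

```javascript
// Hypothetical helper: keep the first `head` and last `tail` characters of a
// long string and join them with a marker.
function truncateMiddle(str, head = 20, tail = 10, marker = '...') {
    // Short strings are returned unchanged, so the result is never longer
    // than the input.
    if (str.length <= head + tail) {
        return str;
    }
    return str.slice(0, head) + marker + str.slice(str.length - tail);
}

// Hypothetical helper: drop an optional scheme such as "http://" and shorten
// what is left for display.
function displayText(url) {
    const schemeEnd = url.indexOf('://');
    const rest = schemeEnd === -1 ? url : url.slice(schemeEnd + 3);
    return truncateMiddle(rest);
}

console.log(displayText('https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh'));
// amzon.co.uk/this/tha...ages/ahhhh
```

The full URL would still go into the href; only the visible text is passed through `displayText`.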



Chris.

Indifference will be the downfall of mankind, but who cares?
Time flies like an arrow, however, fruit flies like a banana.

Never mind this jesus character, stars had to die for me to live.
 
Why use regex? I think you answered that when you say "test for whatever", which is what the pattern matching is for? Also this has to only apply to the text of the link whilst turning it into the full link. So it's a bit more than a "search for START, concat with END" jobby

_________________________________
Leozack
 
Hi

Use preg_replace_callback() instead. And remove all those completely useless str_replace() (and String.replace()) calls that change slash with slash.

PHP:
function linkURLs($text) {
    return preg_replace_callback('@(http)?(s)?(://)?(([-\w]+\.)+([^\s]+)+[^,.\s])@', function($match) {
        $text = substr($match[1] . $match[2] . $match[3] . $match[4], 0, 50);
        return "<a href=\"http{$match[2]}://{$match[4]}\">$text</a>";
    }, $text);
}
JavaScript:
function LinkURLs(text) {
    return text.replace(/((?:(?:https?|ftp|file):\/\/|www\.|ftp\.)(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#\/%=~_|$?!:,.])*(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[A-Z0-9+&@#\/%=~_|$]))/ig, function(match) {
        return '<a href="' + match + '">' + match.slice(0, 50) + '</a>';
    });
}

Feherke.
feherke.ga
 
Leozack said:
Why use regex? I think you answered that when you say "test for whatever", which is what the pattern matching is for? Also this has to only apply to the text of the link whilst turning it into the full link. So it's a bit more than a "search for START, concat with END" jobby

Man, you are way overthinking it. I'll see if I can find the code I wrote for this very purpose a few years back; it will be in my archive somewhere.

Chris.

 
Thanks Feherke, I've turned your function into this
PHP:
function LinkURLs($text) {
	$text = str_replace("&#47;","/",$text);
	return preg_replace_callback('@(http)?(s)?(://)?(([-\w]+\.)+([^\s]+)+[^,.\s])@', function($match) {
		if (strlen($match[4]) > 30) {
			$text = substr($match[4], 0, 20)."...".substr($match[4], -10, 10);
		} else {
			$text = $match[4];
		}
		return "<a href=\"http{$match[2]}://{$match[4]}\">$text</a>";
	}, $text);
}
which also removes the unnecessary http:// prefix from the displayed link text, whilst keeping the end of long URLs visible as ChrisHirst suggested. So giving it this text
Code:
Yay links work nicely http://www.bbc.co.uk or www.itv.com or https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh
I get this output
HTML:
Yay links work nicely <a href="http://www.bbc.co.uk">www.bbc.co.uk</a> or <a href="http://www.itv.com">www.itv.com</a> or <a href="https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh">amzon.co.uk/this/tha...ages/ahhhh</a>
and the information stored in the file (and read by this function later) is
Yay links work nicely http:&#47;&#47;www.bbc.co.uk or www.itv.com or https:&#47;&#47;amzon.co.uk&#47;this&#47;that&#47;lotsofotherstuff&#47;more-links-that-go-on-for-ages&#47;ahhhh
(you can see the / are escaped in my storage function which is why I unescape them before parsing for links). Longterm it will be stored in a DB but currently stored in json format in a flatfile.
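
On the JavaScript side, the unescaping step could look something like the sketch below. The name `unescapeSlashes` is made up for illustration. Two details matter: `replace()` returns a new string instead of changing the original, so the result has to be used, and a plain string pattern only replaces the first occurrence, so a regular expression with the `g` flag is used here. The pattern accepts both `&#47;` and the zero-padded `&#047;`, on the assumption that either form may turn up in the stored text.

```javascript
// Hypothetical helper: turn every "&#47;" (or zero-padded "&#047;") back
// into a plain "/".
function unescapeSlashes(text) {
    // The g flag replaces all occurrences, not just the first one.
    // Strings are immutable, so the caller must use the returned value.
    return text.replace(/&#0*47;/g, '/');
}

const stored = 'https:&#47;&#47;amzon.co.uk&#47;this&#47;that';
console.log(unescapeSlashes(stored));
// https://amzon.co.uk/this/that
```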
I'm having issues getting your JS function to give the same results as the PHP version, though. Given the same text string as above, I get
HTML:
Yay links work nicely http://<a href="www.bbc.co.uk">www.bbc.co.uk</a> or <a href="www.itv.com">www.itv.com</a> or https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh

There is also the follow-up issue that, having edited the links' appearance (removing some of the text etc.), my edit/delete functions break, as they use the text from the field rather than looking up the source again. So I'd have to write a JS function that looks at each link's href value and replaces every <a> tag with the contents of its href rather than its text value. Currently it's just using jQuery text() :/
eg
Code:
$("#edititem input").val(itemyours.children("span").text().trim());

_________________________________
Leozack
 
OK, I've redone the functions as such
PHP:
function LinkURLs($text) {
	$text = str_replace("&#47;","/",$text);
	return preg_replace_callback('@((?:(?:http(?:s)?|ftp|file))?(?::\/\/))?((?:[-\w]+\.)+(?:[^,.!?\"\'<\s])+)@', function($match) {
		if (strlen($match[2]) > 30) {
			$text = substr($match[2], 0, 20)."...".substr($match[2], -10, 10);
		} else {
			$text = $match[2];
		}
		if (!$match[1]) {
			return "<a href=\"http://{$match[2]}\" data=\"{$match[1]}{$match[2]}\">$text</a>";
		} else {
			return "<a href=\"{$match[1]}{$match[2]}\" data=\"{$match[1]}{$match[2]}\">$text</a>";
		}
	}, $text);
}
The linking is now a little simpler in its matches, as it only captures the optional prefix (part 1) and then the meat of the link (part 2), which is what gets shown, truncated if necessary. The original link is stored in the attribute "data". This is needed because once I've truncated part 2, or added an http:// on the front to make a working link where there was no part 1, neither the text nor the href contains the original, so neither is usable for edit/delete in the file. I can now retrieve the 'data' attribute to "unlink" for edit/delete functionality.

JavaScript:
	function LinkURLs(text) {
		text.replace('&#047;','/');
		return text.replace(/((?:(?:http(?:s)?|ftp|file))?(?::\/\/))?((?:[-\w]+\.)+(?:[^,.!?\"\'<\s])+)/ig, function(match) {
			//return text.replace(exp,"<a href='" + match + "'>" + match + "</a>");
			console.log("match : "+match);
			return '<a href="' + match + '">' + match.slice(0, 50) + '</a>';
			//return "THIS"+match+"THAT";
		});
	}
The JS LinkURLs doesn't work quite as well as the PHP version.
So on the text
Code:
Yay links work nicely http://www.bbc.co.uk or www.itv.com or https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh
it results in
Code:
Yay links work nicely http://<a href="www.bbc.co.uk">www.bbc.co.uk</a> or <a href="www.itv.com">www.itv.com</a> or https://<a href="https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh">amzon.co.uk/this/that/lotsofotherst</a>
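
A few things may be contributing to the difference between the two versions. In JavaScript, `replace()` returns a new string and leaves `text` untouched, so the first line of the function above has no effect. A string pattern also only replaces the first occurrence. And the callback receives the capture groups as extra arguments after the whole match, which the function above never uses. Below is a sketch of a JavaScript version that mirrors the PHP one under those assumptions. The regular expression is the one from the post; the parameter names `prefix` and `rest` and the constant name `LINK_PATTERN` are made up for illustration.

```javascript
// Same pattern as the PHP version: group 1 is the optional scheme prefix
// (e.g. "https://"), group 2 is the rest of the link.
const LINK_PATTERN = /((?:(?:http(?:s)?|ftp|file))?(?::\/\/))?((?:[-\w]+\.)+(?:[^,.!?"'<\s])+)/ig;

function linkURLs(text) {
    // Unescape stored slashes. replace() returns a new string, so reassign.
    text = text.replace(/&#0*47;/g, '/');
    return text.replace(LINK_PATTERN, function (match, prefix, rest) {
        // prefix is undefined when the optional group did not take part.
        const original = (prefix || '') + rest;
        // Show at most the first 20 and last 10 characters of long links.
        const shown = rest.length > 30
            ? rest.slice(0, 20) + '...' + rest.slice(-10)
            : rest;
        // Add a scheme only when the original text had none.
        const href = prefix ? original : 'http://' + rest;
        return '<a href="' + href + '" data="' + original + '">' + shown + '</a>';
    });
}

console.log(linkURLs('or www.itv.com or'));
// or <a href="http://www.itv.com" data="www.itv.com">www.itv.com</a> or
```

One limitation carried over from the pattern itself: the final part of group 2 stops at the first dot, so a link such as example.com/index.html would only match up to example.com/index.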

As for the "unlinking" to get the original back from the data attribute
PHP:
function UnlinkURLs($text) {
	// (?:<a)+.+?(?:data=")(.+?)"+.+?(?:<\/a>)
	return preg_replace_callback('@(?:<a)+.+?(?:data=")(.+?)"+.+?(?:<\/a>)@', function($match) {
		return "{$match[1]}";
	}, $text);
}
JavaScript:
	function UnlinkURLs(text) {
		return text.replace(/<a.*?data="|".*?<\/a>/ig, "");
	}
I made the unlinkURLs function to try and replace the full <a href="link" data="original">text</a> with just the 'original' from the data attribute - and this seems to work.
I originally tried a regex like
Code:
<a.+?data="(.+?)"+.+?<\/a>
as seen in PHP version to replace it all with the value of data but it didn't seem to work testing in JS whereas my cruder regex of
Code:
<a.*?data="|".*?<\/a>
as seen in JS version to just delete around the data value seemed to work ok :|
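
For comparison, here is a sketch of a capture-group version in JavaScript. It assumes the anchors have the shape produced by the PHP function above, with a double-quoted `data` attribute inside the opening tag. The name `unlinkURLs` is illustrative. The key detail is passing `'$1'` as the replacement, which inserts the captured attribute value in place of the whole anchor.

```javascript
// Replace each <a ... data="original">text</a> with just `original`.
function unlinkURLs(html) {
    // [^>]* keeps the search inside the opening tag, ([^"]*) captures the
    // attribute value, and [\s\S]*? lazily skips the link text up to </a>.
    return html.replace(/<a\b[^>]*\sdata="([^"]*)"[^>]*>[\s\S]*?<\/a>/gi, '$1');
}

const html = 'or <a href="http://www.itv.com" data="www.itv.com">www.itv.com</a> or';
console.log(unlinkURLs(html));
// or www.itv.com or
```

Unlike deleting the text on either side of the value, this leaves any other double quote in the surrounding text alone. It needs the markup as input, so a field read with jQuery text() would not contain the anchors at all.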

So yeah - progress, but fiddly/messy and not quite perfect. All input welcome!
I've been testing with and
_________________________________
Leozack
 
Any thoughts guys? Be great to tie this one up :S

_________________________________
Leozack
 
Essentially I have a webpage that pulls from a flatfile. I'm trying to show long links as truncated text over the full link anchor, and the PHP part works when displaying the page. But on the page are buttons that let you edit/add items. To edit, I need to know the original text of the item, which won't match the new truncated links, so I keep the original in a "data" attribute of the <a> and truncate/fix up what is shown on the page. Editing then replaces the link's text with what's in the data attribute so that the original flatfile text item matches OK. But without refreshing the page, I am relying on JS to do the same truncating/fixing up as the PHP does for all new entries and edited entries being saved. That's the part that's not quite right here. I swear this is something people must've been doing for decades, but I've not found simple routines for it. I'd love to close this one off if anyone has this sussed :S

_________________________________
Leozack
 