Turning links in text into truncated hyperlinks

Leozack · Dec 11, 2016

Hi all
I have the following code which does a reasonable job of finding various styles of links within text (e.g. http://bbc.co.uk or just http://www.itv.com) and turning them into links.

PHP:

function linkURLs($text) {
	$text = str_replace("&#47;","/",$text);
	return preg_replace('@(http)?(s)?(://)?(([-\w]+\.)+([^\s]+)+[^,.\s])@', '<a href="http$2://$4">$1$2$3$4</a>', $text);
	$text = str_replace("/","&#47;",$text); // think this line is redundant now
}

My problem is that if someone enters a really long link I'd like to truncate what actually shows on screen, e.g. "http://www.here.com/there/somewhere/longlink/blah.html" would still link there but show as just "http://www.here.com/there/...".
I've tried a few things but failed. I certainly don't seem able to tamper with the preg replacement string, e.g. "substring('$1$2$3$4',0,50)" doesn't work.
Can anyone shed light on what I thought would be a common thing to want to do (yet lots of people search/post online for how to do it)?

I also need to achieve the same thing with my javascript version ...

JavaScript:

	function LinkURLs(text) {
		text.replace('&#047;','/');
		var exp = /((?:(?:https?|ftp|file):\/\/|www\.|ftp\.)(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#\/%=~_|$?!:,.])*(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[A-Z0-9+&@#\/%=~_|$]))/ig
		return text.replace(exp,"<a href='$1'>$1</a>");
	}

_________________________________
Leozack

Code:

MakeUniverse($infinity,1,42);

ChrisHirst · Dec 11, 2016

Why do you need to get into regular expressions? It only warrants a simple string slicing exercise.

Test for http|https|ftp|whatever with if strpos($string, ....)

Get the first 'X' number of characters (starting after '://') and the last 'X' number of characters (ending at the first space character after '://', i.e. starting at that position minus the length you want), then concatenate the two with:

PHP:

$front . '....' . $rear
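
The slicing idea above can be sketched in a few lines. This is only an illustration, written in JavaScript because the thread needs a browser-side version too; the function name truncateMiddle and its parameters are hypothetical, and it uses a three-dot separator.

```javascript
// Hypothetical helper: keep the first `front` and the last `rear`
// characters of a string and join them with "..." when it is too long.
function truncateMiddle(str, front, rear) {
    // Leave short strings alone; cutting them would not save any space.
    if (str.length <= front + rear + 3) {
        return str;
    }
    return str.slice(0, front) + '...' + str.slice(str.length - rear);
}
```

With front = 20 and rear = 10, "amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh" becomes "amzon.co.uk/this/tha...ages/ahhhh". Finding where the link starts and ends in the surrounding text is a separate step.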

Chris.

Indifference will be the downfall of mankind, but who cares?
Time flies like an arrow, however, fruit flies like a banana.

Never mind this jesus character, stars had to die for me to live.

Leozack · Dec 11, 2016

Why use regex? I think you answered that when you say "test for whatever", which is what the pattern matching is for? Also this has to only apply to the text of the link whilst turning it into the full link. So it's a bit more than a "search for START, concat with END" jobby

feherke · Dec 12, 2016

Hi

Use preg_replace_callback() instead. And remove all those completely useless str_replace() (and String.replace()) calls that change slash with slash.

PHP:

function linkURLs($text) {
    return preg_replace_callback('@(http)?(s)?(://)?(([-\w]+\.)+([^\s]+)+[^,.\s])@', function($match) {
        $text = substr($match[1] . $match[2] . $match[3] . $match[4], 0, 50);
        return "<a href=\"http{$match[2]}://{$match[4]}\">$text</a>";
    }, $text);
}

JavaScript:

function LinkURLs(text) {
    return text.replace(/((?:(?:https?|ftp|file):\/\/|www\.|ftp\.)(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#\/%=~_|$?!:,.])*(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[A-Z0-9+&@#\/%=~_|$]))/ig, function(match) {
        return '<a href="' + match + '">' + match.slice(0, 50) + '</a>';
    });
}

Feherke.
feherke.ga

ChrisHirst · Dec 12, 2016

Leozack said:
Why use regex? I think you answered that when you say "test for whatever", which is what the pattern matching is for? Also this has to only apply to the text of the link whilst turning it into the full link. So it's a bit more than a "search for START, concat with END" jobby

Man, you are way overthinking it. I'll see if I can find the code I wrote for this very purpose a few years back; it will be in my archive somewhere.

Chris.

Leozack · Dec 12, 2016

Thanks Feherke, I've turned your function into this

PHP:

function LinkURLs($text) {
	$text = str_replace("&#47;","/",$text);
	return preg_replace_callback('@(http)?(s)?(://)?(([-\w]+\.)+([^\s]+)+[^,.\s])@', function($match) {
		if (strlen($match[4]) > 30) {
			$text = substr($match[4], 0, 20)."...".substr($match[4], -10, 10);
		} else {
			$text = $match[4];
		}
		return "<a href=\"http{$match[2]}://{$match[4]}\">$text</a>";
	}, $text);
}

which also removes unnecessary http:// stuff on the front when showing the links whilst including the end of the long URLs as ChrisHirst suggested. So giving it this text

Code:

Yay links work nicely http://www.bbc.co.uk or www.itv.com or https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh

I get this output

HTML:

Yay links work nicely <a href="http://www.bbc.co.uk">www.bbc.co.uk</a> or <a href="http://www.itv.com">www.itv.com</a> or <a href="https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh">amzon.co.uk/this/tha...ages/ahhhh</a>

and the information stored in the file (and read by this function later) is
Yay links work nicely http://www.bbc.co.uk or http://www.itv.com or https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh
(the / are escaped by my storage function, which is why I unescape them before parsing for links). Long term it will be stored in a DB, but currently it is stored in JSON format in a flat file.
I'm having issues getting your JS function to give the same results as the PHP version though. Given the same text string as above, I get

HTML:

Yay links work nicely http://<a href="www.bbc.co.uk">www.bbc.co.uk</a> or <a href="www.itv.com">www.itv.com</a> or https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh

There is also the follow-up issue that, having edited the links' appearance (removing some of the text etc.), my edit/delete functions break, as they use the text from the field rather than re-looking up the source. So I'd have to write a JS function that looks at each link's href value and replaces all <a> tags with the contents of the href rather than the text value. Currently it's just using jQuery text() :/
e.g.

Code:

$("#edititem input").val(itemyours.children("span").text().trim());

Leozack · Dec 12, 2016

OK, I've redone the functions as such

PHP:

function LinkURLs($text) {
	$text = str_replace("&#47;","/",$text);
	return preg_replace_callback('@((?:(?:http(?:s)?|ftp|file))?(?::\/\/))?((?:[-\w]+\.)+(?:[^,.!?\"\'<\s])+)@', function($match) {
		if (strlen($match[2]) > 30) {
			$text = substr($match[2], 0, 20)."...".substr($match[2], -10, 10);
		} else {
			$text = $match[2];
		}
		if (!$match[1]) {
			return "<a href=\"http://{$match[2]}\" data=\"{$match[1]}{$match[2]}\">$text</a>";
		} else {
			return "<a href=\"{$match[1]}{$match[2]}\" data=\"{$match[1]}{$match[2]}\">$text</a>";
		}
	}, $text);
}

The linking is now a little simpler in its matches, as it only returns the optional prefix (part 1) and then the meat of the link (part 2), which is shown and truncated if necessary. The original link is stored in the attribute "data": once I've truncated part 2, or added http:// on the front to make a link when there was no part 1 originally, neither the text nor the href is usable for edit/delete in the file, as they no longer contain the original. Hence I can now retrieve the 'data' attribute to "unlink" for edit/delete functionality.

JavaScript:

	function LinkURLs(text) {
		text.replace('&#047;','/');
		return text.replace(/((?:(?:http(?:s)?|ftp|file))?(?::\/\/))?((?:[-\w]+\.)+(?:[^,.!?\"\'<\s])+)/ig, function(match) {
			//return text.replace(exp,"<a href='" + match + "'>" + match + "</a>");
			console.log("match : "+match);
			return '<a href="' + match + '">' + match.slice(0, 50) + '</a>';
			//return "THIS"+match+"THAT";
		});
	}

The JS LinkURLs doesn't work quite as well as the PHP version.
So on the text

Code:

Yay links work nicely http://www.bbc.co.uk or www.itv.com or https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh

it results in

Code:

Yay links work nicely http://<a href="www.bbc.co.uk">www.bbc.co.uk</a> or <a href="www.itv.com">www.itv.com</a> or https://<a href="https://amzon.co.uk/this/that/lotsofotherstuff/more-links-that-go-on-for-ages/ahhhh">amzon.co.uk/this/that/lotsofotherst</a>
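
The output above is consistent with the slashes still being escaped when the pattern runs. In JavaScript, replace() returns a new string rather than changing the original, and a plain string pattern only replaces the first occurrence. The following is a minimal sketch, assuming the stored text uses either &#47; or &#047; for a slash; the helper name unescapeSlashes is hypothetical.

```javascript
// Hypothetical helper: turn escaped slashes back into "/".
function unescapeSlashes(text) {
    // &#0*47; matches both "&#47;" and "&#047;"; the g flag replaces
    // every occurrence instead of only the first one.
    return text.replace(/&#0*47;/g, '/');
}

var stored = 'http:&#047;&#047;www.bbc.co.uk';
stored.replace('&#047;', '/');          // return value discarded: stored is unchanged
var fixed = unescapeSlashes(stored);    // the result has to be assigned to be used
```

Here fixed holds "http://www.bbc.co.uk", while stored still contains the escaped form.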

As for the "unlinking" to get the original back from the data attribute

PHP:

function UnlinkURLs($text) {
	// (?:<a)+.+?(?:data=")(.+?)"+.+?(?:<\/a>)
	return preg_replace_callback('@(?:<a)+.+?(?:data=")(.+?)"+.+?(?:<\/a>)@', function($match) {
		return "{$match[1]}";
	}, $text);
}

JavaScript:

	function UnlinkURLs(text) {
		return text.replace(/<a.*?data="|".*?<\/a>/ig, "");
	}

I made the UnlinkURLs function to replace the full <a href="link" data="original">text</a> with just the 'original' from the data attribute, and this seems to work.
I originally tried a regex like

Code:

<a.+?data="(.+?)"+.+?<\/a>

as seen in the PHP version, to replace it all with the value of data, but it didn't seem to work when testing in JS, whereas my cruder regex of

Code:

<a.*?data="|".*?<\/a>

as seen in the JS version, which just deletes everything around the data value, seemed to work OK :|
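
The capture-group approach can work in JavaScript as well, provided the replacement string refers back to the group with $1; replacing the whole match with an empty string throws the captured value away too. The following is a sketch of such a variant, assuming every generated link carries a data attribute in double quotes as in the PHP above. The tightened pattern is an assumption, not the one from the PHP version.

```javascript
function UnlinkURLs(text) {
    // [^>]*? keeps the search for data=" inside a single opening tag;
    // ([^"]*) captures the attribute value, and $1 in the replacement
    // puts that value back in place of the whole <a ...>...</a> element.
    return text.replace(/<a\b[^>]*?\bdata="([^"]*)"[^>]*>.*?<\/a>/ig, '$1');
}
```

For example, 'see <a href="http://www.itv.com" data="www.itv.com">www.itv.com</a> now' comes back as 'see www.itv.com now'.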

So yeah, progress, but fiddly/messy and not quite perfect. All input welcome!
I've been testing with http://www.regexpal.com/ and https://www.debuggex.com/

Leozack · Dec 15, 2016

Any thoughts guys? Be great to tie this one up :S

Leozack · Feb 20, 2017

Essentially I have a webpage that pulls from a flat file. I'm trying to replace long links with truncated text over the full link anchor, and the PHP part works when displaying the page. But on the page are buttons that let you edit/add items. To edit, I need to know the original text item, which won't match the new truncated links, so I keep the original in a "data" attribute of the <a> and truncate/fix up what is shown on the page. Editing then replaces the link's text with what's in the data attribute so that the original flat-file text item matches OK. But without refreshing the page, I am relying on JS to do the same truncating/fixing up as the PHP does for all new entries and edited entries being saved. That's the part that's not quite right here. I swear this is something people must have been doing for decades, but I've not found simple routines. I'd love to close this one off if anyone has this sussed :S
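
One possible way to bring the JavaScript side in line with the PHP callback, offered as a sketch rather than a tested drop-in: the function passed to replace() receives the capture groups as extra arguments after the whole match, so the prefix and the body of the link can be handled separately, much as $match[1] and $match[2] are in PHP. The pattern is copied unchanged from the thread; the parameter names prefix and rest are assumptions, as is accepting both &#47; and &#047; as escaped slashes.

```javascript
function LinkURLs(text) {
    // Unescape stored slashes first; replace() returns a new string,
    // so the result has to be assigned back.
    text = text.replace(/&#0*47;/g, '/');
    var pattern = /((?:(?:http(?:s)?|ftp|file))?(?::\/\/))?((?:[-\w]+\.)+(?:[^,.!?\"\'<\s])+)/ig;
    return text.replace(pattern, function (match, prefix, rest) {
        // Show at most the first 20 and last 10 characters of long links.
        var shown = rest.length > 30
            ? rest.slice(0, 20) + '...' + rest.slice(-10)
            : rest;
        // prefix is undefined when the link was typed without a scheme.
        var href = (prefix || 'http://') + rest;
        // The untouched original goes into data so it can be restored later.
        return '<a href="' + href + '" data="' + match + '">' + shown + '</a>';
    });
}
```

Given the test text used earlier in the thread, this should produce the same three anchors as the PHP version, with "amzon.co.uk/this/tha...ages/ahhhh" shown for the long link. Because the last character class in the pattern excludes ".", a match stops at the first dot inside a path, so a link ending in blah.html would be cut short at "blah"; that is a property of the pattern itself, not of the callback.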
