String replacement issue

Status
Not open for further replies.

Sleidia

Technical User
May 4, 2001
1,284
FR

Hi guys,

Still a newbie with preg/reg functions :(

I'm looking for a way to do a string replacement that replaces x with y inside Z, ONLY IF x isn't surrounded by an <a></a> tag ( i.e. x should be left alone in <a href="">x</a> ).

Thanks a lot!!! :)
 
Hi

Something like this ?
PHP:
$text=preg_replace(',(<a\b.*?>)x(</a>),','\\1y\\2',$text);

Feherke.
 
unless i read your question backwards, you are asking for a preg_replace that applies to all occurrences of a string except where that string is part of a hyperlink?

I guess two approaches are available:

Approach 1
find all occurrences of the string inside a hyperlink and replace them with a placeholder (assuming the links are all identical - if not, the links must be stored and each one replaced with a unique identifier)
find all remaining occurrences of the string (the ones inside links are now out of the way) and transform them to the replacement text
replace all placeholders with the previous values

Approach 2
it should be possible to construct a preg pattern that does a lookahead, if it is enough just to test for the closing of the <a> link. A lookbehind would be perfect, but lookbehinds must be fixed-length and so cannot use the * and + quantifiers, and assuming that you do not KNOW what the makeup of the <a > link is, I cannot think of a way to derive a decent pattern.
so here is a lookahead method with the limitations I mentioned previously

Code:
//to find the expression foobar NOT within a link
//(the \s* sits inside the lookahead so that it cannot backtrack past the whitespace)
$pattern = '/foobar(?!\s*<\/a>)/i';

/*
examples:
The foobar went awry  = MATCH
The <a href='whatever'>foobar</a> = NO MATCH
The <a href="whatever">  foobar  </a> = NO MATCH
The <span style="buzz">foobar</span> = MATCH
*/
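
For what it's worth, here is a minimal sketch of how such a lookahead could be plugged into preg_replace(). The function name replace_unless_closing_a(), the sample strings and the BAZ replacement are made up for illustration, and the whitespace check is assumed to sit inside the lookahead. The last call shows the limitation: when the link text holds more than the keyword, the keyword is still replaced.

```php
<?php
// Hypothetical helper: replace $term unless it is directly followed
// (ignoring whitespace) by a closing </a> tag.
function replace_unless_closing_a($text, $term, $replacement)
{
    // preg_quote() escapes any regex metacharacters in the term.
    $pattern = '/' . preg_quote($term, '/') . '(?!\s*<\/a>)/i';
    return preg_replace($pattern, $replacement, $text);
}

// Plain text: replaced.
echo replace_unless_closing_a('The foobar went awry', 'foobar', 'BAZ'), "\n";

// Keyword is the whole link text: left alone.
echo replace_unless_closing_a("The <a href='whatever'>  foobar  </a>", 'foobar', 'BAZ'), "\n";

// Limitation: more text follows inside the link, so the keyword IS replaced.
echo replace_unless_closing_a('<a href="x">foobar and more</a>', 'foobar', 'BAZ'), "\n";
```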
 
Hi

Hmm... Seems I misunderstood the question. May I try again ?
PHP:
$test=preg_replace_callback(',(<?[^<]*?)x([^>]*>?),','hmm',$test);

function hmm($m)
{
  if (preg_match(',<a\b.*?>,',$m[1]) && preg_match(',</a>,',$m[2])) return $m[0];
  return $m[1].'y'.$m[2];
}

Feherke.
 
Thanks jpadie :)

Yeah, I also thought of the same approach as Approach 1 but I wanted to do something more elegant (for a change ahaha)

As for your Approach 2, I don't get what its limitations are and why it doesn't satisfy you.

Actually, what I'm trying to do is turn words in the content into clickable links, as long as they are listed as meta-tag keywords in the database and as long as they are not already linked.
 
aha. a kind of glossary script?

i think my approach 2 might work providing that the keyword is the only thing within the link text. otherwise a callback is probably a good approach too, although I am still a bit uncertain as to how you can determine whether there is a surrounding link tag. i have not spent enough time deciphering feherke's pattern yet.

however, i can't help feeling that this might be better done in javascript. in js you could traverse the DOM and examine each node type to determine whether you are in a hyperlink. if you used a library like jQuery you could probably do everything in a few lines of code. you could also apply some nice effects to the display of the glossary (or whatever) meanings
 
I too will have to decipher feherke's solution (note that it doesn't make me your equal ahaha).

As for the javascript idea, it sounds good technically, but it would make it impossible for some spiders to see the generated links. The thing is that I'm doing all this to improve the internal linking of the pages on my websites ... so it's for SEO purposes.
 
Hi

Sleidia said:
decipher feherke's solution
Huh ? That is just a regular expression. The only difference is that you can perform more operations on the captured groups, because the user function receives an array with them.

As a simple example, the following replacements put the vowels in parentheses ( () ) :
PHP:
$test='Hello World';

// 1

echo preg_replace('/([aeiou])/','(\\1)',$test),"\n";

// 2

echo preg_replace_callback('/([aeiou])/',create_function('$what','return "(".$what[1].")";'),$test),"\n";

// 3

echo preg_replace_callback('/([aeiou])/','gotthis',$test),"\n";

function gotthis($what)
{
  return '('.$what[1].')';
}
Code:
H(e)ll(o) W(o)rld
H(e)ll(o) W(o)rld
H(e)ll(o) W(o)rld
As I see, you are more familiar with JavaScript, so here is the JavaScript way of doing the same as above :
JavaScript:
test='Hello World'

// 1

alert(test.replace(/([aeiou])/g,'($1)'))

// 2

alert(test.replace(/([aeiou])/g,function(p0,p1){return '('+p1+')'}))

// 3

function gotthis(p0,p1)
{
  return '('+p1+')'
}

alert(test.replace(/([aeiou])/g,gotthis))

Feherke.
 
feherke,
how do you determine using the callback whether the captured target is within a hyperlink?
 
Hi

jpadie said:
how do you determine using the callback whether the captured target is within a hyperlink?
Vaguely.

I captured a somewhat relevant portion of text before and after the match. The x is then left unchanged if the text before it contains an opening a tag and the text after it contains a closing a tag; otherwise it is changed.

It will probably fail in the case of nested tags ( <a><b>x</b></a> ) and multiple x ( <a>xx</a> ).

Hmm... Thinking again, probably a two step solution based on preg_match_all() with PREG_OFFSET_CAPTURE and substr_replace() in a foreach loop would be the best approach.
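
A minimal sketch of that two step idea, under a few assumptions: the function name replace_outside_links() and the sample string are invented for illustration, links are assumed not to be nested, and terms sitting in attributes of other tags (an img alt text, for example) are not protected. The first step records where each link starts and ends, the second replaces only the hits that fall outside those ranges.

```php
<?php
// Hypothetical helper: replace $term everywhere except inside <a>...</a>.
function replace_outside_links($text, $term, $replacement)
{
    // Step 1: record the [start, end) byte range of every link element.
    preg_match_all(',<a\b[^>]*>.*?</a>,is', $text, $links, PREG_OFFSET_CAPTURE);
    $ranges = array();
    foreach ($links[0] as $link) {
        $ranges[] = array($link[1], $link[1] + strlen($link[0]));
    }

    // Step 2: locate every occurrence of the term together with its offset.
    preg_match_all('/' . preg_quote($term, '/') . '/', $text, $hits, PREG_OFFSET_CAPTURE);

    // Work from the last hit to the first, so that the offsets of the
    // remaining hits stay valid after each substr_replace().
    foreach (array_reverse($hits[0]) as $hit) {
        list($found, $offset) = $hit;
        $inside = false;
        foreach ($ranges as $range) {
            if ($offset >= $range[0] && $offset < $range[1]) {
                $inside = true;
                break;
            }
        }
        if (!$inside) {
            $text = substr_replace($text, $replacement, $offset, strlen($found));
        }
    }
    return $text;
}

// Covers the nested-tag and multiple-x cases mentioned above.
echo replace_outside_links('x <a href="#">x</a> <a href="#"><b>x</b></a> <a href="#">xx</a> x', 'x', 'y'), "\n";
```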

Sleidia, could we have some exact sample text ?

Feherke.
 
Hi

Thinking again, you probably have more than one glossary term, so more than one replacement will be needed. So jpadie's 1st approach could be a winner.
  • Replace all <a\b.*?>.+?</a> with placeholders like "{LINK1}" and store the matched substring in an array.
  • Replace all terms with the links as needed.
  • Replace all placeholders with the saved original substring.
So preg_replace_callback() will still be needed.
PHP:
$text=<<<ENDOFTEXT
I love Perl and PHP.
I learned <a href="http://perl.org/">Perl</a> long before <a href="http://php.net/">PHP</a>.
<b>PHP</b>'s documentation became noob friendly before <a href="http://perldoc.perl.org/"><b>Perl</b></a>'s.
Hovewer, <a href="http://cpan.org/">Perl and Perl's module repository</a> is still unbeatable.
ENDOFTEXT;

$term=array(
  'Perl'=>'/my/perl/scripts',
  'PHP'=>'/my/php/scripts'
);

echo "- - ORIGINAL - -\n$text\n";

$orig=array();
$text=preg_replace_callback(',<a\b.*?>.*?</a>,','getlink',$text);

foreach ($term as $word=>$link)
  $text=str_replace($word,"<a href=\"$link\">$word</a>",$text);

foreach ($orig as $number=>$link)
  $text=str_replace("{LINK$number}",$link,$text);

function getlink($match)
{
  global $orig;
  $orig[]=$match[0];
  return '{LINK'.(count($orig)-1).'}';
}

echo "- - MODIFIED - -\n$text\n";
Code:
- - ORIGINAL - -
I love Perl and PHP.
I learned <a href="http://perl.org/">Perl</a> long before <a href="http://php.net/">PHP</a>.
<b>PHP</b>'s documentation became noob friendly before <a href="http://perldoc.perl.org/"><b>Perl</b></a>'s.
Hovewer, <a href="http://cpan.org/">Perl and Perl's module repository</a> is still unbeatable.
- - MODIFIED - -
I love <a href="/my/perl/scripts">Perl</a> and <a href="/my/php/scripts">PHP</a>.
I learned <a href="http://perl.org/">Perl</a> long before <a href="http://php.net/">PHP</a>.
<b><a href="/my/php/scripts">PHP</a></b>'s documentation became noob friendly before <a href="http://perldoc.perl.org/"><b>Perl</b></a>'s.
Hovewer, <a href="http://cpan.org/">Perl and Perl's module repository</a> is still unbeatable.
Could there be other special cases ?
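
Two cases that might matter are a term that is part of a longer word (PHP inside PHPUnit) and a term that also shows up inside a link inserted by an earlier replacement. A possible variant of the placeholder approach, only a sketch under assumptions: link_terms() is an invented name, all terms are linked in one pass with word boundaries, closures (PHP 5.3 or later) stand in for the global, and terms inside attributes of other tags are still not protected.

```php
<?php
// Hypothetical helper: $terms maps each word to the URL it should link to.
function link_terms($text, array $terms)
{
    if (!$terms) {
        return $text;
    }
    $saved = array();

    // Step 1: swap every existing link for a placeholder and keep the original.
    $text = preg_replace_callback(',<a\b[^>]*>.*?</a>,is', function ($m) use (&$saved) {
        $saved[] = $m[0];
        return '{LINK' . (count($saved) - 1) . '}';
    }, $text);

    // Step 2: link all terms in a single pass, whole words only, so that
    // freshly inserted links are never scanned again.
    $alternatives = implode('|', array_map(function ($word) {
        return preg_quote($word, '/');
    }, array_keys($terms)));
    $text = preg_replace_callback('/\b(' . $alternatives . ')\b/', function ($m) use ($terms) {
        return '<a href="' . $terms[$m[1]] . '">' . $m[1] . '</a>';
    }, $text);

    // Step 3: put the saved links back in place of their placeholders.
    return preg_replace_callback('/\{LINK(\d+)\}/', function ($m) use ($saved) {
        return $saved[(int) $m[1]];
    }, $text);
}

echo link_terms('PHP and PHPUnit, <a href="http://php.net/">PHP</a>', array('PHP' => '/my/php/scripts')), "\n";
```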

Feherke.
 