death to regular expressions!

cLFlaVA · Jan 23, 2005

Hi all. I have content as follows, entered into a textarea:

Code:

blah blah blah <b>blah</b> blah blah.
<div id="code">
blah blah blah <b>blah</b> blah blah.
</div>
blah blah blah <b>blah</b> blah blah.

And, using, I guess preg_replace or eregi_replace, I'd like to return the following:

Code:

blah blah blah blah blah blah.
<div id="code">
blah blah blah &lt;b&gt;blah&lt;/b&gt; blah blah.
</div>
blah blah blah blah blah blah.

Can someone help me here? Thanks.

*cLFlaVA
----------------------------
[tt]tastes great, less filling.[/tt]

Olavxxx · Jan 25, 2005

Yes, I think DRJ478 is onto something here.

Olav Alexander Mjelde

http://www.volvo-power.net

Admin & Webmaster

DRJ478 · Jan 25, 2005

getElementsByTagName() would be the appropriate way to access all <div> tags. You can then inspect the attributes of each one to determine whether it is of the kind you want to handle.
As far as using the ID goes: be aware that IDs are (theoretically) unique within a document. The id in your sample probably comes from CSS usage. To keep IDs unique, it's better to apply a class, which has no uniqueness requirement, and your CSS will still apply the desired rendering.
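
A minimal sketch of that lookup, assuming PHP's DOM extension is available. The sample markup and the variable names are made up for illustration; only the class name "code" comes from this thread.

```php
<?php
// Hypothetical sample input; only the class name "code" comes from the thread.
$html = '<p>intro</p><div class="code">a <b>b</b></div><div class="note">n</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$codeDivs = [];
foreach ($doc->getElementsByTagName('div') as $div) {
    // Inspect the attribute to keep only the divs we want to handle.
    if ($div->getAttribute('class') === 'code') {
        $codeDivs[] = $div;
    }
}

echo count($codeDivs), "\n";
```

The comparison is an exact match on the class attribute, so a div carrying several classes would need a looser check.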

cLFlaVA · Jan 25, 2005

Yeah, you're right - technically, my sample should have had class="code", not id="code". That was just an oversight in the sample I provided.

cLFlaVA · Jan 25, 2005

OK, so let me think out loud for a second...

The following text is entered in a textarea:

Code:

blah fjdkf djfkl ;lkadsl uirei <b>kslfdfkdlf</b> ruei wioep ewiureiroei  rioe iroe iroe rieorie opoerip.

<div class="code">
stupid blah blah <b>green</b> red dunkin donuts text
</div>

blah <em>fjdkf djfkl</em> ;lkadsl uirei <b>kslfdfkdlf</b> ruei wioep ewiureiroei  rioe iroe iroe rieorie opoerip.

I could then use the [tt]loadHTML()[/tt], [tt]getElementsByTagName()[/tt] and [tt]getAttribute()[/tt] functions to get all divs with a class of "code", and then use [tt]htmlspecialchars()[/tt] to make sure the HTML code displays on the screen as-is.

So the html would display as follows:

Code:

blah fjdkf djfkl ;lkadsl uirei [b]kslfdfkdlf[/b] ruei wioep ewiureiroei  rioe iroe iroe rieorie opoerip.

stupid blah blah <b>green</b> red dunkin donuts text

blah [i]fjdkf djfkl[/i] ;lkadsl uirei [b]kslfdfkdlf[/b] ruei wioep ewiureiroei  rioe iroe iroe rieorie opoerip.
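
One way the plan above might look in PHP, as a rough sketch only. The function name escapeCodeDivs and the wrapper div are my own inventions, and the sketch assumes plain ASCII input with non-nested code divs. Instead of calling htmlspecialchars() directly, it stores the inner markup in a text node and lets saveHTML() do the escaping when it serializes that node.

```php
<?php
// Hypothetical helper name; not from the thread.
function escapeCodeDivs(string $html): string
{
    $doc = new DOMDocument();
    // A wrapper div gives the fragment a single root that is easy to find again.
    $doc->loadHTML('<div id="wrapper">' . $html . '</div>');
    $wrapper = $doc->getElementsByTagName('div')->item(0);

    // Copy the live node list first, because the loop changes the tree.
    $divs = iterator_to_array($doc->getElementsByTagName('div'), false);
    foreach ($divs as $div) {
        if ($div->getAttribute('class') !== 'code') {
            continue;
        }
        // Serialize the children back to markup ...
        $inner = '';
        foreach ($div->childNodes as $child) {
            $inner .= $doc->saveHTML($child);
        }
        // ... then swap them for one text node holding that markup.
        while ($div->firstChild) {
            $div->removeChild($div->firstChild);
        }
        $div->appendChild($doc->createTextNode($inner));
    }

    // Return only what sits inside the wrapper, not the html/body shell.
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo escapeCodeDivs('x <b>y</b><div class="code">a <b>b</b></div>'), "\n";
```

Markup outside the code divs passes through the parser as well, so it may come back slightly normalized.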

DRJ478 · Jan 25, 2005

I've been thinking more about this.
First thought: why use <div> tags if HTML has a designated tag for such content, namely <code> ... </code>?
This would rule out all other <div> tags and make the whole thing much easier. Instead of declaring <div class="code">, a plain CSS rule for code will suffice.
Then the regex will work fine - as long as no nested <code> tags are present. In that case, the "e" modifier will provide the desired functionality for evaluating the expression:

Code:

$text = 'blah fjdkf djfkl ;lkadsl uirei <b>kslfdfkdlf</b> ruei wioep ewiureiroei  rioe iroe iroe rieorie opoerip.

<code>
stupid blah blah <b>green</b> red <code>dunkin</code> donuts text
</code>

blah <em>fjdkf djfkl</em> ;lkadsl uirei <b>kslfdfkdlf</b> ruei wioep ewiureiroei  rioe iroe iroe rieorie opoerip.
';

$pattern = "/(<code>)(.*)(<\/code>)/esi";
$newText = preg_replace($pattern,"'\\1'.htmlspecialchars('\\2').'\\3'",$text);
echo $newText;

The DOM route only needs to be taken if nested tags are there - and then each <code> element still needs to be examined to see whether it has another <code> tag among its ancestors.
Nested <code> tags are painful, so if you can stay away from them, do so.
Hope this helps.
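
For anyone running this on a current PHP: the "e" modifier was deprecated in PHP 5.5 and removed in PHP 7, so the pattern above no longer works there as written. A rough equivalent using preg_replace_callback() might look like the sketch below. The sample text is a shortened stand-in I made up, not the full example.

```php
<?php
// Shortened, hypothetical stand-in for the sample text above.
$text = "before <b>x</b>\n<code>\nred <b>green</b>\n</code>\nafter";

// Same pattern as above, minus the "e" modifier.
$pattern = '/(<code>)(.*)(<\/code>)/si';

$newText = preg_replace_callback(
    $pattern,
    function (array $m): string {
        // $m[1] and $m[3] are the tags, $m[2] is the content between them.
        return $m[1] . htmlspecialchars($m[2]) . $m[3];
    },
    $text
);

echo $newText;
```

The callback receives the match groups as a plain array, so nothing has to be quoted inside a replacement string.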

cLFlaVA · Jan 25, 2005

I do plan on staying away from nested <code> tags. The main idea is that I'll be writing CSS / JavaScript articles, and want to be able to switch from normal text to styled text (basically green Courier New "pre" text).

Could you elaborate on "e"?

Thanks a lot

DRJ478 · Jan 25, 2005

The "e" pattern modifier, as in the example above, tells the regex engine to evaluate the replacement as a PHP expression.
In this case it says (translated into human language):

Code:

"      -> open replacement expression
'\\1'  -> content of the first parentheses (the <code> tag); it is quoted since it is a string
.      -> concatenate
htmlspecialchars('\\2') -> run the htmlspecialchars function on the second match (the content of the tag)
.      -> concatenate
'\\3'  -> the third match (the closing </code> tag)
"      -> close replacement expression

The /e modifier simply causes the whole replacement to be treated as PHP code, with its arguments taken from the match through backreferences.
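
To see what each backreference holds, the same three groups can be pulled out with preg_match(). This is only an illustration; the sample string is invented.

```php
<?php
$pattern = '/(<code>)(.*)(<\/code>)/si';
$sample  = 'see <code>a <b>b</b></code> here'; // hypothetical sample

preg_match($pattern, $sample, $m);

// $m[0] is the whole match; $m[1], $m[2], $m[3] are the three
// parenthesized groups that the backreferences 1, 2 and 3 refer to.
foreach ([1, 2, 3] as $i) {
    echo "group $i: ", $m[$i], "\n";
}
```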

Clear?

cLFlaVA · Jan 25, 2005

Ah, I see. I had been thinking that

"'\\1'.htmlspecialchars('\\2').'\\3'"

was the literal string you wanted to replace it with, since there are double quotes around the entire thing!

Thanks buddy, you've taught me a lot.

cLFlaVA · Jan 25, 2005

DRJ478 -

This worked as expected. Is there an additional modifier I can use so this is done to all instances of <code></code>?

I thought it was "g" for global, but that gave me an error.

cLFlaVA · Jan 25, 2005

Update:

After trying a lot of options, I finally found the complete regex.

Thanks again to everyone for your continued help.

Code:

$pattern = "/(<code>)(.*?)(<\\/code>)/esi";
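
A small sketch of why the added "?" matters. PHP's preg functions have no "g" modifier; preg_replace() already replaces every match unless a limit is passed. With the greedy .* the single match simply ran from the first <code> to the last </code>. The input string and variable names below are made up, and the sketch uses preg_replace_callback() rather than the "e" modifier so that it also runs on PHP 7 and later.

```php
<?php
// Hypothetical input with two separate <code> blocks.
$text = 'a <code><b>1</b></code> mid <i>x</i> <code><b>2</b></code> z';

$escape = function (array $m): string {
    return $m[1] . htmlspecialchars($m[2]) . $m[3];
};

// Greedy: .* runs from the first <code> to the last </code>.
$greedy = preg_replace_callback('/(<code>)(.*)(<\/code>)/si', $escape, $text);

// Lazy: .*? stops at the nearest </code>, so each block matches on its own.
$lazy = preg_replace_callback('/(<code>)(.*?)(<\/code>)/si', $escape, $text, -1, $count);

echo $greedy, "\n", $lazy, "\n", $count, "\n";
```

In the greedy result the <i> tag between the two blocks gets escaped as well, which is the symptom of one oversized match.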
