regular expression help

Status
Not open for further replies.

justride

Programmer
Jan 9, 2004
251
US
Hello all,

I am attempting to search and replace some XML text using PHP's regular expression library.

I want to find all occurrences of >some alpha text<
within an XML string and replace each with ><font ...>some alpha text</font><

Here is what I have so far:

Code:
$pattern = ">[a-zA-Z0-9]*<";
$replace = "><font ...>\\1</font><";
$html = preg_replace($pattern,$replace,$html);

PHP gives: Warning: preg_replace() [function.preg-replace]: No ending delimiter '>' found

but a text editor finds all occurrences of >< or >text<

Any suggestions?

Thanks
 
Change:
Code:
$pattern = ">[a-zA-Z0-9]*<";
so that it has delimiters, for example forward slashes:
Code:
$pattern = "/>[a-zA-Z0-9]*</";
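
To round this out, here is a minimal sketch of the delimiter fix in a complete script. The sample string and the color attribute are made up for illustration and are not from the original post. It also assumes two small changes beyond adding the slashes: round brackets around the character class, because \\1 has nothing to refer to without a capture group, and + instead of * so that empty >< pairs are left alone.

```php
<?php
// Hypothetical sample input, not taken from the original post.
$html = '<a>hello</a><b>world42</b>';

// The slashes delimit the pattern. The round brackets capture the text
// so that \1 in the replacement refers to it. Using + (one or more)
// instead of * is an assumption, made here to skip empty >< pairs.
$pattern = '/>([a-zA-Z0-9]+)</';
$replace = '><font color="green">\\1</font><';

$result = preg_replace($pattern, $replace, $html);
echo $result, "\n";
// <a><font color="green">hello</font></a><b><font color="green">world42</font></b>
```

In a single-quoted PHP string '\\1' reaches the regex engine as \1; '$1' would work equally well in the replacement.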
 
Thanks for the help. I am noticing that echoing \\1 prints multiple instances. Is there any way to grab the whole text within >whole text< and assign it to \\$1?

Code:
$pattern = "/>(^)*</";

That seems to be the syntax for the pattern, but when I use

Code:
$pattern = ">(^)*<";
$replace = "/>test\\$1</";
$html = preg_replace($pattern,$replace,$xml);

I get >testtestest actual data from \\$1<

and things like that. I just want to capture everything between > and < and replace it.

Thanks


 
I may be misunderstanding. From your posts, you want to take the text within tags and apply some green coloring to it using the <font> tag.

If so, then this code works for me:
Code:
<?php
// Sample input with two tagged sections.
$xml = "lots of text then a <sometag>then lots more text</sometag> then more text <sometag>and now some more</sometag>";

// Group 1 is the ">", group 2 is the text, group 3 is the "<".
$pattern = "/(>)([a-zA-Z0-9 ]*?)(<)/i";
$xml_r = preg_replace($pattern, '$1<font color="green">$2</font>$3', $xml);
echo $xml_r;
?>

But this won't work the way you intend if you have multiple tags within the document, because the text between one closing tag and the next opening tag also sits between a > and a <. Instead, you need to refine the pattern as follows:

Code:
$pattern = "/(<.*?>)([a-zA-Z0-9 ]*?)(<\/.*?>)/i";
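
To see the difference between the two patterns, here is a small comparison sketch. The input string is a shortened, hypothetical variant of the sample above, and the variable names are only for illustration.

```php
<?php
// Hypothetical input: two tagged sections with plain text between them.
$xml = 'intro <sometag>first text</sometag> middle <sometag>second text</sometag>';

$loose   = '/(>)([a-zA-Z0-9 ]*?)(<)/';                // anything between > and <
$refined = '/(<.*?>)([a-zA-Z0-9 ]*?)(<\/.*?>)/';      // opening tag, text, closing tag
$replace = '$1<font color="green">$2</font>$3';

$looseResult   = preg_replace($loose, $replace, $xml);
$refinedResult = preg_replace($refined, $replace, $xml);

// The loose pattern also wraps " middle ", which lies between two elements.
echo $looseResult, "\n";
// The refined pattern only wraps the text inside each element.
echo $refinedResult, "\n";
```

Neither pattern is a substitute for a real XML parser; nested tags or text containing punctuation will not match the character class used here.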
 
Yes, I have multiple tags. Is that what the *? is doing?

Thanks for all the help thus far.
 
Let me decode the pattern for you:
Code:
$pattern = "/(<.*?>)([a-zA-Z0-9 ]*?)(<\/.*?>)/i";
The forward slashes at the beginning and end of the string are pattern delimiters. They tell the engine where the pattern ends and thus where the modifiers begin.

The i at the end is the case-insensitive modifier. It is not strictly needed here, because the character class already lists both upper and lower case.
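
As a quick illustration of delimiters and modifiers, here is a minimal sketch with a made-up subject string. It assumes nothing beyond preg_match itself. The delimiter does not have to be a slash; # is a common alternative.

```php
<?php
// Hypothetical subject string for illustration only.
$subject = 'Hello WORLD';

// Same pattern and modifier, two different delimiters.
$withSlash = preg_match('/world/i', $subject);
$withHash  = preg_match('#world#i', $subject);

// Without the i modifier the match is case sensitive and fails here.
$caseSensitive = preg_match('/world/', $subject);

var_dump($withSlash, $withHash, $caseSensitive); // int(1), int(1), int(0)
```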

Round brackets cause the text matched by the pattern within them to be captured as a group (often loosely called a backreference). The captured text is reusable in the replacement by referencing it as $n, where n is the (1-based) number of the set of round brackets.

So the first set of round brackets says:
look for a string that starts with "<", then has any character (the dot) repeated lazily (the ?) zero or more times (the *), and is then followed by a closing angle bracket ">".
The second set of round brackets says:
look for a string that contains any letter, any digit or a space (the "or" is created by using the square brackets), repeated lazily zero or more times; and
the third set of round brackets says: look for a string that starts with "</", then contains any old text, and finally a ">". The backslash is included because the forward slash is the pattern delimiter here, so if you want to use it literally inside the pattern you must escape it. The backslash is the escape character.

Together, all three sets of round brackets must be satisfied to get a match.
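
To check what each set of round brackets actually captures, a short sketch with preg_match can help. The subject string below is hypothetical.

```php
<?php
$pattern = '/(<.*?>)([a-zA-Z0-9 ]*?)(<\/.*?>)/';
// Hypothetical subject string for illustration only.
$subject = 'before <sometag>inner text</sometag> after';

if (preg_match($pattern, $subject, $groups)) {
    // Index 0 holds the whole match; 1 to 3 hold the three groups.
    foreach ($groups as $n => $text) {
        echo $n, ' => ', $text, "\n";
    }
}
// 0 => <sometag>inner text</sometag>
// 1 => <sometag>
// 2 => inner text
// 3 => </sometag>
```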

The replacement string replaces the whole match with:
(i) the text captured by the first set of round brackets ($1) [remember the backreferences]
(ii) the opening <font color="green"> tag
(iii) the text captured by the second set of round brackets ($2)
(iv) the closing </font> tag
(v) the text captured by the third set of round brackets ($3)


FYI, lazy vs greedy: a lazy quantifier stops at the shortest match that lets the rest of the pattern succeed, whereas a greedy quantifier carries on to the longest one.
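
The difference is easiest to see side by side. A minimal sketch, using a made-up string with two <b> elements:

```php
<?php
// Hypothetical subject string for illustration only.
$subject = '<b>one</b> and <b>two</b>';

// Greedy: .* runs as far as it can, so the match ends at the LAST </b>.
preg_match('/<b>.*<\/b>/', $subject, $greedy);

// Lazy: .*? stops as soon as it can, so the match ends at the FIRST </b>.
preg_match('/<b>.*?<\/b>/', $subject, $lazy);

echo $greedy[0], "\n"; // <b>one</b> and <b>two</b>
echo $lazy[0], "\n";   // <b>one</b>
```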
 