preg_replace() syntax off

Status
Not open for further replies.

justride

Programmer
Jan 9, 2004
251
US
I'm very incompetent with this regexp stuff.

Anyway, I am using preg_replace() to replace backslashes with forward slashes in all HTML links. This is what I have:
Code:
return preg_replace('/<a[^>]*>/','whatgoeshere',$str);

The function spots the links in $str, but I don't know the syntax to replace only the backslashes with forward slashes. Any ideas?

THANKS!
 
I just had a very similar thread: thread434-966516
Read there for the detailed explanation of the expression. There is very little difference, which I will explain.
Code:
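<?php
// Sample anchor tag; the href contains a backslash that should become a forward slash.
$text = '<A name="whatever" class="myclass" href = "here.com\test.html" id="whatever">';
// Captured parts: (1) tag up to the opening quote, (2) the URL, (3) the rest of the tag.
// The e modifier evaluates the replacement as PHP code (PHP older than 7.0 only).
$pattern = '/(<a[^>]*href\s*=\s*[\'"])([^\'"]*)([\'"][^>]*>)/ie';
// chr(92) is the backslash character.
$newText = preg_replace($pattern,"'\\1'.str_replace(chr(92),'/','\\2').'\\3'",$text);
?>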

The regex is basically the same, just that three parts are captured with subpatterns. They are re-assembled because we need the full anchor tag.
The pattern also has the added modifier e which means that in preg_replace the second parameter (which is the replacement specifier) will be interpreted as PHP code. That way we are able to reassemble part1 + str_replace() on part2 where backslash is converted to forward slash + part3.
Be careful with the final result and see if a stripslashes() is necessary.
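
A note for anyone on a current PHP: the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the snippet above only runs on older installations. Below is a minimal sketch of the same three-part idea using preg_replace_callback(). The function name fix_href_slashes is made up for illustration; the pattern is the one from this post minus the e modifier. Since the callback gets the raw captured text, nothing is escaped along the way and no stripslashes() is needed.

```php
<?php
// Hypothetical helper (not from the thread): converts backslashes to
// forward slashes inside the href value of anchor tags.
function fix_href_slashes(string $html): string
{
    // Same three-part pattern as in the post, without the "e" modifier.
    $pattern = '/(<a[^>]*href\s*=\s*[\'"])([^\'"]*)([\'"][^>]*>)/i';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // $m[1] = tag up to the opening quote, $m[2] = URL, $m[3] = rest of the tag
            return $m[1] . str_replace('\\', '/', $m[2]) . $m[3];
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$text = '<A name="whatever" class="myclass" href = "here.com\test.html" id="whatever">';
echo fix_href_slashes($text), "\n";
```

The callback does the re-assembly that the evaluated replacement string did before: part 1, the converted URL, then part 3.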
 
That appeared to work, except it added a backslash before the quotes at the start and end of the URL.
 
Be careful with the final result and see if a stripslashes() is necessary.

The backslashes come from the e modifier, which escapes quotes in the captured parts before the replacement code is evaluated. The only thing to add is stripslashes($newText);
 
The final results look like this:
<a href =\" >

Is it possible to eliminate the backslashes before the quotes? No clue why it puts these in. I could do a string replace, I guess. It would be nicer to do it in the preg_replace() so it's done right. Thanks though!
 
Yes, it is possible to eliminate the escaped double quotes. The function for that is stripslashes().
Code:
$newText = stripslashes(preg_replace($pattern,"'\\1'.str_replace(chr(92),'/','\\2').'\\3'",$text));
 
But what about slashes that are needed? We don't want to strip all slashes.
 
The function stripslashes() removes backslashes not forward slashes. Since we just replaced all backslashes with forward slashes there aren't any that would be wrongly removed.
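
To make the difference concrete, here is a small sketch. The sample string is made up: it mimics the escaped quotes reported above and adds a "C:\docs" link text as an assumed example of a backslash outside the URL that someone might want to keep. It compares stripslashes() with a str_replace() that targets only the \" sequence.

```php
<?php
// Made-up sample: quotes carry a stray backslash, and the link text
// contains a backslash that should survive.
$newText = '<a href=\"here.com/test.html\">C:\docs</a>';

// stripslashes() drops every escaping backslash, including the one in C:\docs.
$all = stripslashes($newText);

// Replacing only the two-character sequence \" leaves other backslashes alone.
$quotesOnly = str_replace('\\"', '"', $newText);

echo $all, "\n";
echo $quotesOnly, "\n";
```

If the only backslashes left in the text are the ones in front of the quotes, both calls give the same result.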

 
Well, there are some instances where I don't want to strip the backslashes. I did a replace on all \" instead of \.

Anyway, this script worked well. I came across a file where not all the links had their slashes reversed. I can post the file when I get to work tomorrow, but the example was something like
<a href = "...e\7..."> and the slash was not switched. I didn't have time to go through all the files, but all appeared OK except this one. I'll post tomorrow; maybe there is a bug in the script?

Thanks
Chris
 
Hi,

I have numerous HTML files; here is an example where a slash wasn't changed.

Code:
<a
href="PIWG's_774_Dockside_Harmonic_Database\774_Dockside_Harmonic_Database_2_Nov_04.xls">PIWG’s
SSN774 Dockside Harmonic Database<span style="mso-spacerun: yes">   </span></a>

The slash between the e and the 7 remained. However, I added other slashes elsewhere in that link and they were switched.

Any ideas?

Here is the file:

Thanks!
 
The slash is not replaced because there is a single quote inside the URL. As far as I am concerned I consider the URL faulty. Quotes, single or double, have no place in filenames.
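
A quick way to see this is to run the earlier pattern through preg_match() and look at what each part captured. This is only an illustrative sketch: the tag is the one posted above put on a single line, and the e modifier is left off because nothing is being replaced.

```php
<?php
// The three-part pattern from earlier in the thread, without the "e" modifier.
$pattern = '/(<a[^>]*href\s*=\s*[\'"])([^\'"]*)([\'"][^>]*>)/i';
$tag = '<a href="PIWG\'s_774_Dockside_Harmonic_Database\774_Dockside_Harmonic_Database_2_Nov_04.xls">';

preg_match($pattern, $tag, $m);

// Part 2 (the "URL") stops at the apostrophe, because [^'"]* excludes both quote types.
echo $m[2], "\n";

// The backslash therefore ends up in part 3, which is never passed to str_replace().
var_dump(strpos($m[3], '\\') !== false);
```

Slashes placed before the apostrophe fall inside part 2, which would explain why the ones added elsewhere in the link were switched.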

I'll try to devise a regex that will process the above URL. Check back.
 
Ahhh, Nice find.

I agree with you completely! This is a wacky design; originally all the file names had spaces etc... Now I'm just cleaning up the links so this whole thing works in Linux.

Don’t worry about creating a regex; now that I know the problem, I can fix it in the very few places where I have seen it and just go from there. It’s no biggie. This whole design will be scrapped soon anyway, as I am going to redo the whole thing using PHP and MySQL.

Thanks!
 
Here's the result. Let me know if you want an explanation.
Code:
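<?php
// Sample tag whose URL contains an apostrophe as well as a backslash.
$text = '<a href="PIWG\'s_774_Dockside_Harmonic_Database\774_Dockside_Harmonic_Database_2_Nov_04.xls" class="whatever">Link text<span style="mso-spacerun: yes">';
// Parts: (1) tag up to href=, (2) double-quoted URL, (3) single-quoted URL, (4) rest of the tag.
$pattern = '/(<a[^>]*href\s*=\s*)(?:"(.*?)"|\'(.*?)\')([^>]*>)/ie';
// Only one of \2 and \3 is non-empty, so both go through str_replace(); the URL is re-quoted with double quotes.
$newText = preg_replace($pattern,"'\\1\"'.str_replace(chr(92),'/','\\2\\3').'\"\\4'",$text);
print(stripslashes($newText));
?>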
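
For PHP 7 and later, where the e modifier no longer exists, the same quote-aware pattern can be used with preg_replace_callback(). This is a sketch under assumptions, not a drop-in from the thread: the function name fix_quoted_href_slashes is hypothetical, and it keeps whichever quote character the tag originally used.

```php
<?php
// Hypothetical helper (not from the thread): handles href values that are
// wrapped in either double or single quotes.
function fix_quoted_href_slashes(string $html): string
{
    // Group 2 = double-quoted URL, group 3 = single-quoted URL.
    $pattern = '/(<a[^>]*href\s*=\s*)(?:"(.*?)"|\'(.*?)\')([^>]*>)/i';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // The character right after part 1 is the quote the tag used.
            $quote = $m[0][strlen($m[1])];
            $url   = ($quote === '"') ? $m[2] : $m[3];
            return $m[1] . $quote . str_replace('\\', '/', $url) . $quote . $m[4];
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$text = '<a href="PIWG\'s_774_Dockside_Harmonic_Database\774_Dockside_Harmonic_Database_2_Nov_04.xls" class="whatever">Link text<span style="mso-spacerun: yes">';
echo fix_quoted_href_slashes($text), "\n";
```

No stripslashes() call is needed here, because the callback works on the raw captured text.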
 