Regex needs another set of ii

Status
Not open for further replies.

perlfan

Technical User
Apr 18, 2002
17
US
I would really appreciate it if you could take a look at the following regexes and let me know if there are any problems, including security issues. Handling URLs has proved to be tricky and I'm concerned I'm too forgiving in what I accept.

They are from a bulletin board script where the user can input a link in their post using either of the following formats:
[link]http://www.website.com[/link] or
[link=http://www.website.com]A Cool Site[/link]

The code that handles the above examples is (note that $post has already been HTML-escaped):

$post =~ s/\[link\]((?:ht|f)tp(?:s?):\/\/(?:\w+)\.(?:\S+?))\[\/link\]/<A HREF="$1" TARGET="_blank">$1<\/A>/gis;

$post =~ s/\[link\=((?:ht|f)tp(?:s?):\/\/(?:\w+)\.(?:\S+?))\](.+?)\[\/link\]/<A HREF="$1" TARGET="_blank">$2<\/A>/gis;
 
The following is an excerpt from RFC 1738, the Uniform Resource Locators (URL) specification:
"...Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'(),", [not including the quotes - ed], and reserved characters used for their reserved purposes may be used unencoded within a URL."
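
One way to be less forgiving than \S+? would be to accept only the characters the excerpt lists. The following is a rough sketch of that idea, not a vetted filter. The names $url_char, $url and expand_links are made up for illustration, and allowing "%" (for escapes) and "#" (for fragments) is an assumption about what the board should accept. Since $post is already HTML-escaped, entities such as &amp; still pass because "&" and ";" are in the reserved set.

```perl
use strict;
use warnings;

# Characters RFC 1738 allows unencoded: alphanumerics, the specials
# $-_.+!*'(), and the reserved set ;/?:@=&
# ASSUMPTION: "%" (escapes) and "#" (fragments) are also allowed here.
my $url_char = qr{[A-Za-z0-9\$\-_.+!*'(),;/?:\@=&%#]};

# Scheme, "://", one host label, a dot, then one or more allowed characters.
my $url = qr{(?:https?|ftp)://[A-Za-z0-9-]+\.$url_char+}i;

# Hypothetical helper: expand both [link] forms in an HTML-escaped post.
sub expand_links {
    my ($post) = @_;
    # [link]URL[/link]
    $post =~ s{\[link\]($url)\[/link\]}{<A HREF="$1" TARGET="_blank">$1</A>}gi;
    # [link=URL]text[/link]
    $post =~ s{\[link=($url)\](.+?)\[/link\]}{<A HREF="$1" TARGET="_blank">$2</A>}gis;
    return $post;
}
```

With this sketch, something like [link]http://www.website.com/a|b[/link] is left untouched because "|" is outside the allowed set, whereas the \S+? version would turn it into a link.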
 