Regular Expression HELL!!!

Status
Not open for further replies.

KristianW

Programmer
Nov 5, 2003
NL
Hi all.

I know this isn't a JS specific question, but I have posted this in the ColdFusion Forum for a while and the only response I received was advice to try this board if I was unsuccessful there. So here I am...

I'm patching a website for a company, and part of this site uses REs to convert hyperlinks so that they are relative to the calling page. However, there seems to be a flaw in the RE used, as it sometimes appends a URL to the end of the current URL. For example, after the REReplace function runs, some links are good, and others look like:


This obviously will not work. So my plan is to create another RE that searches for this occurrence (as well as www or mailto links being appended, as this also happens on occasion). So I have created the following code (all one line):

Code:
<cfset PageContent=REreplacenocase(PageContent,"(href="")([^\?]*)\?([^(http|www|mailto|"")]*)((http|www|mailto)[^""]+)","\1\4","ALL")>

The first section of the RE ((href="")([^\?]*)\?) should match any hyperlink up to the querystring. The ([^(http|www|mailto|"")]*) section is designed to get zero or more of ANY character, unless it is http, mailto, www, or the closing quote for the hyperlink.
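For what it's worth, the bracketed section can be tested on its own. Inside square brackets the parentheses and pipes are not grouping or alternation; they are just more characters in the set, so the class rejects each of the letters h, t, p, w, m, a, i, l and o individually. Here is a minimal sketch in Python rather than ColdFusion (I'm assuming the two engines agree on basic character-class behaviour). The query strings and the helper name `consumed` are made up for illustration, not taken from the site in question.

```python
import re

# The bracket expression from the post, transcribed for Python
# (ColdFusion's doubled "" becomes a single " here).
CHAR_CLASS = re.compile(r'[^(http|www|mailto|")]*', re.IGNORECASE)


def consumed(query: str) -> str:
    """Return the leading part of `query` that the class is able to consume."""
    return CHAR_CLASS.match(query).group(0)


# Hypothetical query strings, chosen only to show the two behaviours.
print(repr(consumed("x=5&y=http://example.com")))     # 'x=5&y='
print(repr(consumed("id=3&ref=http://example.com")))  # ''
```

In the second case the class stops at the very first character, because "i" appears in "mailto". The next section of the RE then needs http, www or mailto right there, does not find it, and the whole match fails. That would explain why only some links get rewritten.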

The next section ((http|www|mailto)[^""]+) then searches for data starting with http, mailto, or www, and then any more info up until the closing ". The problem is that this only seems to work some of the time. For example:

gets resolved to which is great. However, the following link is not recognised (I'm assuming this, as it is not being changed)


From what I can tell, there is nothing wrong with my RE (but then again my eyes are losing focus right now...). Shouldn't the ([^(http|www|mailto|"")]*) section return EVERYTHING other than what is in the square brackets? I.e., it should catch anything, or nothing, and then keep going, as long as it isn't http, www, mailto or the closing "
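One possible way to get the "anything, as long as it isn't http, www or mailto" behaviour is a negative lookahead that is checked before every character. The following is only a sketch in Python, not a drop-in ColdFusion fix: whether the ColdFusion version in use supports `(?!...)` would need checking, and the names `APPENDED_URL` and `strip_prefix` and the sample link are hypothetical. It also tightens `[^\?]*` to `[^?"]*` so the first part cannot run past the closing quote, which is a change from the posted RE.

```python
import re

# (?:(?!http|www|mailto)[^"])* consumes one character at a time, but only
# while the text at that position does not start with one of the three
# prefixes and is not the closing quote.
APPENDED_URL = re.compile(
    r'(href=")([^?"]*)\?((?:(?!http|www|mailto)[^"])*)((?:http|www|mailto)[^"]+)',
    re.IGNORECASE,
)


def strip_prefix(page_content: str) -> str:
    """Keep href=" plus the appended URL (groups 1 and 4), drop the rest."""
    return APPENDED_URL.sub(r"\1\4", page_content)


# Hypothetical link; the real examples from the site are not available.
html = '<a href="page.cfm?id=3&ref=http://example.com/a.html">link</a>'
print(strip_prefix(html))
```

Links with an ordinary query string and no appended URL are left alone, since the last group has nothing to match. Note that, like the posted RE, this treats any www, http or mailto inside a query string as an appended link, so a legitimate value such as a parameter containing "www" would also be rewritten.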

Can anyone help me with this? I just can't understand why it's not happening....

Thanks,
K.
 
I know this isn't a JS specific question, but I have posted this in the ColdFusion Forum for a while and the only response I received was advice to try this board if I was unsuccessful there. So here I am...

Sorry to be redundant, but if you don't get anything here, you can try the Perl board...

--Chessbot
 
I suck royally at reg ex's. That's why I use this site:



*cLFlaVA
----------------------------
Ham and Eggs walks into a bar and asks, "Can I have a beer please?"
The bartender replies, "I'm sorry, we don't serve breakfast."
 
Fantastic site!!! I didn't use regexes much cos it took me ages to figure them out, but now I might use them a bit more :D

If it aint broke, redesign it!
 