Boolean operators in Perl regexp?
I am aware of OR ( | ), but is there a logical NOT in the Perl regex syntax?
The task is the following: to extract the relative links (i.e. the href attribute of the "a" element) from an HTML file, even if the value is not enclosed in quotation marks. This means I don't want to retrieve hyperlinks beginning with /, # or javascript:
I would like to express the following pattern, and capture (or extract) the content of the second parenthesis.
( <a href=" OR <a href=) THEN NOT(/ OR # OR javascript: OR \s OR " ) THEN ( \s OR " )
The best regexp I could come up with is this, but it does not exclude the links beginning with /, # or javascript:
/(<a href="|<a href=)([^"]*?)(\s|")/gi
Does anyone know the answer and could help me? Thanks in advance,
Robert
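One possible way to express the NOT part is a negative lookahead, (?!...), which succeeds only if the given alternatives do not match at the current position. The following is a minimal sketch, not a definitive solution: the sample HTML in $html is made up for illustration, and the pattern assumes double-quoted or unquoted values only (single-quoted href values and other attributes placed before href are not handled).

```perl
use strict;
use warnings;

# Hypothetical sample input, made up for illustration.
my $html = <<'HTML';
<a href="docs/intro.html">Intro</a>
<a href=images/logo.png>Logo</a>
<a href="/absolute/path.html">Absolute</a>
<a href="#top">Top</a>
<a href="javascript:void(0)">Script</a>
<A HREF="../up.html">Up</A>
HTML

my @links;

# <a\s+href="?                 the opening tag, with an optional double quote
# (?!/|\#|javascript:)         negative lookahead: the value must NOT start
#                              with /, # or javascript:
# ([^\s">]+)                   capture up to whitespace, a quote or >
while ($html =~ m{<a\s+href="?(?!/|\#|javascript:)([^\s">]+)}gi) {
    push @links, $1;
}

print "$_\n" for @links;
# prints:
# docs/intro.html
# images/logo.png
# ../up.html
```

Because the captured class excludes the double quote, the optional quote cannot be skipped to sneak past the lookahead. Note that absolute URLs such as http://... are not excluded by this sketch; they could be added as another alternative inside the lookahead. For real-world HTML, a parser module such as HTML::LinkExtor is probably more robust than a regex.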