help with a special regular expression.

Status
Not open for further replies.

weloki

Technical User
Feb 19, 2004
12
US
I am trying to use a regular expression to match all characters that are NOT enclosed within parentheses. I have tried the following, among others, on the file:
[^(\(.*\))]
[^\(+[A-Z]+.+[a-z]\)]
So I am trying to use the "not" operator [^ ] to select all characters that are not within the ( ) characters in the file. Please tell me what I am doing wrong or how I can accomplish this. Thanks.
 
Here's something that should work for you - just substitute anything within parens with a delimiter, then split on the delimiter.

Code:
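my $str = "one(a)two(b)three(c)four(d)five";

# Replace each parenthesized group with a delimiter
$str =~ s/\(.*?\)/|/g;

# Split on the delimiter to get the text outside the parens
my @noparens = split (/\|/,$str);

print join "\n",@noparens;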
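As a side note on the original attempts: square brackets define a character class, which matches a single character, so [^...] cannot express "not inside a group". The sketch below illustrates that, and also shows a variation that splits directly on the parenthesized groups, which avoids trouble if the text already contains the delimiter character. The sample strings and variable names are only illustrative, and it assumes no nested parentheses.

```perl
use strict;
use warnings;

# A bracketed class matches ONE character, so the attempt from the question
# simply matches any single character other than ( ) . and *
my @chars = "a(b)c" =~ /[^(\(.*\))]/g;
print join(",", @chars), "\n";    # prints "a,b,c" ("b" is matched too)

# Splitting directly on the parenthesized groups needs no delimiter character.
my $str    = "one(a)two(b)three(c)four(d)five";
my @pieces = split /\(.*?\)/, $str;
print join("\n", @pieces), "\n";  # prints one, two, three, four, five on separate lines
```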
 
Sure, but what if you have a more complex example, like this:

$str = "one('(a)')two(b)three(c)four(d)five";

Or, since he is reading from a file, how do you deal with the fact that parentheses might cross lines:

one(a
)two(b)

I guess the point is that trying to grab everything except what is between () is like trying to extract text (but not tags) from an HTML file. A simplistic regular expression will only work on the most trivial string structure. I know there are modules on CPAN that deal with this issue specifically. What I don't know (and would like to know) is whether there are modules that allow you to change what the "tag" delimiter is. That is, use one of the HTML parsers, but tell it to treat () as tags instead of <>.
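To make the nesting and multi-line cases concrete, here is a minimal sketch that walks the string one character at a time and tracks parenthesis depth instead of using a single regex. It is not a CPAN module; split_by_parens is a hypothetical helper name, and it assumes the whole file has been read into one string. It does not treat quotes specially, so a quoted paren still counts as a paren.

```perl
use strict;
use warnings;

# Hypothetical helper: separate text outside parentheses from text inside,
# tracking nesting depth so that "(a(b)c)" is treated as one group.
sub split_by_parens {
    my ($text) = @_;
    my ($outside, $inside, $depth) = ('', '', 0);
    for my $ch (split //, $text) {
        if ($ch eq '(') {
            $inside .= $ch if $depth > 0;   # nested "(" belongs to the group
            $depth++;
        }
        elsif ($ch eq ')' && $depth > 0) {
            $depth--;
            $inside .= $ch if $depth > 0;   # nested ")" belongs to the group
        }
        elsif ($depth > 0) {
            $inside .= $ch;                 # newlines inside a group are kept
        }
        else {
            $outside .= $ch;
        }
    }
    return ($outside, $inside);
}

# Nested parens and a group that crosses a line break
my ($out, $in) = split_by_parens("one('(a)')two(b\n)three");
print "outside: $out\n";   # prints "outside: onetwothree"
```

Because newlines are ordinary characters here, a group that opens on one line and closes on the next is handled the same way as any other.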

 
Yes, thank you both for your replies. But perhaps I should have been clearer about my intent here. I want to match the text that is NOT within parentheses so that I can strip it away, leaving only the characters inside the parentheses. How would I do this? I was asking how to merely select the text which is not in parentheses so I could apply my own regexp to strip it away.
 
Could you post an example of the string or the text file that you wish to use the regular expression on?

Sean.
 
This strips away text not enclosed in parens. It's probably not perfect:
Code:
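my $string = "it('(has)not')worked(exactly)as(I had)expected\n(so)there";

# Remove each run of non-paren characters that starts at the beginning of the
# string or right after ")" and ends right before "(" or at the end of the string.
$string =~ s/(?:(?<=^)|(?<=\)))[^)(]*(?=\(|$)//gs;
# $string is now "('(has)not')(exactly)(I had)(so)"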
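Since the goal is to keep only what is inside the parentheses, another option is to capture those pieces directly rather than delete everything else. This is only a sketch under the assumption that the groups are not nested; the sample string and variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "one(a)two(b)three(c)four(d)five";

# In list context, a global match returns every captured group.
my @inside = $text =~ /\(([^()]*)\)/g;
print join(",", @inside), "\n";    # prints "a,b,c,d"

# Keep the parentheses themselves by matching without a capture group.
my $kept = join '', $text =~ /\([^()]*\)/g;
print "$kept\n";                   # prints "(a)(b)(c)(d)"
```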
 
I wish to say thanks to everyone who replied. It turns out that I got help on this since my last post. It appears that there are a number of ways to solve this problem, which I did not know since I am not fluent with regexes at all, as you can see. Thanks again.
 