
Knowing which regular expression matched

Status
Not open for further replies.

MacTommy

Programmer
Feb 26, 2007
116
NL
I know how to solve this in a not so elegant way, but somehow I have the feeling I am missing something obvious.

What I want is this: I've got a number (possibly thousands) of regexps. I want to go through a string, each time finding the first regexp that matches and returning the relevant part of the match, plus a string identifying which regexp matched.

So, say, if regexp 1 matched, I want to return $4 (of that first regexp, that is) and the string 'FIRST'. If the second one matched, I want to return $2 and 'SECOND', and so on.

And preferably, of course, I want to go through the string just once and do the matching just once... :)
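
For contrast, here is a minimal sketch of what a straightforward multi-pass version might look like. This is only a guess at the "not so elegant way" mentioned above; the rule table and the names next_match and all_matches are hypothetical. It tries every regexp at each step and keeps the one that matches earliest, so the cost grows with the number of rules, which is exactly what the single-pass wish is meant to avoid.

```perl
use strict;
use warnings;

# Hypothetical rule table: [ label, pattern, number of the capture group to return ].
my @rules = (
    [ 'FIRST',  qr/(a(b(c(d(eee?ff))))) and some/, 4 ],
    [ 'SECOND', qr/ppp(q+(r+)) and some/,          2 ],
);

# Return (label, captured text, end offset) for the rule that matches
# earliest at or after $offset, or an empty list when nothing matches.
sub next_match {
    my ($text, $offset) = @_;
    my @best;
    for my $rule (@rules) {
        my ($label, $re, $group) = @$rule;
        pos($text) = $offset;                     # start each attempt at the same place
        next unless $text =~ /$re/g;              # scalar-context /g resumes at pos()
        next if @best && $-[0] >= $best[3];       # an earlier rule already starts no later
        my $value = substr($text, $-[$group], $+[$group] - $-[$group]);
        @best = ($label, $value, $+[0], $-[0]);
    }
    return @best[0 .. 2] if @best;
    return;
}

# Walk through the string, collecting "LABEL:value" for every match.
sub all_matches {
    my ($text) = @_;
    my ($offset, @found) = (0);
    while (my ($label, $value, $end) = next_match($text, $offset)) {
        push @found, "$label:$value";
        $offset = $end;
    }
    return @found;
}

print "$_\n" for all_matches("blurb pppqqqqrrr and some weabcdeeff and some ");
# prints:
# SECOND:rrr
# FIRST:deeff
```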
 
Well, I figured it out, and just in case someone has the same problem and manages to stumble upon this page, I'm posting the solution.
It lies in using the (?{ Perl code }) construct in your regular expressions and reading $^N (the text matched by the most recently closed capturing group) in this code.

This next bit works and does what I described above:
Code:
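use strict;

my $sString = "blurb pppqqqqrrr and some weabcdeeff and some ";

# These are filled in by the (?{ ... }) blocks inside the regexp in matcher().
my ($sType, $sValue);
while( (($sType, $sValue) = matcher(\$sString)) && defined $sType ) {
	print "type: $sType, value: $sValue\n";
}

# Find the next match in the referenced string. With /g in scalar context
# each call resumes at pos(), so repeated calls walk through the string once.
sub matcher {
	my ($srString) = @_;

	if( $$srString =~
			/# First
			 (a(b(c(d(eee?ff))(?{$sType = 'First'; $sValue = $^N;}))))\ and\ some|
			 # Second one
			 ppp(q+(r+)(?{$sType = 'Second'; $sValue = $^N;}))\ and\ some
			 /gsx ) { # "
		return($sType, $sValue);
	}

	# No (further) match
	return(undef, undef);
}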
Output:
Code:
type: Second, value: rrr
type: First, value: deeff

Read more about all this in perlre (and in perlvar for $^N).
 
Very cool and I hope I never have to use it. Makes my brain hurt a little just looking at it. ;)

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
You might want to make that a perl FAQ too.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
The reason I want to use it is that I am making a kind of lexicaliser like this. It uses regular expressions instead of just plain strings (like a normal dictionary does), and it returns the lexical category plus the actual match. And it skips any part that doesn't match anything.
Does that clarify anything... [neutral]?!?
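
To make that description concrete, here is a rough sketch of such a lexicaliser. It is not the code from this thread: it swaps the (?{ }) blocks for named captures and the %+ hash (available from Perl 5.10), and the lexicon, its category names and the function name lex are made up for illustration. It assumes the individual patterns contain no named groups of their own.

```perl
use strict;
use warnings;

# Hypothetical lexicon: [ category, pattern ]. Earlier entries win when
# several patterns could match at the same position.
my @lexicon = (
    [ NUMBER => qr/\d+/ ],
    [ WORD   => qr/[A-Za-z]+/ ],
);

# Build one big alternation with a named capture group per category.
my $alternation = join '|', map { "(?<$_->[0]>$_->[1])" } @lexicon;
my $lexer       = qr/$alternation/;

# Return a list of [ category, matched text ] pairs, skipping anything
# in $text that no pattern matches.
sub lex {
    my ($text) = @_;
    my @tokens;
    while ($text =~ /$lexer/g) {
        # %+ only lists the named groups that took part in the match,
        # so its single key is the category of the branch that matched.
        my ($category) = keys %+;
        push @tokens, [ $category, $+{$category} ];
    }
    return @tokens;
}

print "$_->[0]: $_->[1]\n" for lex("abc 42, x9");
# prints:
# WORD: abc
# NUMBER: 42
# WORD: x
# NUMBER: 9
```

The string is still scanned only once, and the category comes straight from the name of the group that matched, so no embedded Perl code is needed.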

And just what would that FAQ be..?!?
Can you tell me something obscure of which the manual says it is expirimential and can be altered/omitted any release now, that I hope I never have to use..?!?
[ponder];-)
 
experimental that is...
Too bad you can't edit these posts once you clicked 'submit' (too soon...).
 
I could be very wrong ... but ...

I can't help thinking there's a neater way of achieving what you're looking to do (of which I'm not entirely sure either ;-))

The main reason I'm thinking this is because you have a lot of parentheses in your regexp which don't appear to be doing anything.

$^N is the text matched by the most recently closed capturing group. For the first part of the regexp, at the point where the code block runs, that'll always be (d(eee?ff)). For the second, it'll always be (r+), right?
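
One way to check which group $^N refers to is to record it from inside the pattern. The sketch below reuses the first alternative from the example earlier in the thread; the helper name last_closed_group and the @seen variable are made up for this illustration.

```perl
use strict;
use warnings;

# A package variable is used here because closures over lexicals inside
# (?{ ... }) were not reliable in older Perls.
our @seen;

# Hypothetical helper: match the first alternative from the thread against
# $text and return what $^N held each time the embedded code block ran.
sub last_closed_group {
    my ($text) = @_;
    local @seen = ();
    $text =~ /(a(b(c(d(eee?ff))(?{ push @seen, $^N }))))/;
    return @seen;
}

print join(', ', last_closed_group('abcdeeff')), "\n";
# prints: deeff
```

When the code block runs, the three outer groups are still open, so the most recently closed group is the one starting at "d".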

Can you expand a little (perhaps with a real example) on what your lexicaliser is supposed to do? Maybe Parse::RecDescent (a very powerful module that's a total pain to learn but well worth it) might be useful for this kind of task.
 
Yes, I am sorry. The parentheses were just for the sake of the example (as I mentioned $2 and $4 in the first post). Apart from that, they don't make much sense in the example, I admit...

I am actually using this together with Parse::Yapp, but Parse::RecDescent might be worth a look as well. Thanks!
 