Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Regex to assign score? 2

Status
Not open for further replies.

domster

Programmer
Oct 23, 2003
30
0
0
GB
Hi, I have an interesting regex puzzle I hope you might be able to help me with.

What I want to do is assign a 'score' to various strings, based on their prefix. So if a file starts with 'UKL' or 'USL', it receives a score of 10; if it starts with 'EUK' or 'EUS' it receives a score of 9, if 'CA2UK' 8...etc. The scores themselves are arbitrary, and there's no pattern as such to the prefixes. What I'd like to know is whether there's a better way to do this than my rather prosaic:

Code:
if ($string =~ m/^USL|UKL/) {
   $score = 10;
} elsif ($string =~ m/^EUK|EUS/) {
   $score = 9;
} elsif (....

etc. Can all prefixes be combined into one regex, like m/^USL|UKL|EUK|EUS|CA2UK.../ and the score returned within that?

I should explain that the score is used to compare two strings, the one with the higher score being chosen over the lower-scoring. Maybe the whole idea of score-assignment is the wrong way to go about this?

Thanks for your time!
 
Throw your prefixes and scores into a hash.

Code:
[gray]#!/usr/bin/perl[/gray]
[url=http://perldoc.perl.org/functions/use.html][black][b]use[/b][/black][/url] [green]strict[/green][red];[/red]

[url=http://perldoc.perl.org/functions/my.html][black][b]my[/b][/black][/url] [blue]%scores[/blue] = [red]([/red] [purple]UKL[/purple] => [fuchsia]10[/fuchsia] , [purple]USL[/purple] => [fuchsia]10[/fuchsia], [purple]EUK[/purple] => [fuchsia]9[/fuchsia], [purple]EUS[/purple] => [fuchsia]5[/fuchsia], [purple]AUX[/purple] => [fuchsia]5[/fuchsia] [red])[/red][red];[/red] [gray][i]#..etc[/i][/gray]
[black][b]my[/b][/black] [blue]$NotFoundScore[/blue] = -[fuchsia]1[/fuchsia][red];[/red]
[black][b]my[/b][/black] [red]([/red][blue]$data[/blue],[blue]$score[/blue][red])[/red][red];[/red]

[olive][b]while[/b][/olive][red]([/red][blue]$data[/blue] = <DATA>[red])[/red] [red]{[/red]
	[url=http://perldoc.perl.org/functions/chomp.html][black][b]chomp[/b][/black][/url][red]([/red][blue]$data[/blue][red])[/red][red];[/red]
	[blue]$score[/blue] = [url=http://perldoc.perl.org/functions/exists.html][black][b]exists[/b][/black][/url] [blue]$scores[/blue][red]{[/red][blue]$data[/blue][red]}[/red] ? [blue]$scores[/blue][red]{[/red][blue]$data[/blue][red]}[/red] : [blue]$NotFoundScore[/blue] [red];[/red]
	[url=http://perldoc.perl.org/functions/print.html][black][b]print[/b][/black][/url] [red]"[/red][purple][blue]$data[/blue] = [blue]$score[/blue][purple][b]\n[/b][/purple][/purple][red]"[/red][red];[/red]
[red]}[/red]

[teal]__DATA__[/teal]
[teal]UKL[/teal]
[teal]USL[/teal]
[teal]EUK[/teal]
[teal]AUX[/teal]
[teal]EUS[/teal]
[teal]USL[/teal]
[teal]UKL[/teal]
[teal]XXX[/teal]
[tt]------------------------------------------------------------
Pragmas (perl 5.8.8) used :
[ul]
[li]strict - Perl pragma to restrict unsafe constructs[/li]
[/ul]
[/tt]
 
my suggestion

Code:
[blue]%test[/blue] = [red]([/red] [red]"[/red][purple]USL[/purple][red]"[/red] => [fuchsia]10[/fuchsia], [red]"[/red][purple]UKL[/purple][red]"[/red] => [fuchsia]9[/fuchsia], [red]"[/red][purple]EUK[/purple][red]"[/red] => [fuchsia]8[/fuchsia], [red]"[/red][purple]EUS[/purple][red]"[/red] => [fuchsia]7[/fuchsia][red])[/red][red];[/red]

[blue]@string[/blue] = [red]qw([/red][purple]USLaabb UKLccdd EUKeeff EUSgghh[/purple][red])[/red][red];[/red]

[olive][b]for[/b][/olive] [blue]$string[/blue] [red]([/red][blue]@string[/blue][red])[/red] [red]{[/red]
	[blue]$score[/blue] = [blue]$test[/blue][red]{[/red][red]([/red][url=http://perldoc.perl.org/functions/substr.html][black][b]substr[/b][/black][/url][red]([/red][blue]$string[/blue],[fuchsia]0[/fuchsia],[fuchsia]3[/fuchsia][red])[/red][red])[/red][red]}[/red][red];[/red]
	[url=http://perldoc.perl.org/functions/print.html][black][b]print[/b][/black][/url] [red]"[/red][purple][blue]$score[/blue][purple][b]\n[/b][/purple][/purple][red]"[/red][red];[/red]
[red]}[/red]
 
Forget mine... I jumped the gun, and was just matching on the string itself.
 
brigmar's has a lot better error catching then mine!! (missing scores and all that)
 
How about a combination of the two!!

Code:
[blue]%test[/blue] = [red]([/red] [red]"[/red][purple]USL[/purple][red]"[/red] => [fuchsia]10[/fuchsia], [red]"[/red][purple]UKL[/purple][red]"[/red] => [fuchsia]9[/fuchsia], [red]"[/red][purple]EUK[/purple][red]"[/red] => [fuchsia]8[/fuchsia], [red]"[/red][purple]EUS[/purple][red]"[/red] => [fuchsia]7[/fuchsia][red])[/red][red];[/red]
[blue]$notfound[/blue] = [red]"[/red][purple]-1[/purple][red]"[/red][red];[/red]

[blue]@string[/blue] = [red]qw([/red][purple]USLaabb UKLccdd EUKeeff EUSgghh test[/purple][red])[/red][red];[/red]

[olive][b]for[/b][/olive] [blue]$string[/blue] [red]([/red][blue]@string[/blue][red])[/red] [red]{[/red]
	[blue]$score[/blue] = [url=http://perldoc.perl.org/functions/exists.html][black][b]exists[/b][/black][/url] [blue]$test[/blue][red]{[/red][red]([/red][url=http://perldoc.perl.org/functions/substr.html][black][b]substr[/b][/black][/url][red]([/red][blue]$string[/blue],[fuchsia]0[/fuchsia],[fuchsia]3[/fuchsia][red])[/red][red])[/red][red]}[/red] ? [blue]$test[/blue][red]{[/red][red]([/red][black][b]substr[/b][/black][red]([/red][blue]$string[/blue],[fuchsia]0[/fuchsia],[fuchsia]3[/fuchsia][red])[/red][red])[/red][red]}[/red] : [blue]$notfound[/blue][red];[/red]
	[url=http://perldoc.perl.org/functions/print.html][black][b]print[/b][/black][/url] [red]"[/red][purple][blue]$score[/blue][purple][b]\n[/b][/purple][/purple][red]"[/red][red];[/red]
[red]}[/red]
 
Great stuff, guys, thanks! Just one problem with brigmar's - it's not necessarily the first three characters!
 
Gotcha.
Try this instead; it uses an interpolated regex:
Code:
[blue]%scores[/blue] = [red]([/red] [red]"[/red][purple]USL[/purple][red]"[/red] => [fuchsia]10[/fuchsia], [red]"[/red][purple]UKL[/purple][red]"[/red] => [fuchsia]9[/fuchsia], [red]"[/red][purple]EUK[/purple][red]"[/red] => [fuchsia]8[/fuchsia], [red]"[/red][purple]EUS[/purple][red]"[/red] => [fuchsia]7[/fuchsia], [red]"[/red][purple]CA2UK[/purple][red]"[/red] => [fuchsia]6[/fuchsia] [red])[/red][red];[/red]
[blue]$notfound[/blue] = [red]"[/red][purple]-1[/purple][red]"[/red][red];[/red]
[blue]@string[/blue] = [red]qw([/red][purple]USLaabb UKLccdd EUKeeff EUSgghh CA2UKiijj test[/purple][red])[/red][red];[/red]

[olive][b]for[/b][/olive] [blue]$string[/blue] [red]([/red] [blue]@string[/blue] [red])[/red] [red]{[/red]
	[blue]$found[/blue] = [blue]$notfound[/blue][red];[/red]
	[olive][b]for[/b][/olive] [blue]$key[/blue] [red]([/red] [url=http://perldoc.perl.org/functions/keys.html][black][b]keys[/b][/black][/url] [blue]%scores[/blue] [red])[/red] [red]{[/red]
		[olive][b]if[/b][/olive] [red]([/red] [blue]$string[/blue] =~ [red]/[/red][purple]^[blue]$key[/blue][/purple][red]/[/red] [red])[/red] [red]{[/red]
			[blue]$found[/blue] = [blue]$key[/blue][red];[/red]
			[olive][b]last[/b][/olive][red];[/red]
		[red]}[/red]
	[red]}[/red]
	[blue]$match[/blue] = [blue]$scores[/blue][red]{[/red][blue]$found[/blue][red]}[/red][red];[/red]
	[url=http://perldoc.perl.org/functions/print.html][black][b]print[/b][/black][/url] [red]"[/red][purple][blue]$string[/blue] = [blue]$match[/blue][purple][b]\n[/b][/purple][/purple][red]"[/red][red];[/red]
[red]}[/red]
It can be optimized, but I left it this way for legibility.
 
Not enough tea. Try again <slaps self>.
Code:
[blue]%scores[/blue] = [red]([/red] [red]"[/red][purple]USL[/purple][red]"[/red] => [fuchsia]10[/fuchsia], [red]"[/red][purple]UKL[/purple][red]"[/red] => [fuchsia]9[/fuchsia], [red]"[/red][purple]EUK[/purple][red]"[/red] => [fuchsia]8[/fuchsia], [red]"[/red][purple]EUS[/purple][red]"[/red] => [fuchsia]7[/fuchsia], [red]"[/red][purple]CA2UK[/purple][red]"[/red] => [fuchsia]6[/fuchsia] [red])[/red][red];[/red]
[blue]$notfound[/blue] = [red]"[/red][purple]-1[/purple][red]"[/red][red];[/red]
[blue]@string[/blue] = [red]qw([/red][purple]USLaabb UKLccdd EUKeeff EUSgghh CA2UKiijj test[/purple][red])[/red][red];[/red]

[olive][b]for[/b][/olive] [blue]$string[/blue] [red]([/red] [blue]@string[/blue] [red])[/red] [red]{[/red]
	[blue]$match[/blue] = [blue]$notfound[/blue][red];[/red]
	[olive][b]for[/b][/olive] [blue]$key[/blue] [red]([/red] [url=http://perldoc.perl.org/functions/keys.html][black][b]keys[/b][/black][/url] [blue]%scores[/blue] [red])[/red] [red]{[/red]
		[olive][b]if[/b][/olive] [red]([/red] [blue]$string[/blue] =~ [red]/[/red][purple]^[blue]$key[/blue][/purple][red]/[/red] [red])[/red] [red]{[/red]
			[blue]$match[/blue] = [blue]$scores[/blue][red]{[/red][blue]$key[/blue][red]}[/red][red];[/red]
			[olive][b]last[/b][/olive][red];[/red]
		[red]}[/red]
	[red]}[/red]
	[url=http://perldoc.perl.org/functions/print.html][black][b]print[/b][/black][/url] [red]"[/red][purple][blue]$string[/blue] = [blue]$match[/blue][purple][b]\n[/b][/purple][/purple][red]"[/red][red];[/red]
[red]}[/red]
 
Why make these guys work in the dark? Post some sample data and explain better what it is you are trying to do.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Kevin,
I'm just posting before thinking. The OP's original question did contain this information, just my addled brain didn't digest it all.
 
Slight variation using an optimized single regular expression

Code:
[url=http://perldoc.perl.org/functions/use.html][black][b]use[/b][/black][/url] [green]strict[/green][red];[/red]

[url=http://perldoc.perl.org/functions/my.html][black][b]my[/b][/black][/url] [blue]%scores[/blue] = [red]([/red]
	[purple]USL[/purple]		=> [fuchsia]10[/fuchsia],
	[purple]UKL[/purple]		=> [fuchsia]9[/fuchsia],
	[purple]EUK[/purple]		=> [fuchsia]8[/fuchsia],
	[purple]EUS[/purple]		=> [fuchsia]7[/fuchsia],
	[purple]CA2UK[/purple]	=> [fuchsia]6[/fuchsia],
[red])[/red][red];[/red]
[black][b]my[/b][/black] [blue]$scorekeys[/blue] = [url=http://perldoc.perl.org/functions/join.html][black][b]join[/b][/black][/url] [red]'[/red][purple]|[/purple][red]'[/red], [url=http://perldoc.perl.org/functions/keys.html][black][b]keys[/b][/black][/url] [blue]%scores[/blue][red];[/red]

[black][b]my[/b][/black] [blue]@strings[/blue] = [red]qw([/red][purple]USLaabb UKLccdd EUKeeff EUSgghh CA2UKiijj test[/purple][red])[/red][red];[/red]

[olive][b]for[/b][/olive] [black][b]my[/b][/black] [blue]$string[/blue] [red]([/red] [blue]@strings[/blue] [red])[/red] [red]{[/red]
	[olive][b]if[/b][/olive] [red]([/red][blue]$string[/blue] =~ [red]m/[/red][purple]^([blue]$scorekeys[/blue])[/purple][red]/[/red][red]o[/red][red])[/red] [red]{[/red]
		[url=http://perldoc.perl.org/functions/print.html][black][b]print[/b][/black][/url] [red]"[/red][purple][blue]$string[/blue] => [blue]$scores[/blue]{[blue]$1[/blue]}[purple][b]\n[/b][/purple][/purple][red]"[/red][red];[/red]
	[red]}[/red]
[red]}[/red]

[fuchsia]1[/fuchsia][red];[/red]

[teal]__END__[/teal]

Output:

Code:
> perl scratch.pl
USLaabb => 10
UKLccdd => 9
EUKeeff => 8
EUSgghh => 7
CA2UKiijj => 6

- Miller
 
I've just got a funny feeling the OP is going to say the real data is not so simple. But maybe not.



------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
OP here - no, the data really is that simple, so I'm going to go off and try some of these great suggestions. Thanks all!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top