
Reg Expression assistance?

Status
Not open for further replies.

Guest_imported

New member
Jan 1, 1970
0
I want to print just the page name, but can't get this to work right.
Some of the pages are index.cfm or index.html or webpage.htm or home.cfm, etc.
I just want to eliminate the .cfm or .html or .htm at the end when I print.


I can do it with the split command, but
I want to do this with regular expressions.

[tt]
use strict;
my $var="index.html";

$var =~ /(*.+).html?cfm?htm/;
chomp;

print "$1\n";[/tt]
 
For a start:

/(*.+).html?cfm?htm/ would probably be better written as:

/^(\w+)\.(html|cfm|htm)/ # the first * in your RE has nothing before it to repeat, and notice the . is escaped so that Perl treats it as a literal . and not just any character.

Secondly, split will work just as well and is a lot simpler anyway:

my ($word) = split(/\./, $var);
print $word . "\n";

Why give yourself an extra headache?
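
For reference, here is a minimal sketch of the anchored-regex approach applied to the file names from the original question. The helper name page_name is hypothetical, and the sketch assumes the only extensions of interest are .cfm, .htm and .html; it also adds a trailing $ so that the extension must sit at the very end of the name.

```perl
use strict;
use warnings;

# Hypothetical helper: return the page name without a known extension.
sub page_name {
    my ($file) = @_;
    # ^ and $ anchor the whole string, \. matches a literal dot,
    # and html? covers both "htm" and "html".
    if ($file =~ /^(\w+)\.(?:html?|cfm)$/) {
        return $1;
    }
    return $file;    # no known extension: return the name unchanged
}

print page_name($_), "\n" for qw(index.cfm index.html webpage.htm home.cfm);
```

Checking the result of the match before using $1 matters here: if the match fails, $1 is not set by it.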
 
This should work.

$var =~ /(.+)(\.)(.+)/;
print "$1\n";

MattMcGinnis
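
A short sketch of how that pattern behaves, in case it helps: because the first (.+) is greedy, the literal dot ends up matching the last dot in the name. The helper name split_on_last_dot and the sample names archive.tar.gz and README are made up for illustration, and the pattern is anchored here for clarity.

```perl
use strict;
use warnings;

# Hypothetical helper: split a file name at its last dot.
sub split_on_last_dot {
    my ($file) = @_;
    # The greedy (.+) grabs as much as it can, so \. matches the LAST dot.
    if ($file =~ /^(.+)\.(.+)$/) {
        return ($1, $2);
    }
    return ($file, '');    # no dot found: whole name, empty extension
}

for my $file (qw(index.html archive.tar.gz README)) {
    my ($base, $ext) = split_on_last_dot($file);
    print "$file -> base '$base', ext '$ext'\n";
}
```

For a name with no dot at all the match fails, so the sketch tests the match result instead of printing $1 unconditionally.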
 
$var=~s~\..*~~;

This would be the most efficient so far, since it doesn't use capturing parens.

--jim
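
One thing worth knowing about this substitution is that it cuts at the first dot, so a name with several dots loses everything after the first one. The sketch below contrasts it with a variant that removes only the last extension; both helper names and the sample name archive.tar.gz are hypothetical, chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: drop everything from the FIRST dot onward.
sub strip_from_first_dot {
    my ($file) = @_;
    (my $copy = $file) =~ s/\..*//;
    return $copy;
}

# Hypothetical helper: drop only the LAST dot and what follows it.
sub strip_last_ext {
    my ($file) = @_;
    (my $copy = $file) =~ s/\.[^.]*$//;
    return $copy;
}

for my $file (qw(index.html archive.tar.gz)) {
    print "$file: first-dot '", strip_from_first_dot($file),
          "', last-ext '", strip_last_ext($file), "'\n";
}
```

For the single-dot page names in the original question the two behave the same.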
 

I have never seen this before: $var=~s~\..*~~;

Please explain what the ~s~\..*~~ is doing.
Is this some sort of advanced regular expression? If so, what do the ~ and \..
represent?



use strict;
my $var = "myfile.html";

$var=~s~\..*~~;

print "$var\n";
 
Hi Xyrush,

The regex itself is not really impressive, but the common convention of using forward-slashes (/) as the regex delimiters was not followed in my example. The regex also could have been written like this:
Code:
$var =~ s/\..*//;
Notice that the first tilde (~) in the statement is actually part of the binding operator, and not the regex.
The matching part of this regex:
Code:
\..*
Says:
Match the first occurrence of a period, and everything after it. For Concard's purposes, I thought this was the most straightforward solution to his problem.

As for why I used tildes to delimit the regex: because it's more fun to look at. I wouldn't recommend it in professional code though, because it's kinda confusing.

--jim
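
To illustrate the point about delimiters, here is a small sketch that runs the same substitution with three different delimiter styles. The loop and the style labels are only scaffolding for the demonstration; the input string myfile.html comes from the question above.

```perl
use strict;
use warnings;

my @results;
for my $style (qw(slash tilde brace)) {
    my $var = "myfile.html";
    # All three statements are the same substitution; only the
    # delimiter characters around the pattern and replacement differ.
    if    ($style eq 'slash') { $var =~ s/\..*//;  }
    elsif ($style eq 'tilde') { $var =~ s~\..*~~;  }
    else                      { $var =~ s{\..*}{}; }
    push @results, $var;
    print "$style: $var\n";
}
```

In each case the =~ is the binding operator, and the s with its chosen delimiters is the substitution itself.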
 