Simple Regular Expressions - split 2

Status
Not open for further replies.

topub

Programmer
Jun 6, 2006
42
US
Hi
Very silly question... but I can't seem to get it working. It works for other examples, just not for the one I want.

Say, three variables:
Code:
$file_name1 = "test.txt.pgp";
$file_name2 = "test.txt";
$file_name3 = "test";

I want to timestamp all three files, to look like:
Code:
$file_name1 = "test.txt_mmddyyyy_mmss.pgp";
$file_name2 = "test_mmddyyyy_mmss.txt";
$file_name3 = "test_mmddyyyy_mmss";

I have a script that gives me "mmddyyyy_mmss", but I can't seem to break the file name variable at "." [period] starting from the end of the string.
I tried:
Code:
@fileXt1 = split /\.$/, $file_name1, 2;
@fileXt2 = split /\.$/, $file_name2, 2;
@fileXt3 = split /\.$/, $file_name3, 2;

So once I break those file_name variables, I could join them back with timestamp.

any ideas....
 
split /\.$/, $file_name1, 2;

The above splits only on a . at the very end of the $file_name1 string. None of your file names end in a dot, so the pattern never matches and nothing gets split :)

try it without the $, like this

split /\./, $file_name1, 2;

On that note, if this format is fixed, the faster thing would be to just substitute

$file_name1 =~ s/\.(.*)\./.${1}_mmddyyyy_mmss./;
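
For anyone following along, here is a small sketch of what the two split calls discussed above return for one of the sample names. The variable names are only illustrative. Note that with a limit of 2, the un-anchored pattern splits at the first dot rather than the last.

```perl
use strict;
use warnings;

my $name = "test.txt.pgp";

# /\.$/ matches only a dot that is the last character of the string,
# so nothing is split and the whole name comes back as one element.
my @end_anchored = split /\.$/, $name, 2;

# /\./ with a limit of 2 splits at the FIRST dot, not the last one.
my @first_dot = split /\./, $name, 2;

print join( " | ", @end_anchored ), "\n";    # test.txt.pgp
print join( " | ", @first_dot ),    "\n";    # test | txt.pgp
```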
 
your split is looking for a dot at the end of the string, which would only happen if you had a file say
Code:
test.
which is a long shot to say the least ;-)

Code:
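my $newfile = timestamp($file_name1);

sub timestamp {
  my ($filename) = @_;
  my @file = split /\./, $filename;
  # no dot in the name: nothing to pop, just append ($timestamp is set elsewhere)
  return "$filename.$timestamp" if @file < 2;
  my $ext = pop(@file);   # the last piece is the extension
  my $newfile = join(".", @file) . "." . $timestamp . "." . $ext;
  return $newfile;
}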
Haven't tested it though ;-)

Paul
------------------------------------
Spend an hour a week on CPAN, helps cure all known programming ailments ;-)
 
thanks a lot guys...
I'll use both your input into my code...

But just a thought:
Looking at your replies, can I assume that in Perl we cannot split a string on a given delimiter starting from the end and break it into 2 pieces?

thanks again,
UB
 
If you split "test." on ".", you'll get a one-element array containing "test". I think you're looking to start the split at the end; a regex would be best for this.

Paul
 
thanks Paul (as always..)

input: test.txt.pgp
output: test.txt & pgp

I'll try regex...see if I can figure one out
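
One possible regex for this, as a rough sketch rather than the only way: a greedy (.*) runs as far right as it can, so the dot that follows it is the last dot in the string. The helper name split_at_last_dot is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: split a name at its LAST dot into (base, extension).
sub split_at_last_dot {
    my ($name) = @_;

    # Greedy (.*) takes as much as possible, so \. lands on the final dot
    # and ([^.]*) picks up whatever follows it.
    if ( $name =~ /^(.*)\.([^.]*)$/ ) {
        return ( $1, $2 );
    }
    return ($name);    # no dot at all: return the name as a single piece
}

my ( $base, $ext ) = split_at_last_dot("test.txt.pgp");
print "$base & $ext\n";    # test.txt & pgp
```

A name without any dot, such as "test", comes back as a single element, so the caller can check the number of pieces before joining them with the timestamp.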
 
update:
Paul, with slight modification, your pop and join logic works great.

thanks again,
UB
 
TMTOWTDI:
Code:
my $file_name1 = "test.txt.pgp";
my $file_name2 = "test.txt";
my $file_name3 = "test";

my $timestamp = '_mmddyyyy_mmss';

for ( $file_name1, $file_name2, $file_name3 ) {
   s/(?=\.[^.]+$)|$/$timestamp/;
}

print "$file_name1\n$file_name2\n$file_name3\n";
 
What are the special variables in this line:

s/(?=\.[^.]+$)|$/$timestamp/;

Thanks
 
thanks ishnid... that's hardcore Perl, I guess....
but it might take a while to figure out what this does...

s/(?=\.[^.]+$)|$/$timestamp/;


thanks everyone,
 
I'll leave the voodoo to ishnid ;-), I'm looking forward to that one myself ...

But a word if I may
Code:
my $timestamp = '_mmddyyyy_mmss';
might be better written year first, so that the names sort chronologically and related files group together, and with the hours included
Code:
my $timestamp = '_yyyymmdd_hhmmss';
Also I'd consider putting the timestamp at the start of the filename, but that's just me ... then again that information would be available by calling stat on the file (assuming it hasn't been touched)
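
The script that produces the stamp isn't shown in this thread, so the following is only a sketch of one way to build the year-first format suggested above. It assumes the core POSIX module's strftime and local time; the variable names are illustrative, and the substitution is the one posted earlier in the thread.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Year-first stamp with hours, in the form _yyyymmdd_hhmmss (local time).
my $timestamp = strftime( "_%Y%m%d_%H%M%S", localtime );

my $file_name = "test.txt.pgp";

# Copy the name, then insert the stamp before the last extension.
( my $stamped = $file_name ) =~ s/(?=\.[^.]+$)|$/$timestamp/;

print "$stamped\n";
```

Because the year comes first, a plain alphabetical listing of the stamped files is also a chronological one.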

Paul
 
Oops. I'd forgotten about this thread.

Right, the regexp.

This bit's ok, right: \.[^.]+$
A dot, followed by one or more characters that aren't dots, followed by the end of the string. In other words, this pattern identifies the file extension.

So far, so good?

Next, we come to the (?=...)
That's known as a lookahead assertion (you can find more info on that in perlretut and perlre). Rather than matching what's in the parentheses, it tries to match any position in the string that is followed by that pattern. The difference between this and leaving out the lookahead is that you don't have to mess around with $1, $2 variables in the replacement, since you're not actually matching the file extension, just the position immediately before it. Because this is a "zero-width assertion", it doesn't match any characters and is effectively matching the point between two characters (in this case the point between the dot on the file extension and the previous character). If this position is found, the timestamp is inserted there.

Finally, there's the |$ bit. That's for the third example where there's no file extension. It says "or the end of the string". So where there is a file extension, the first part of the regexp will find it and where there isn't, the timestamp is inserted at the end of the string.

Hope that helps clear it up a bit. Lookahead and lookbehind assertions are tricky to get your head around initially but they're very, very useful when you do.
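
To make the difference concrete, here is a sketch that puts the lookahead substitution next to a capture-based one. The sub names are invented for this example, and the capture version is just one assumed way to write the alternative, not something posted in the thread.

```perl
use strict;
use warnings;

# Lookahead version from the thread: matches the empty position just before
# the last extension (or the end of the string) and inserts the stamp there.
sub stamp_with_lookahead {
    my ( $name, $stamp ) = @_;
    $name =~ s/(?=\.[^.]+$)|$/$stamp/;
    return $name;
}

# Capture version for comparison: consumes the extension, so the replacement
# has to put it back by hand (and cope with $1 being undefined).
sub stamp_with_capture {
    my ( $name, $stamp ) = @_;
    $name =~ s{(\.[^.]+)?$}{ $stamp . ( defined $1 ? $1 : '' ) }e;
    return $name;
}

for my $name ( "test.txt.pgp", "test.txt", "test" ) {
    print stamp_with_lookahead( $name, '_mmddyyyy_mmss' ), "\n";
    print stamp_with_capture( $name, '_mmddyyyy_mmss' ), "\n";
}
```

Both subs give the same result for the three sample names; the lookahead version simply has nothing to restore in the replacement.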
 
amazing and thanks ishnid. Looks like assertions ARE useful.
thanks everyone for sharing your thoughts....

enjoy coding Perl,
 