Regexp help -- matching filenames with different versions

Status
Not open for further replies.

RottPaws

Programmer
Mar 1, 2002
478
US
I'm trying to loop through a group of files for processing, but some files may have different versions in the batch and I only need to process the latest version of each one.
I've got an array of filenames (reverse sort):
@stl_files = (
'STL-20120925.01.006_IPSP.CSV',
'STL-20120925.01.006.CSV',
'STL-20120925.01.001_IPSP.CSV',
'STL-20120925.01.001.CSV',
'STL-20120914.01.006.CSV',
'STL-20120914.01.001.CSV'
);

The '001' and '006' in the examples are the version numbers. As I loop through these files, I need to match each one against the files I've already processed.

So when the 3rd file comes up, it needs to recognize that it's an earlier version of the 1st file, the 4th is an earlier version of the 2nd, and the 6th is an earlier version of the 5th.

I'm thinking that as each file is looked at, I'll add it to an array of viewed files. Then there should be some way to use a regexp to compare the current filename against the viewed filenames, but I'm at a loss as to how to do it efficiently. Can someone point me in the right direction?

Thanks,

_________
RottPaws

If you don't report a problem, I probably won't fix it.
 
Try this:

Perl:
#!/usr/bin/perl -w
use strict;

my %seen;

my @stl_files = (
        'STL-20120925.01.006_IPSP.CSV',
        'STL-20120925.01.006.CSV',
        'STL-20120925.01.001_IPSP.CSV',
        'STL-20120925.01.001.CSV',
        'STL-20120914.01.006.CSV',
        'STL-20120914.01.001.CSV'
);

foreach my $f (@stl_files) {
        # copy the filename
        my $s = $f;
        # strip out the version number (dd.ddd following an 8-digit date)
        $s =~ s/([0-9]{8}\.)[0-9][0-9]\.[0-9]{3}/$1/;
        if (defined $seen{$s}) {
                print "already processed a version of $f\n";
        } else {
                print "processing $f\n";
                #
                # insert processing code here
                #
                $seen{$s}=1;
        }
}
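
One thing to watch: the loop above only picks the latest version because the list is already reverse sorted, so the newest file for each name comes up first. If that can't be guaranteed, a minimal sketch along the same lines would sort inside a helper before filtering. The names `version_key` and `newest_only` are made up for illustration, and the plain string sort assumes the version numbers are always zero-padded to the same width.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: removes the "dd.ddd" part that follows the 8-digit
# date, so every version of a file maps to the same key.
sub version_key {
    my ($name) = @_;
    (my $key = $name) =~ s/([0-9]{8}\.)[0-9]{2}\.[0-9]{3}/$1/;
    return $key;
}

# Hypothetical helper: sorts newest-first (descending string order), then
# keeps only the first file seen for each key.
sub newest_only {
    my @files = @_;
    my %seen;
    return grep { !$seen{ version_key($_) }++ } sort { $b cmp $a } @files;
}

# Same filenames as in the question, deliberately shuffled.
my @unsorted = (
    'STL-20120914.01.001.CSV',
    'STL-20120925.01.001_IPSP.CSV',
    'STL-20120925.01.006.CSV',
    'STL-20120914.01.006.CSV',
    'STL-20120925.01.006_IPSP.CSV',
    'STL-20120925.01.001.CSV',
);

print "$_\n" for newest_only(@unsorted);
```

For the shuffled list this prints the three '006' files: STL-20120925.01.006_IPSP.CSV, STL-20120925.01.006.CSV and STL-20120914.01.006.CSV.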

Annihilannic
 
I think the most efficient way is to build a hash where the key is the filename with the revision number removed (or replaced by some constant) and the value is the highest revision number seen so far. Note also that if all the filenames have the same structure, it is more efficient to use string manipulation instead of regexps. At the end you put the revision numbers back and transfer the restored filenames into an array.
Code:
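use strict;
use warnings;

my @stl_files=(
  'STL-20120925.01.006_IPSP.CSV',
  'STL-20120925.01.006.CSV',
  'STL-20120925.01.001_IPSP.CSV',
  'STL-20120925.01.001.CSV',
  'STL-20120914.01.006.CSV',
  'STL-20120914.01.001.CSV'
);
my(%seen,@recent_files);
for my $file (@stl_files){
  my $key=$file;                   # work on a copy so @stl_files is left untouched
  my $rev=substr($key,16,3,'');    # cut the 3-digit revision out at offset 16
  # keep the highest revision seen for this key
  $seen{$key}=$rev if !exists $seen{$key} || $rev gt $seen{$key};
}
for my $key (keys %seen){
  my $file=$key;
  substr($file,16,0,$seen{$key});  # put the winning revision back in
  push @recent_files,$file;
}
print "@recent_files\n";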
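
If the filenames might not all have the revision at the same fixed offset, the same hash idea can be sketched with a regexp capture instead of substr. This is only an illustration: the helper name `latest_by_version` and the assumed layout (8-digit date, two digits, then a 3-digit revision) are guesses based on the sample names, and names that don't fit the pattern are simply skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the highest-revision file for each
# (prefix, suffix) pair, regardless of the order of the input.
sub latest_by_version {
    my @files = @_;
    my %best;    # key => [ revision, filename ]
    for my $file (@files) {
        # Assumed layout: <prefix ending in 8-digit date>.<2 digits>.<3-digit revision><suffix>
        my ($prefix, $rev, $suffix) =
            $file =~ /^(.*[0-9]{8}\.[0-9]{2}\.)([0-9]{3})(.*)$/
            or next;    # skip names that do not fit the pattern
        my $key = $prefix . $suffix;
        # Compare revisions numerically and keep the larger one.
        $best{$key} = [ $rev, $file ]
            if !exists $best{$key} || $rev > $best{$key}[0];
    }
    my @latest = map { $_->[1] } values %best;
    return sort @latest;
}

my @stl_files = (
    'STL-20120925.01.006_IPSP.CSV',
    'STL-20120925.01.006.CSV',
    'STL-20120925.01.001_IPSP.CSV',
    'STL-20120925.01.001.CSV',
    'STL-20120914.01.006.CSV',
    'STL-20120914.01.001.CSV',
);

print "$_\n" for latest_by_version(@stl_files);
```

Because every candidate is compared against the stored revision, this version does not care whether the input list is sorted.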

Franco
 
Excellent! Thank you so much!!!

_________
RottPaws

 