Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Sorting/comparing like filenames 3

Status
Not open for further replies.

Roy777

Technical User
Sep 3, 2006
3
CA
Help (please)...
I am a very green newbie with perl but here is what I am trying to do.
I have a directory with many many files of the format;
xxxxxxxx-01.ext
xxxxxxxx-03.ext
yyyyyyyy-23.ext
yyyyyyyy-24.ext
aaaaaaaa-01.ext
aaaaaaaa-09.ext
....
What I would like to do is have a script that loops through the filenames keeping only the higher(newer)
versions of each file.
based on the examples above , only the xxxxxxx-03 and the yyyyyy-24 and the aaaaaaa-09 files would remain after running the script.

( yes I have been RTFM'ing and surfing and perldoc'ing , just too green yet to put things together , so please be kind)
 
Code:
my $files = {}; # a hashref to hold the filenames in

opendir (DIR, "./directory");
foreach my $file (readdir(DIR)) {
   next if $file =~ /^\./; # skip . and ..

   # get the "aaaaaaaaa" part from "aaaaaaaa-09.ext" or w/e
   my ($like) = $file =~ /^(.*?)\-\d+\.ext/;

   # if this is the first file we've seen like this...
   if (not exists $files->{$like}) {
      # create its arrayref
      $files->{$like} = [ $file ];
   }
   else {
      # otherwise, add this to the arrayref
      push (@{$files->{$like}}, $file);
   }
}
closedir (DIR);

And then $files would be a hashref containing all the files by their "like" name, and each one of those would be the actual filename it found.

So $files->{aaaaaaaa}->[0] might be "aaaaaaa-01.ext" and $files->{aaaaaaa}->[1] might be "aaaaaaa-02.ext", and they should already be sorted for you but if not you can just sort them yourself.

If you want to access the arrays or anything, just put an @{} around the location of the array, i.e.

Code:
foreach my $aaaaaa (@{ $files->{aaaaaa} }) {
   print "$aaaaaa\n";

   # so this might print:
   # aaaaaa-01.ext
   # aaaaaa-02.ext
   # aaaaaa-03.ext
}

# or to see every file "like" name
foreach my $like (keys %{ $files }) {
   print "$files->{$like}\n";

   # so this loop might print:
   # aaaaaaaa
   # xxxxxxxx
   # yyyyyyyy

   # etc
}

It might be a bit of a complex example for a beginner, but it's one of the more efficient ways to do this.
 
Thanks Kirsle for giving me a foundation to start from.
I will test your examples and hopefully build on this.
My goal is to remove (unlink) the old versions of each file from the directory.
Back to my "perl for dummies" books....
 
In the days of VMS, purge/keep=2 was the king ;-)

If it's just the newest version of the file you want, you could capture the digits, and if it's greater assign the digit value to the hashref.

it's always going to be a two pass operation to unlink the files you don't need, because you have to see what's there first, based on Kirsle's fine example
Code:
my $files = {}; # a hashref to hold the filenames in

while (<DATA>) {
   next if $_ =~ /^\./; # skip . and ..

   # get the "aaaaaaaaa" and "09" part from "aaaaaaaa-09.ext" or w/e
   my ($like, $num) = ($_ =~ /^(.*?)\-(\d+)\.ext/);

   # if this is the first file we've seen like this...
   if (not exists $files->{$like}) {
      # create its hash element
      $files->{$like} = $num;
   }
   else {
      # if the new number is greater replace it
      if ($files->{$like} < $num) {
          $files->{$like} = $num;
      }
   }
}

foreach my $key (keys %{$files}) {
  push (@list_of_files_to_keep, "$key-$files->{$key}.ext");
}

__DATA__
.
..
xxxxxxxx-01.ext
xxxxxxxx-03.ext
yyyyyyyy-23.ext
yyyyyyyy-24.ext
aaaaaaaa-01.ext
aaaaaaaa-09.ext

Paul
------------------------------------
Spend an hour a week on CPAN, helps cure all known programming ailments ;-)
 
Whaddya mean in the days...I wouldn't be surprised to see VMS make a comeback. Throw a decent GUI at it, it would rival Linux. :)

I just retired an Alpha Server 400 (192 MB RAM woo-hoo) two months ago. The vendor decided that Windows was the better choice for their upgrade. Well, needless to say...we've rebooted the new boxes (yes that's box-es) about 1-2 times per week. We haven't increased the number of users any either. And it's slower.

I don't recall having to reboot the Alpha but maybe 2 or 3 times (unscheduled) in it's 10+ year stay.

IMHO, VMS OS/400, AIX, and Linux (not familiar with HPUX or some others) are the only OSs mission critical apps should run on. If you have a problem, fix the app...the OS is fine.

(Stepping down from the soapbox).

BTW. I have a similar script, but it's not as pretty as the ones listed. Thanks.

Mark
 
If all you need is the newsest file, you can remove the old files with a single pass over the file names.
Code:
my %files;
chdir 'c:/test1/';

for (<*.ext>) {
    (my($name, $version, $extension)) = /^(.+)-(\d+)\.(.+)/;
    if (defined $files{$name}) {
        if ($version < $files{$name}) {
            unlink $_;
        } else {
            unlink "$name-$files{$name}.$extension";
            $files{$name} = $version;
        }
    } else {
        $files{$name} = $version;
    }
}
 
Guys
I just wanted to say thanks.
As a very green perl person, your examples have helped me very much. I am still very green, but a convert for sure.
This forum and you help is a godsend, I promise not to abuse the support.
Again thanks a bunch... a round of stars for you all...
 
IMHO, VMS OS/400, AIX, and Linux (not familiar with HPUX or some others) are the only OSs mission critical apps should run on. If you have a problem, fix the app...the OS is fine."

I'm sure you can add HPUX and Solaris to your list of highly stable operating environments, along with dying/dead products like OS/2, AOS/VS (Data General), and GEMCOS/MCS (Burroughs/Unisys).

-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright
 
How about this?

Code:
[b]#!/usr/bin/perl[/b]

s/^([^-]+)-(\d+)/$mostRecent{$1} = $2 if $2 > $mostRecent{$1}/e while <DATA>;

print "$key => $value\n" while (($key, $value) = each %mostRecent);

[blue]__DATA__
xxxxxxxx-01.ext
xxxxxxxx-03.ext
yyyyyyyy-23.ext
yyyyyyyy-24.ext
aaaaaaaa-01.ext
aaaaaaaa-09.ext
yyyyyyyy-04.ext
xxxxxxxx-02.ext
aaaaaaaa-08.ext[/blue]

Kind Regards
Duncan
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top