Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How do I customize Perl's sort function?

Sorting

How do I customize Perl's sort function?

by  Geekatron  Posted    (Edited  )
Managing lists of things is something Perl is exceptionally good at. As such, it comes with a predefined "sort" function that can be called anywhere, on any kind of list, and can even be reversed using "reverse". (This is covered in any basic Perl reference so I won't go into their uses here.)

However, what if the kind of sorting you need to do is more complex than simply ordering a list numerically or alphabetically? No worries! Perl comes with a built-in framework by which you can painlessly create custom sort routines and use them just like you would use the basic "sort" function. Here's how to make use of this often-overlooked but powerful feature of Perl.


BUILDING A CUSTOM SORT

IMPORTANT: Custom sort subroutines are not normal subroutines! They have some special rules:

(1) Sort subroutines do not accept arguments like normal
subroutines.

(2) Two values being compared are automatically
passed-by-reference into the sort subroutine as
$a and $b. Don't try to change their values or
override them.

(3) A sort subroutine can only return one of three
values: -1 , 0, or 1.

(4) Don't call sort routines recursively. The compiler
will be angry with you.


More on these later, but I want you to keep these in mind while you look at the examples below.


So, lets look at an example. Suppose I have the following hash:
[color blue]
%hashfoo = (
'eep' , 23,
'eek' , 4,
'ack' , 17
);
[/color]
If I wanted to print my hash sorted by keys, i'd be home free:
[color blue]
foreach $key (sort keys %hashfoo) {
print "$key => $hashfoo{$key}\n";
}
[/color]
But alas, I am a high-maintenance programmer; i want to do it MY way, and sort it by value instead. Here's how I can do that with a subroutine.

As I mentioned, I don't have to pass anything into my sort subroutine - when I call it as a sort, it will automatically get two values passed in to be compared, $a and $b. I don't have to get them out of @_, or do a secret voodoo dance, or anything - they just appear, *POOF*, like magic.

In the example above, sort is getting sent hash keys. So all I need to do is compare the corresponding hash VALUES, and return a signal (-1, 0, or 1) indicating which value is bigger.
[color blue]
sub my_sort {
# got hash keys $a and $b automatically
return -1 if $hashfoo{$a} < $hashfoo{$b};
return 0 if $hashfoo{$a} == $hashfoo{$b};
return 1 if $hashfoo{$a} > $hashfoo{$b};
}
[/color]
This will organize my keys neatly, by value, from smallest to largest. However, I've spelled out the last three lines
of the sub so you can see exactly what is happening. Now I want to compact it a bit:
[color blue]
sub my_sort {
# got hash keys $a and $b automatically
return $hashfoo{$a} <=> $hashfoo{$b};
}
[/color]
The <=> operator automatically does for you what the three return lines in the first sub example did, only it's smaller. Small is good!


Ok, Now I have my subroutine. How do I use it?

The "sort" function in Perl can take an extra argument - the name of a subroutine (which we just wrote) to use for
sorting:

sort <subroutine> <list of things to sort>

So now my loop to print them out looks like this:
[color blue]
foreach $key (sort my_sort keys %hashfoo) {
print "$key => $hashfoo{$key}\n";
}
[/color]
So, on the first line, "keys %hashfoo" pulls out all of the keys from my hash and returns them as a list. Sort then feeds them to the 'my_sort' sub, one pair at a time, and orders each pair until it has sorted the whole list. Foreach then loops over the sorted keys and prints them out. Woohoo!

Here's the whole script so far:

[color blue]%hashfoo = (
'eep' , 23,
'eek' , 4,
'ack' , 17
);

foreach $key (sort my_sort keys %hashfoo) {
print "$key => $hashfoo{$key}\n";
}

exit;

# custom sort by by hash value
sub my_sort {
# got hash keys $a and $b automatically
return $hashfoo{$a} <=> $hashfoo{$b};
}[/color]


Now, suppose I am even wierder, and want to do things to those values before I compare them. Easy, I'll just add code to do that in my sub.
[color blue]
sub my_sort {
# got hash keys $a and $b automatically

# do math silliness to the values
my $a_val = 100 - $hashfoo{$a};
my $b_val = 100 - $hashfoo{$b};

# compare the modified values instead of the originals
return $a_val <=> $b_val;
}
[/color]
But wait! I want the order reversed from what the <=> gives me. Do I have to split that back up into three statements and switch the < and > rules?

No. I'll just use Perl's handy "reverse" function which reverses a list for me. It works the same with a custom sort as with a normal one:
[color blue]
foreach $key (reverse sort my_sort keys %hashfoo) {
print "$key => $hashfoo{$key}\n";
}
[/color]
One final thing: suppose I want to compare two strings instead of two numbers in my subroutine? No problem - just replace the '<=>' operator with 'cmp'.

That's all there is to it, really. What you do to sort the values in the privacy of your own sort subroutine is your business; nobody else need ever know.




EXAMPLE: SORTING FILES BY LAST MODIFIED DATE

Here's a more useful example of using a custom sort subroutine to print a list of files sorted by last modified date.

[color blue]
#!/usr/bin/perl

# get list of text files in current folder (unix filesystem)
@files = `ls -1 *.txt`;


foreach $file (sort by_last_mod @files) {
print "$file\n";
}

exit;

# custom sort routine
sub by_last_mod {
# vars $a and $b automatically passed in

# perl function 'stat' returns array of info on a file
# 10th element of the stat array is last modified date,
# returned as number of seconds since 1/1/1970.
my $adate = (stat($a))[9]; # get last modified date
my $bdate = (stat($b))[9]; # get last modified date

return $adate <=> $bdate;
}
[/color]

Hope this is useful!

--G
Register to rate this FAQ  : BAD 1 2 3 4 5 6 7 8 9 10 GOOD
Please Note: 1 is Bad, 10 is Good :-)

Part and Inventory Search

Back
Top