SORT and CGI

Status
Not open for further replies.

samesale

Programmer
Sep 15, 2003
133
US
The following is supposed to sort the file by model. However, it does not do that. Maybe I do not understand the logic. Please help.

#!/usr/bin/perl

print "Content-type:text/html\n\n";

use CGI;
$query = new CGI;

open(INF, "<car1.out") or dienice("Couldn't open auto.out for reading: $! \n");
@kites = <INF>;
close (INF);


foreach $i (sort bycity @kites) {
chomp ($i);
($name,$add,$city,$state,$zip,$make,$photo,$model,$color,$mile,
$eyear,$price,$email,$tele,$time,$comment1,$month,$day,$year)=split(/\|/, $i);
print " $city - \$$model<br>\n";

# this is the same as $count = $count + 1;


}


sub bycity {
@a = split (/\|/,$a);
@b = split(/\|/,$b);
$a[7] <=> $b[7];

}

print <<EndHTML;
EndHTML
sub dienice {
my($msg) = @_;
print "<h2>Error</h2>\n";
print $msg;
exit;
}
 
If the sort field, city, is a character string and not numeric, you have to use the [tt]cmp[/tt] operator instead of [tt]<=>[/tt] which sorts numbers.
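To make the difference concrete, here is a minimal sketch, assuming some made-up name|city records (not the poster's real file layout). With [tt]<=>[/tt], strings that do not start with a number all evaluate to 0, so every pair compares as equal and nothing appears to get sorted. With [tt]cmp[/tt] the fields are compared as strings. The sub name by_city and the lexical arrays @fa and @fb are just illustrative choices.

```perl
use strict;
use warnings;

# Hypothetical sample records: name|city (not the poster's real data).
my @records = ("Ann|Denver", "Bob|Austin", "Cy|Boston");

# Compare field 1 (the city) as a string with cmp.
sub by_city {
    my @fa = split /\|/, $a;
    my @fb = split /\|/, $b;
    return $fa[1] cmp $fb[1];
}

my @sorted = sort by_city @records;
print "$_\n" for @sorted;
```

Using lexical arrays inside the sub also avoids overwriting the global @a and @b on every comparison.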

Also, if you have a huge amount of data to sort, or are sorting very often, there are much more efficient ways to do it. They're not so nice to look at, but they will run significantly faster if you give them a lot of data. Most of the time, short scripts aren't hindered by execution time, though, and it really doesn't make much difference.

----------------------------------------------------------------------------------
...but I'm just a C man trying to see the light
 
I am using the sort program to sort on one field. Is it possible to sort on two fields in one run, or do I need to sort the file again? Could it be done in the same program?

sub byitem {
@a = split (/\|/,$a);
@b = split(/\|/,$b);
$a[9] <=> $b[9];

}
 
Sorting on multiple keys is a short modification of what you have already. End the sub with an [tt]||[/tt] of the two fields you're using, in the order you want the keys sorted in.
Code:
$a[9] <=> $b[9] || $a[7] <=> $b[7];
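The [tt]<=>[/tt] and [tt]cmp[/tt] operators return -1, 0, or 1 for less than, equal, or greater than. When the first keys differ, the first comparison returns a non-zero (true) value, the [tt]||[/tt] short-circuits, and that value is the result. When the first keys are equal, it returns 0, which is false, so the [tt]||[/tt] goes on to run the second test, which returns -1, 0, or 1 based on the second sort key.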
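As a rough illustration of the tie-breaking, here is a small sketch assuming invented model|price|mileage records (the field positions are hypothetical, not the ones in the original file). Price is the primary key and mileage only decides the order when two prices are equal.

```perl
use strict;
use warnings;

# Hypothetical records: model|price|mileage
my @cars = (
    "Civic|9000|40000",
    "Accord|9000|30000",
    "Focus|7000|50000",
);

# Primary key: price (field 1); tie-breaker: mileage (field 2).
my @sorted = sort {
    my @fa = split /\|/, $a;
    my @fb = split /\|/, $b;
    $fa[1] <=> $fb[1] || $fa[2] <=> $fb[2];
} @cars;

print "$_\n" for @sorted;
```

The two 9000 records come out ordered by mileage, after the cheaper Focus.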

Also, if you're interested in one of the fastest ways to do it:
Code:
my $i = 0;
@sorted = map { $kites[substr($_,8)] }
          sort
          map { pack('N2',(split /\|/)[9,7]).$i++ }
          @kites;
If you're going to be working with the data as fields very often, it'd be worth your time to put it into some kind of data structure or use one of the delimited-record modules on CPAN.

----------------------------------------------------------------------------------
...but I'm just a C man trying to see the light
 
Your suggested changes worked, icrf. However, I would like to try the fastest method, but I do not understand the logic behind it. Is there a source I can look at? Thanks for your great help.
 
Here's a paper on sorting that ends with that method: That's where I first heard of it (thanks Barbie!).

The trick is that you have to read it from the end. @kites is the source that gets mapped into something else to sort. First each element of @kites is split into fields, and fields 9 and 7 are taken out. These two are packed as big-endian unsigned long integers for sorting (four bytes each; big-endian 'N' puts the most significant byte first, so a plain string comparison agrees with the numeric order, which a native 'L2' would not guarantee on a little-endian machine). Appended on the end is $i++, which is just the index for that array element. $i is initialized to 0 at the start, and $i++ returns its current value and then increments. So, for the first element, 0 is returned and then $i is set to 1. The next time through, 1 is returned and it's set to 2, and so on, coinciding with the array index of the current element of @kites.
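A small check of the packing step may help. This is only a sketch with two made-up key pairs; the variable names are illustrative. It shows that keys packed with 'N2' form an 8-byte string and that comparing those strings gives the numeric order, whereas comparing the decimal text does not.

```perl
use strict;
use warnings;

# Two hypothetical key pairs; numerically (2, 10) sorts before (10, 1).
my $small = pack('N2', 2, 10);
my $large = pack('N2', 10, 1);

# Each 'N' is four bytes, so two keys make an 8-byte prefix.
my $prefix_len = length $small;

# String comparison of the packed keys follows the numeric order.
my $packed_order = $small cmp $large;

# Comparing the decimal text instead would put "10" before "2".
my $text_order = "2" cmp "10";

print "prefix=$prefix_len packed=$packed_order text=$text_order\n";
```

Note that 'N' is unsigned, so this sketch assumes the keys are non-negative integers.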

Now sorting is done with the lexical default (no custom sort sub), which is optimized in C and very fast. To get the information you want back out, you just need to get the index of each element back, which was tacked on the end. Since each of the two sort keys was packed as four bytes, just substr 8 bytes in and you have the index to rebuild the original data in the sorted order.
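Here is the whole thing broken into separate steps so the intermediate values can be inspected. It is a sketch under assumptions: the three records are invented, and the keys are taken from fields 0 and 1 rather than 9 and 7 to keep the sample short. @keyed, @order and @rows are illustrative names.

```perl
use strict;
use warnings;

# Hypothetical records; fields 0 and 1 are non-negative integer keys.
my @rows = ("30|2|c", "4|9|a", "30|1|b");

my $i = 0;
# Step 1: 8 bytes of packed keys followed by the record's array index.
my @keyed  = map { pack('N2', (split /\|/)[0, 1]) . $i++ } @rows;
# Step 2: default string sort, then strip the 8-byte prefix to get indexes.
my @order  = map { substr($_, 8) } sort @keyed;
# Step 3: pull the original records out in sorted order.
my @sorted = @rows[@order];

print "@order\n";
print "$_\n" for @sorted;
```

The recovered indexes are all that is needed to rebuild the list; the packed keys are thrown away after the sort.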

It's kind of a thick idea to wrap your head around. Read through the article and see if it helps. It's not all that different from the old Schwartzian Transform:
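For comparison, a minimal sketch of a Schwartzian Transform over invented records (fields 0 and 1 as numeric keys; not the original file layout). Like the packed-key version it is read from the bottom up, and it also splits each record only once, but it still needs a custom comparison block.

```perl
use strict;
use warnings;

# Hypothetical records; fields 0 and 1 are numeric sort keys.
my @rows = ("30|2|c", "4|9|a", "30|1|b");

my @sorted =
    map  { $_->[0] }                                      # 3. unwrap the record
    sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] }   # 2. compare cached keys
    map  { [ $_, (split /\|/)[0, 1] ] }                   # 1. cache keys per record
    @rows;

print "$_\n" for @sorted;
```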
________________________________________
Andrew - Perl Monkey
 