Problem with division 2

brianwustl · Feb 14, 2005

I would like to open a text file that contains hundreds of lines of tab delimited numerical values (hundreds of values per line), each between 0 and 255. I would like to take the mean average of all these values (so I guess the program would add up all the values and then divide the sum by the number of values in the file). So the result for each text file would be a single value between 0 and 255.

I would like to perform this on a series of files and then create a single text file that holds all of these mean values, each on a new line:

i.e.,
230
198
194
6
201
57
115
...etc.

Also, this would need to be run with the perl included with Mac OS X.

Here is what I have so far:

Code:

my @files;
my (@l_count, $l_count);
my (@f_count, $f_count);
my @all;

grep{ -f and push @files, $_ }glob '*';

for( @files ){
    open FH, $_ or die $!;
    while( <FH> ){
        push @l_count, $_ for split '\s+', $_;
        for( @l_count ){
           $l_count += $_ for @l_count
        }
        @l_count = ();
        push @f_count, ( $l_count / $#l_count );
        $l_count = 0;
    }
    close FH;
    $f_count += $_ for @f_count;
    @f_count = ();
    push @all, ( $f_count / $#f_count );
    $f_count = 0;
}

open FH, '>output/end_res.log' or die $!;
print FH $_, $/ for @all;
close FH;

It seems to work perfectly, except for the fact that the resulting mean value seems to be (sum * count) instead of (sum / count). I tried fiddling with the code to get it to output the ratio instead of the product, but to no avail.

For example, a file with the following whitespace delimited values:

1 1 1

outputs:

9

((1+1+1) * 3)

Any ideas?

Thanks,
Brian

KevinADC · Feb 14, 2005

I don't know anything about Perl for Macs, but this should work for Perl on *nix and Windows. Maybe it will work on a Mac as is, or with some modification:

Code:

#!perl
use strict;
use warnings;

my @all = ();

my @files = grep{ -f $_ }glob '*.num';

foreach ( @files ){
    my @f_count = ();
    my ($l_count,$f_count) = (0,0);

    open FH, $_ or die $!;
    my $line = do { local $/; <FH> }; # read entire file into one string
    close FH;

    my @l_count = split(/\s+/,$line);
    next unless @l_count; # skip blank files if any 
    $l_count += $_ for @l_count;
    push @f_count, ( sprintf '%.2f', ($l_count / int(@l_count)) );# round to 2 decimal places

    $f_count += $_ for @f_count;
    push @all, ( sprintf '%.2f', ($f_count / int(@f_count)) );
}

open FH, '>end_res.log' or die $!;
print FH "$_\n" for @all;
close FH;

KevinADC · Feb 14, 2005

edit, change this line:

my @files = grep{ -f $_ }glob '*.num';

to:

my @files = grep{ -f $_ }glob '*';

I was using *.num to do a quick test on my computer.

mikevh · Feb 14, 2005

I also don't know anything about Perl on Macs. I'm no mathematician. (What is a mean average? Aren't mean and average synonymous?)

Code:

grep{ -f and push @files, $_ }glob '*';

An unusual idiom. How about
my @files = grep {-f} <*>, or
while (<*>) {
next unless -f;
...

Code:

for( @l_count ) {
    $l_count += $_ for @l_count
}

I think you are looping @l_count * @l_count times here. Either get rid of the outer for loop or the for modifier. You only need one.
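To make the double loop concrete, here is a minimal sketch (the sub names nested_sum and single_sum are made up for illustration and are not from the original script). It compares the nested form with a single pass over the same list. For the three values 1 1 1 the nested form adds the whole list once per element, which gives sum * count:

```perl
use strict;
use warnings;

# Sum a list the way the posted code does: an outer loop over the array
# wrapped around a statement that already loops over the same array.
sub nested_sum {
    my @values = @_;
    my $sum = 0;
    for (@values) {
        $sum += $_ for @values;    # the full sum is added once per outer pass
    }
    return $sum;
}

# Sum the list with a single pass.
sub single_sum {
    my @values = @_;
    my $sum = 0;
    $sum += $_ for @values;
    return $sum;
}

print nested_sum(1, 1, 1), "\n";   # prints 9
print single_sum(1, 1, 1), "\n";   # prints 3
```

Dropping either the outer loop or the statement modifier leaves the single-pass version.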

Code:

push @f_count, ( $l_count / $#l_count );

I think you are trying to average the values on the current line here. Your description of the program's objective didn't mention this, I think. However, $#l_count is not the number of items in @l_count. It's the highest index in @l_count, which is one less than the number of items in @l_count, since array indexing starts at 0. Using an array name in scalar context returns the number of elements in the array, so the number of elements in @l_count is @l_count. Also, why do you need parentheses around the division? Better to use them around the function arguments.

Code:

push @all, ( $f_count / $#f_count );

You have the same problem here with confusing the highest index ($#f_count) with the number of elements (@f_count).
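A small sketch of the difference between the two (last_index and count are hypothetical helper names, used here only for illustration). Note also that the posted code empties each array before dividing, and the highest index of an empty array is -1. If I trace the 1 1 1 example correctly, that is why it prints 9: the line total of 9 is divided by -1 to give -9, and then -9 is divided by -1 again at the file level.

```perl
use strict;
use warnings;

# Highest index of a list (what $#array gives).
sub last_index {
    my @values = @_;
    return $#values;
}

# Number of elements in a list (what the array gives in scalar context).
sub count {
    my @values = @_;
    return scalar @values;
}

my @values = (1, 1, 1);
print "last index: ", last_index(@values), "\n";   # prints 2
print "count:      ", count(@values), "\n";        # prints 3

# After @values = () the highest index is -1, so dividing by it flips the sign.
print "last index when empty: ", last_index(), "\n";   # prints -1
```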

That's the best I can do without a better understanding of what you're trying to accomplish.

HTH

KevinADC · Feb 15, 2005

(What is a mean average? Aren't mean and average synonymous?)

That would be the average of all the averages, the mean average, which is about as average as you could possibly get unless there is a mean average of the mean averages! ;-)

mikevh · Feb 15, 2005

Man, that's one mean average! [bigglasses]

KevinADC · Feb 15, 2005

Man, that's one mean average!

if ($Saddam_Hussien) {
print "The Mother of All Averages!";
}

[ponder]

(-:

PaulTEG · Feb 15, 2005

perl on OS/X IIRC is the same as that found on FreeBSD

type

Code:

which perl
and put that in your shebang line
#!/usr/bin/perl

--Paul

cigless ...

waiterm · Feb 16, 2005

I always thought the mean average is each item added up and DIVIDED by the total number of items...i.e. ((1 + 1 + 1)/3) which would give you 1...

If I'm assuming correctly and all your files are in the same directory I'd do the following:

Code:

$dir = 'c:/tmp/folder/files/';
opendir(DIR, $dir);
@files = readdir(DIR);
closedir(DIR);

$output = 'c:/tmp/folder/output.txt';
open (OUTPUT, ">>".$output) || die "ERROR: Unable to open output file: ".$output;
foreach $file(@files)  {
   $total = 0;
   $element_count = 0;
   open (FILE,$dir.$file) || die "ERROR: Can't open file: ".$dir.$file;   # readdir returns bare names, so prepend $dir
   @lines = <FILE>;
   close FILE;
   foreach $line(@lines)  {
      @elements = split ' ', $line;   # split on whitespace (tabs or spaces)
      foreach $element(@elements)  {
         $total = $total + $element;
         $element_count++;
      }
   }
   $mean_average = $total / $element_count;
   print OUTPUT $mean_average."\n";
}
close OUTPUT;

Please bear in mind this isn't tested, so you may have to make some adjustments...

Rob Waite

http://www.hostpipe.co.uk?src=TT

mikevh · Feb 16, 2005

That looks like what the OP intended from his description (save that you're not skipping directories), but I was confused by the term mean average and the code which seemed to be averaging the values on each line.
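For what it's worth, here is a rough sketch of that reading of the problem: one mean per file, taken over every value in the file, with anything that isn't a plain file skipped. The sub name file_mean is made up, and taking the file names from the command line is my assumption, not something the OP asked for.

```perl
use strict;
use warnings;

# Return the mean of every whitespace-separated number in one file,
# or undef if the file holds no values.
sub file_mean {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    my ($total, $count) = (0, 0);
    while (my $line = <$fh>) {
        for my $value (split ' ', $line) {   # split ' ' ignores leading whitespace
            $total += $value;
            $count++;
        }
    }
    close $fh;
    return $count ? $total / $count : undef;
}

# File names come from the command line; directories and other
# non-plain files are skipped by the -f test.
for my $path (grep { -f } @ARGV) {
    my $mean = file_mean($path);
    print "$mean\n" if defined $mean;
}
```

Run it as something like perl mean.pl * > end_res.log (mean.pl being whatever name you save it under) to get one mean per line.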

As he's declined to enlighten us further about his intentions, who knows what it all really means? (Yuk, yuk.)

PaulTEG · Feb 16, 2005

oh that's just mean ;-), hnaar, hnaar
--Paul

cigless ...

rharsh · Feb 16, 2005

Math time! A bit on averages (using the dataset {1 2 2 5 9 11 15}):

Mean Average is the normal average you're used to. Add all the elements up and divide by the number of elements. So, with the numbers above, the mean average is roughly 6.4.

Median Average is the 'middle' of your dataset. With an even number of elements, you split the difference between the middle two. So, with the numbers given above, the median average is 5.

Mode Average is the most common element in the dataset. I don't remember what you're supposed to do in the event of a tie. But, in the dataset above, the mode average is 2.
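The three definitions can be sketched in Perl along these lines. The sub names mean, median and modes are just illustrative. On a tie, modes returns every value that shares the highest count, which is one common convention (a dataset can have more than one mode). None of the subs guards against an empty list.

```perl
use strict;
use warnings;

# Arithmetic mean: sum of the values divided by how many there are.
sub mean {
    my @values = @_;
    my $sum = 0;
    $sum += $_ for @values;
    return $sum / @values;
}

# Median: middle value of the sorted list, or the midpoint of the
# two middle values when the count is even.
sub median {
    my @sorted = sort { $a <=> $b } @_;
    my $mid = int(@sorted / 2);
    return @sorted % 2
        ? $sorted[$mid]
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

# Mode(s): every value tied for the highest frequency, in ascending order.
# Call this in list context.
sub modes {
    my %seen;
    $seen{$_}++ for @_;
    my $max = 0;
    for my $n (values %seen) {
        $max = $n if $n > $max;
    }
    return sort { $a <=> $b } grep { $seen{$_} == $max } keys %seen;
}

my @data = (1, 2, 2, 5, 9, 11, 15);
printf "mean:   %.2f\n", mean(@data);                 # mean:   6.43
print  "median: ", median(@data), "\n";               # median: 5
print  "mode:   ", join(', ', modes(@data)), "\n";    # mode:   2
```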

mikevh · Feb 16, 2005

Hey, I didn't mean no harm ...

mikevh · Feb 16, 2005

Hmm, rharsh, twarn't the way I learned it back when I was a young'un. We called the first one a mean or an average (synonyms), the second a median plain and simple, and the third a mode, likewise.

'Course I'm from the States so maybe it's different over in some a them furren places, and I'm gettin' a wee bit long in the tooth, so maybe they teach young'uns different names for them things nowadays.

And then again, my math never was no better'n just average.

PaulTEG · Feb 16, 2005

The young'uns indeed are learning differently. I heard one kid say that the probability of the circumference of a circle being [pi] x radius was one

--Scary

cigless ...

mikevh · Feb 16, 2005

Don't get me started. I'll have everybody thinking I'm a mean ol' man with outmoded ideas.
[soapbox]

stevexff · Feb 17, 2005

Not used a Mac, but I've read that you can save a perl script as a droplet (whatever that is) and execute it by dragging and dropping file(s) onto it. In the absence of a command line, I guess this invokes the script with the file names as command line arguments?

In which case,

Code:

use strict;
use warnings;

my @averages;

while (<>) {
    my ($n, $total);
    while (/(\d+)\s*/g) {
        $total += $1;
        $n++;
    }
    if (eof) {
        my $mean = $n ? $total / $n : 0; # avoid possible divide-by-zero
        push @averages, $mean;
        $total = $n = 0;
    }
}
print join("\n", @averages), "\n";

ought to do it.

Anyway Mike, I wouldn't call you mean or outmoded. You just suffer from the same set of standard deviations as the rest of us...

stevexff · Feb 17, 2005

Er, no, it only averages the last line of each file. That'll teach me to generate random test data from another script [blush]

stevexff · Feb 17, 2005

Moved $n and $total outside the while loop to fix it

Code:

use strict;
use warnings;

my @averages;
my ($n, $total);

while (<>) {
    while (/(\d+)\s*/g) {
        $total += $1;
        $n++;
    }
    if (eof) {
        my $mean = $n ? $total / $n : 0; # avoid possible divide-by-zero
        push @averages, $mean;
        $total = $n = 0;
    }
}
print join("\n", @averages), "\n";
