Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

average of duplicates 1

Status
Not open for further replies.

jdhahbi

Technical User
Oct 7, 2009
24
US
Hi
my inFile has 3 fields with duplicates in field1.
I would like to print the average field2 and field3 for the duplicated fild1.
Code:
$cat inFile
field1	field2	field3
A	7	2
B	4	2
B	2	3
C	6	5
D	15	2
D	5	3
D	10	4

Code:
$cat outFile
field1	field2	field3
A	7	2
B	3	2.5
C	6	5
D	10	3
thank you for your help.
 
I tested the code with the following inFile:

#f1 f2 f2
a 3991037 4155442
a 3993760 4160837
a 3994154 4159990
b 308568 179762
f 3484774 3488370
f 3600005 3666058

outFile:
#f1 f2 f2
a 3992980 4158760
b 308568 179762
f 3542390 3577214


I noticed that the average for a should be 3992984 and not 3992980 in field2.
The same for field3, it should be 4158756 and not 4158760
Why is that?
Thanks
 
And this ?
awk 'NR==1{print;next}{n[$1]++;s[$1]+=$2;t[$1]+=$3}END{for(i in n)printf "%s %9.1f %9.1f\n",i,s/n,t/n}'

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Thank you for your help.
It works fine now, it gives the average with one decimal.
One more request: how can I get the average with no decimals?
Again, I really appreciate your help.
Joseph
 
Have a guess:

%9.1f is a conversion string that instructs awk to prints a real number with 1 digit after the decimal point.

How many digits do you want after the decimal point? Zero? Hmmm... ;-)

There's always the man page...

HTH,

p5wizard
 

to get 0 decimals, the code should be:

awk 'NR==1{print;next}{n[$1]++;s[$1]+=$2;t[$1]+=$3}END{for(i in n)printf "%s %9.0f %9.0f\n",i,s/n,t/n}'

thanks p5wizard
 
Info: the number before the decimal point in a conversion string like %9.1f is the total width

so:
[tt]
if the value is 1234567
%9.1f gives "1234567.0" 7 digits before the decimal point,
the decimal point
1 digit after the decimal point
total width is 9 characters
%9.0f gives " 1234567" 2 leading spaces, 7 digits,
no decimal point, width is still 9 characters
%7.0f gives "1234567" 7 digits, width = 7 characters
[/tt]


HTH,

p5wizard
 
Hi p5wizard
This is more complicated than I thought.
The problem is that the number of digits vary widely from line to line in my file.
 
Well, choose the field width big enough to accommodate the maximum possible number of digits. Awk will right-align.

Or if you want left aligned, use "%-10.0f" format for example. Experiment and learn!

HTH,

p5wizard
 
Hi

If there is no fixed width, then why are you specifying it ? Remove the width and keep only the precision to force the rounding : [tt]printf [green]"%s %.0f %.0f[/green][lime]\n[/lime][green]"[/green][/tt] .

Feherke.
 
Well, I just thought about that too, but you beat me to it, Feherke. ;-)

HTH,

p5wizard
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top