Summing Column Data 2

Status
Not open for further replies.

thunderkid

Technical User
Oct 26, 2000
54
US
I am trying to sum the col2 data in the following data file:
# ----- data.txt -------
alpha 2
alpha 3
beta 4
beta 6
beta 10
gamma 1
alpha 4
alpha 10

desired results
Col1 sums
alpha 9
beta 10
gamma 1
alpha 14

Note that col1 fields may be repeated, but I want to make new calculation when they are repeated.

I have tried to modify the awk code below (by vgersh99, from another thread) with no success. This awk code adds up all col2 quantities for each col1 value across the whole file, which is not what I want.

{ a[$1] += $2 }
END {
  for (i in a)
    print i, a[i];
}
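
For illustration, here is a sketch of what the associative-array approach produces on the sample data, assuming the loop body is meant to print `a[i]`. The variable names `data` and `totals` are made up for this example, and the `sort` is there only because the order of `for (i in a)` is unspecified in awk.

```shell
# Sample input from the post, held in a variable instead of data.txt.
data='alpha 2
alpha 3
beta 4
beta 6
beta 10
gamma 1
alpha 4
alpha 10'

# One running total per distinct key, no matter where the key appears.
totals=$(printf '%s\n' "$data" |
  awk '{ a[$1] += $2 } END { for (i in a) print i, a[i] }' | sort)

echo "$totals"
```

Both runs of alpha end up in the same array element, so the script reports a single alpha total of 19 rather than one sum per run.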


thunderkid
 
can you elaborate on - I don't see the pattern:
Note that col1 fields may be repeated, but I want to make new calculation when they are repeated.

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Brute force method:
NR==1{t=0;k=$1}
$1==k{t+=$2;next}
{print k,t;t=$2;k=$1}
END{if(t)print k,t}
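
One caveat worth sketching: the `if(t)` guard in the END block keeps the script quiet on empty input, but it also drops a final group whose sum happens to be 0. The variant below, which tests `NR` instead, is an assumption added for illustration and is not from the thread; the names `input`, `posted` and `variant` are hypothetical.

```shell
# Two groups; the last one sums to zero.
input='alpha 2
beta 0'

# The script as posted: the END guard tests the running total t.
posted=$(printf '%s\n' "$input" | awk '
NR==1{t=0;k=$1}
$1==k{t+=$2;next}
{print k,t;t=$2;k=$1}
END{if(t)print k,t}')

# Variant (assumption): guard on NR, so only empty input suppresses the last line.
variant=$(printf '%s\n' "$input" | awk '
NR==1{t=0;k=$1}
$1==k{t+=$2;next}
{print k,t;t=$2;k=$1}
END{if(NR)print k,t}')

echo "posted:";  echo "$posted"
echo "variant:"; echo "$variant"
```

With the posted guard only "alpha 2" is printed; with the `NR` guard the "beta 0" line appears as well. For data where no group sums to zero, the two behave the same.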

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
Thanks PH. Just what I needed. You get a star.
thunderkid
 

Code:
1 == NR { key = $1 }
$1 != key { print key,sum; key = $1; sum = 0 }
{ sum += $2 }
END { print key,sum }
The output is
alpha 5
beta 20
gamma 1
alpha 14
Let me know whether or not this helps.

If you have nawk, use it instead of awk because on some systems awk is very old and lacks many useful features. For an introduction to Awk, see faq271-5564.
 
I must be gettin' thick-skulled.

vlad
 
ah, I think I can see the light!
thanks

vlad
 
futurelet,
your solution worked.

vgersh99,
as for my comment on "col1 fields may be repeated", here is further explanation. In those cases where column 1 repeats a value, I wanted a new calculation for each occurrence. In my example, alpha is repeated, so I wanted two calculations: the first alpha sum is 5 and the second is 14. I did not want all alpha records added together, which would have given alpha 19 (incorrect).

thunderkid
 
as futurelet noted, your sample output did not jibe with your explanation.

A better explanation might be:
output the sum of the 2nd column while 1st column stays the same.

Clearer now?


vlad
 
To err is human.

I once wrote an Awk program that was only 4 characters long; even code that short had a bug in it.
Code:
$1,0
The problem was to skip empty lines at start of file, but to print first non-empty line and all following lines, including blank lines.

Someone pointed out that my code would fail if the first non-empty line had only "0"; he posted the correct solution:
Code:
NF,0
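
A small sketch of the difference between the two range patterns, using a made-up input whose first non-empty line is just "0". The helper name `sample` and the variables `with_field` and `with_nf` are assumptions for this example only.

```shell
# Two leading blank lines, then a first non-empty line that is just "0".
sample() { printf '\n\n0\n\nnext\n'; }

# On the line "0", $1 is a numeric string equal to zero, so it tests false
# and the range does not start there.
with_field=$(sample | awk '$1,0')

# NF is 1 on that line, which is true, so the range starts at "0".
with_nf=$(sample | awk 'NF,0')

printf '[%s]\n' "$with_field" "$with_nf"
```

In both programs the end pattern `0` is never true, so once the range starts it runs to the end of the file. Only the start condition differs: `$1,0` skips the "0" line and begins at "next", while `NF,0` begins at "0" and keeps the blank line after it.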
 
yes, I do remember the thread.
I simply had problems understanding the explanation.

vlad
 
If I remember correctly, I stole the ",0" part (highlighted in yellow in the original post) from you: $1,0
 
naaah, I remember neither participating in this thread nor anything being stolen from me ;)

vlad
 
You had previously used it in another thread; I think PHV complimented you for it.
 