Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Sum multiple columns output to single row to in separate file. 2

Status
Not open for further replies.

chino10

Technical User
Jun 23, 2011
16
0
0
US
I am hoping someone can help me find a way to sum two columns but output the answer into a separate file using the first row of the original file as such:

0000229351,4,5.00,533017068270,31.96,UPAC
0000229351,4,3.00,533017068280,27.08,UPAC
0000229351,4,5.00,533017068291,31.96,UPAC
0000229351,4,5.00,533017068306,31.96,UPAC

Desired Output:
0000229351,4,18.00,533017068270,122.96,UPAC

I am targeting columns 3 and 5 in this example and using the information from row 1 in columns 1,2,4,6. Any help with this issue would be greatly appreciated. Thank you.
 
Code:
awk -F, 'NR==1{split($0,x,",")}NR>1{x[3]+=$3;x[5]+=$5}END{OFS=",";print x[1],x[2],x[3],x[4],x[5],x[6]}' /path/to/input

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Thank you PHV. Sorry it took so long to respond. I forgot to mention I am using gawk in a windows command line environment and am receiving the following error:

Code:
C:\Windows\system32>awk -F, "NR==1{split($0,x,",")}NR>1{x[3]+=$3;x[5]+=$5}END{OF
S=",";print x[1],x[2],x[3],x[4],x[5],x[6]}" /path/to/input 
awk: NR==1{split($0,x,,)}NR>1{x[3]+=$3;x[5]+=$5}END{OFS=,;print x[1],x[2],x[3],x
[4],x[5],x[6]}
awk:                  ^ syntax error
awk: NR==1{split($0,x,,)}NR>1{x[3]+=$3;x[5]+=$5}END{OFS=,;print x[1],x[2],x[3],x
[4],x[5],x[6]}
awk:                   ^ syntax error
awk: fatal: 0 is invalid as number of arguments for split

The only change I made to your original code was changing the single quotes to double quotes because windows doesn't like single. I also changed the command to gawk but still have the same problem. Any help you might have would be greatly appreciated. Thank you.
 
Use a file to store the awk program, eg chino10.awk:
Code:
NR==1{split($0,x,",")}
NR>1{x[3]+=$3;x[5]+=$5}
END{OFS=",";print x[1],x[2],x[3],x[4],x[5],x[6]}
And then:
gawk -F, -f chino10.awk \path\to\input

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Thank you PHV this worked. However, I did not stop to consider the possibility that the input file may contain additional lines of information that are not related to each other but still aggregate the same:

0000229351,4,5.00,533017068270,31.96,UPAC
0000229351,4,3.00,533017068280,27.08,UPAC
0000229351,4,5.00,533017068291,31.96,UPAC
0000229351,4,5.00,533017068306,31.96,UPAC
0000229355,3,3.00,533017067708,22.51,UPAC
0000229355,3,6.00,533017067715,25.33,UPAC
0000229466,2,10.00,533017067890,16.25,UPAC

Desired Output:
0000229351,4,18.00,533017068270,122.96,UPAC
0000229355,3,9.00,533017067708,47.84,UPAC
0000229466,2,10.00,533017067890,16.25,UPAC
 
A starting point:
Code:
BEGIN{FS=OFS=","}
function myprint(){
  if(NR>1)print x[1],x[2],x[3],x[4],x[5],x[6]
  split($0,x,",");b=$1
}
b!=$1{myprint();next}
{x[3]+=$3;x[5]+=$5}
END{myprint()}
And then:
gawk -f chino10.awk input

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
PHV you rock! This works beautifully. Thank you for your help and patience.
 
PHV, I found a slight problem on our end when I use your script:

Code:
BEGIN{FS=OFS=","}
function myprint(){
  if(NR>1)print x[1],x[2],x[3],x[4],x[5],x[6]
  split($0,x,",");b=$1
}
b!=$1{myprint();next}
{x[3]+=$3;x[5]+=$5}
END{myprint()}

Although created for mulitple lines of data we have the actual possibility of there only being a single line of data. Can this be altered to do continue doing what it is doing but include the ability to output a single line of data if it exists? Any help with this alteration would be greatly appreciated. Thank you.
 
Rearranging the code slightly should fix that:

Code:
BEGIN{FS=OFS=","}
function myprint(){
  print x[1],x[2],x[3],x[4],x[5],x[6]
}
b!=$1{if(NR>1)myprint();split($0,x,",");b=$1;next}
{x[3]+=$3;x[5]+=$5}
END{myprint()}

Annihilannic
[small]tgmlify - code syntax highlighting for your tek-tips posts[/small]
 
This works fantastic. Annihilannic thank you very much!
 
Yes, I'm back with another challenge. Recently this script ran against an empty file (I just now became aware of this possibility) and it output the following:

Code:
,,,,,

Is there a way to make this script output an empty file if the file it is run against is empty of any data? Thank you for any help you might be able to provide.
 
Just add a check in the myprint() function to see whether there is any data in x[1] before printing.

Annihilannic
[small]tgmlify - code syntax highlighting for your tek-tips posts[/small]
 
Thank you Annihilannic. I tried to create a check however, my skills are basically nill. I have only managed to create a blank line before the commas.
 
function myprint(){
[!]if(x[1]!="")[/!]print x[1],x[2],x[3],x[4],x[5],x[6]
}


Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Thank you PHV. I guess I wasn't too far off. Am I correct in assuming that what this states is if x[1] is not empty then print all 6 fields "else" don't print anything. The "else" is basically implied?
 
I just wanted to tell you both thank you very much for your help and your patience.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top