I've been working a bit more on my script from a previous post, in order to extract even more data without going through Excel. Now I've hit another bump, and I'm hoping someone with more programming skill than me can help me solve it.
Below is the script in question:
#!/usr/bin/awk -f
BEGIN{t="/dev/tty";printf "Enter number of molecules to average: ">t;getline<t;inp_num=$1}
{
out1 = "hb_%_occ_" FILENAME;
out2 = "summary_" FILENAME;
gsub (/\(+|\)/," ");
tothb += $10;
tott += $15;
if (NF>=15) {++denom;printf "%10.2f %10.1f\n", $10, $15 > out1}
avgocc = tothb/inp_num;
}
END {
avglt = (tott/denom);
tottsq += (($15-avglt)**2);
print "Summary data for hbond analysis" > out2;
printf ("\n") > out2;
printf (" Sum of Occupancy: %10.2f\n", tothb) > out2;
printf (" Average Occupancy: %10.2f\n", avgocc) > out2;
printf ("\n") > out2;
printf (" Sum of lifetimes: %10.2f\n", tott) > out2;
printf (" Average lifetime: %10.2f\n", avglt) > out2;
printf (" SD lifetime: %10.2f\n", sqrt(tottsq/denom)) > out2;
}
The part that does not work is the tottsq line in the END block. As you might gather, the problem is the sum of squared deviations: for each row I want to take $15-avglt (avglt being the calculated average), square these numbers and sum them up, i.e. sum(($15-avglt)^2). I then want to divide this sum by denom (the number of rows containing numbers rather than text) and take the square root to get the STDEV.
I don't even know if this will work. As I mentioned in the previous thread, the first 13 rows of the input file contain text, and the last line is always just nonsense characters. I don't know whether my script ignores these lines when summing and calculating for $15, but I believe it does, since I get a number as output, just not the right value!
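For reference, the quantity described here appears to be the population standard deviation of the lifetimes. Written out with symbols that are not in the script ($x_i$ for the $15 value of data row $i$, $N$ for denom, $\bar{x}$ for avglt), it can be rearranged so that only running sums are needed:

```latex
\sigma
  = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^{2}}
  = \sqrt{\frac{1}{N}\sum_{i=1}^{N}x_i^{2}-\bar{x}^{2}},
\qquad
\bar{x}=\frac{1}{N}\sum_{i=1}^{N}x_i
```

The second form follows from expanding the square and using the definition of $\bar{x}$. It matters here because the average is only known after the last row has been read, while the sum of $x_i^2$ can be accumulated row by row.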
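A possible direction, offered as a rough sketch rather than a tested fix for the real data: by the time the END block runs there is no current row left to loop over, so $15 there refers at most to the last line read, not to every row. One way around this is to accumulate both the sum and the sum of squares of $15 while the rows are being read, and only combine them in END. The sketch below assumes that "data row" means "at least 15 fields after the parentheses are turned into spaces", as in the original NF>=15 test; a header line that happens to have 15 or more words would still be counted. The function name sd_sketch and the output format are made up for illustration, and the awk program is wrapped in a shell function only so that it can be run on its own.

```shell
#!/bin/sh
# sd_sketch (hypothetical name): print the count, mean and population SD
# of column 15, using only rows that have at least 15 fields.
sd_sketch() {
    awk '
    {
        # Turn parentheses into spaces; awk re-splits the fields afterwards.
        gsub(/[()]/, " ")
    }
    NF >= 15 {
        # Assumption: any row with 15+ fields is a data row.
        n++
        sum   += $15          # running sum of lifetimes
        sumsq += $15 * $15    # running sum of squared lifetimes
    }
    END {
        if (n == 0) { print "no data rows found"; exit 1 }
        mean = sum / n
        var  = sumsq / n - mean * mean   # population variance
        if (var < 0) var = 0             # guard against rounding error
        printf "n=%d mean=%.2f sd=%.2f\n", n, mean, sqrt(var)
    }' "$@"
}
```

It would be called as sd_sketch followed by the input file name. Two side notes: in awk a field that holds text counts as 0 in arithmetic, so header lines probably do not disturb the sums unless a field starts with digits; and ^ is the portable power operator, while ** is an extension that not every awk accepts.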
Any suggestions are more than welcome
Best regards
Gustaf