Tek-Tips is the largest IT community on the Internet today!

Multiple file combination and data evaluation

Status
Not open for further replies.

rklad77

Technical User
Oct 2, 2003
5
US
Hello, respected gurus.

I need to combine multiple files with two columns each and then find which one most closely matches a reference column.
I will try to explain in more detail:

The first file, called "test", has a number of rows and only two comma-separated columns. I will call the second column in this file the "reference column".

The file "test" reads:

5,11
10,22
15,33
..,.. and so on

After running a number of experiments I have a number of files
called "Run1", "Run2", etc., each with the same first column
as "test".

For example, the file "Run1" reads:
5,10
10,24
15,35
..,.. etc

First I need to combine the "test" file with the "Run1" to "RunN" files, so that the result has the common first column followed by the second column from "test" and from each "Run" file.

Once this file is formed, I need to find the sum of squares of the differences between the "reference column" and each "Run" column.

For example, taking the first three rows of the "test" and "Run1" files above, I get the combined file:

5,11,10
10,22,24
15,33,35
------------

and I need:

A1 = sq(11-10) + sq(22-24) + sq(33-35)

where:
sq = square
A1 = sum of squares of the differences between the reference column and
the Run1 column.

The program should give these answers for all the run files "Run1", "Run2", etc.
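
Written out as a formula, the quantity requested above might be stated as follows. The symbols are introduced here for illustration only and are not from the original post: r_i is the reference value in row i of "test", x_{k,i} is the value in row i of run file k, and n is the number of rows. The second line works through the three sample rows shown for "Run1".

```latex
A_k = \sum_{i=1}^{n} \left( r_i - x_{k,i} \right)^2
\qquad\text{e.g.}\qquad
A_1 = (11-10)^2 + (22-24)^2 + (33-35)^2 = 1 + 4 + 4 = 9
```

The run file with the smallest A_k is then the closest match to the reference column.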

I hope I am clear.

Thanks in advance.
 
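Try

awk -f rklad77.awk test Run1 Run2 ...

# ------ rklad77.awk ------
# Combine the second column of every input file, keyed on the first column.
BEGIN { FS = "," }
{
    if (FNR == 1) n1++                # n1 = index of the current input file
    a[n1, $1] = $2                    # value for this file and key
    if (n1 == 1) b[++n2] = $1         # keep the key order of the first file
}
END {
    for (j = 1; j <= n2; j++) {
        printf "%s,", b[j]
        for (k = 1; k <= n1; k++) {
            printf "%s", a[k, b[j]]
            if (k < n1) printf ","
        }
        print ""
    }
}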
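
To try the join step on the sample rows from the question, a self-contained sketch along the same lines could look like this. The function name combine_runs and the temporary directory are assumptions made for illustration; the data is the first three rows of "test" and "Run1" as posted.

```shell
#!/bin/sh
# Sketch only: combine_runs is a hypothetical helper, not part of the thread.
combine_runs() {
    # $1 = reference file, remaining arguments = run files
    awk -F, '
        FNR == 1    { nfiles++ }               # a new input file starts
        nfiles == 1 { key[++nkeys] = $1 }      # remember key order from the first file
                    { val[nfiles, $1] = $2 }   # column 2, stored per file and key
        END {
            for (j = 1; j <= nkeys; j++) {
                line = key[j]
                for (k = 1; k <= nfiles; k++)
                    line = line "," val[k, key[j]]
                print line
            }
        }
    ' "$@"
}

# Sample data copied from the question (first three rows only).
dir=$(mktemp -d)
printf '5,11\n10,22\n15,33\n' > "$dir/test"
printf '5,10\n10,24\n15,35\n' > "$dir/Run1"
combine_runs "$dir/test" "$dir/Run1"
rm -rf "$dir"
```

A key that is missing from one of the run files simply produces an empty field in that column.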

CaKiwi

"I love mankind, it's people I can't stand" - Linus Van Pelt
 
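Create a script called prog.awk....

#!/usr/bin/awk -f
# Sum of squared differences between the reference column in "test"
# and the second column of each run file given on the command line.
BEGIN { FS = ","
    while ((getline < "test") > 0)    # load the reference column, keyed on column 1
        test[$1] = $2
    close("test") }
FNR == 1 { name[++n] = FILENAME }     # remember each run file name
{ klud[FILENAME, $1] = $2 }
END { for (i in name) { sum = 0
        for (j in test)
            sum += (test[j] - klud[name[i], j])^2
        print name[i] ":", sum } }

Make it executable (chmod +x prog.awk) and run it like this....

./prog.awk Run* | sort -k2n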

Tested....

Run1: 9
Run2: 53
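
For comparison, the same calculation can be sketched without getline by passing the reference file as the first argument and accumulating one sum per run file as it is read. The function name sum_sq is an assumption made for illustration. The sketch also assumes the reference file is non-empty and skips any row whose key does not appear in it.

```shell
#!/bin/sh
# Sketch only: sum_sq is a hypothetical helper, not part of the thread.
sum_sq() {
    # $1 = reference file, remaining arguments = run files
    awk -F, '
        NR == FNR { ref[$1] = $2; next }               # first file: load the reference column
        FNR == 1 && run != "" { print run ": " sum }   # previous run file is finished
        FNR == 1  { run = FILENAME; sum = 0 }          # start a new run file
        ($1 in ref) { d = ref[$1] - $2; sum += d * d } # add the squared difference
        END { if (run != "") print run ": " sum }      # report the last run file
    ' "$@"
}

# Sample data copied from the question (first three rows only).
dir=$(mktemp -d)
printf '5,11\n10,22\n15,33\n' > "$dir/test"
printf '5,10\n10,24\n15,35\n' > "$dir/Run1"
(cd "$dir" && sum_sq test Run1)    # prints: Run1: 9
rm -rf "$dir"
```

Because each sum is printed as soon as its file ends, the results come out in command-line order; piping through sort -k2n would rank the runs from closest to furthest match.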


 
Thanks, both programs worked.
They are a great help.
 