
Status
Not open for further replies.

fcolassie

Technical User
Dec 12, 2003
12
NL
Dear users,

1 2
2 3
1 2
3 4
5 4
1 2
3 4
5 2
2 7

How can I make subsets of columns in awk? For example: if column 1 equals 1, the value in column 2 should be printed in a new column; if column 1 equals 2, the value in column 2 should be printed in another new column, and so on.

Thanks in advance

 
How about this
Code:
#!/bin/awk -f

BEGIN {
    commas = ",,,,,,,,,,,,,,,,,,,"
}
{
  print substr(commas,1,$1) $2
}
Produced this output
Code:
,2
,,3
,2
,,,4
,,,,,4
,2
,,,4
,,,,,2
,,7

--
 
Dear Salem,
I was more interested in output like this:
2 3 4 4
2 7 4 2
2

Column 1 is not printed. When column 1 equals 1, column 2 is printed in the first new column; when column 1 equals 2, column 2 is printed in the second new column, and so on.

Thanks in advance
 
There are too many 2s, 3s and 4s in your example.
Can you post a better one, where all the numbers are unique, so we can more easily see where each one comes from?

--
 
Suppose I have the following data set with two columns, in which column 1 is the independent variable and column 2 the dependent variable:
1 2
1 3
2 4
2 5
3 6
3 7

Now assume the following rule: if column 1 equals 1, print column 2, and so on. The output would look like this (column 1, shown above, is not printed):

2 4 6
3 5 7

 
Something like this:

nawk -f fcol.awk myFile.txt

#------------------------ fcol.awk

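# odd-numbered records build the first output row
NR % 2 { r1 = (NR == 1) ? $2 : (r1 " " $2) }
# even-numbered records build the second output row
!(NR % 2) { r2 = (NR == 2) ? $2 : (r2 " " $2) }
END {
    print r1 "\n" r2
}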

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Vlad,

Not exactly what I want. I do a lot of data analysis, and most of the time the data sets consist of more than 100000 records, so Excel cannot be used. Now suppose I have the following data set:
1 2
2 4
3 5
4 2
5 4
1 5
2 6
3 3
4 5
5 3

The output should be as follows:
2 4 5 2 4
5 6 3 5 3

The first column contains all values for which column 1 equals 1, the second column the values for which column 1 equals 2, and so on.
 
So is the first column always sorted, in that it runs 1 to 5, 1 to 5 and so on?

Because if it is, then it's simply
Code:
{ printf "%s ", $2; if ( $1 == 5 ) printf "\n" }

--
 
No, it is not that simple. Let me give you an example of a unique data set, with an overview of my data. Currently I am doing a lot of data analysis in the field of medicines. Suppose I have two patients (with identification numbers 1 and 2) and I have measured heart rate at different time points (TIME). Time points are presented in column 2 and heart rate (HRATE) is presented in column 3.

The data set would look like this:
ID TIME HRATE
1 0 99
1 10 87
1 20 53
1 30 64
1 40 82
2 0 105
2 10 92
2 20 84
2 30 78
2 40 89

Now I would like to average the heart rate of patients 1 and 2 (ID 1 and 2) at each time point. So at time point 10 I want to average 87 and 92, and likewise at time point 20, etc. (This is just a simple example; in reality I have a data set covering 1000 patients.)

I am aware that I can perform such tasks with awk (I am doing that). However, other people in my department are not experienced with awk, so they have to use Excel for their data analysis. Therefore I would like to output the measurements for each time point in separate columns, like this:
0 10 20 30 40 (time points)
99 87 53 64 82 (measurements of ID 1)
105 92 84 78 89 (measurement of ID 2)

Now they can use Excel to average columns 1, 2, 3, 4 and 5.
 
I think this is the same question as another recent post, just re-phrased. Here's my solution....
[tt]
#!/usr/bin/awk -f
{
    if ($1 in col) {
        cnt = ++col[$1];
    } else {
        col[$1] = 1;
        row[0] = row[0] " " $1;
        cnt = 1;
    }
    row[cnt] = row[cnt] " " $2;
    if (cnt > max)
        max = cnt;
}
END {
    for (x = 1; x <= max; x++)
        print row[x]
}
[/tt]
Tested...
[tt]
2 4 5 2 4
5 6 3 5 3
[/tt]
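
The script above appends each value to its row string as it arrives, so a value can slide into the wrong column when an earlier key has fewer entries than a later one. A minimal sketch of one way around that, assuming two-column "key value" input on stdin: store every cell under (key, occurrence), remember the keys in first-seen order, and print a placeholder for missing cells. The function name pivot_pairs and the "-" placeholder are illustrative choices, not something from the posts above.

```shell
# Hypothetical helper: pivot "key value" pairs into one column per key.
pivot_pairs() {
    awk '
    {
        # remember each key the first time it appears, in input order
        if (!($1 in count)) order[++nkeys] = $1
        n = ++count[$1]          # n-th occurrence of this key
        cell[$1, n] = $2
        if (n > max) max = n
    }
    END {
        for (r = 1; r <= max; r++) {
            line = ""; sep = ""
            for (k = 1; k <= nkeys; k++) {
                key = order[k]
                # "-" marks a key that has no r-th value (assumed placeholder)
                v = ((key, r) in cell) ? cell[key, r] : "-"
                line = line sep v
                sep = " "
            }
            print line
        }
    }'
}

# The first data set from this thread:
printf '1 2\n2 3\n1 2\n3 4\n5 4\n1 2\n3 4\n5 2\n2 7\n' | pivot_pairs
# prints:
# 2 3 4 4
# 2 7 4 2
# 2 - - -
```

Because the placeholder keeps every row the same width, the result should import cleanly into a spreadsheet even when the groups have different lengths.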
 
OK, somewhat 'brute-force' and NOT ordered output, but should give you a start (based on the most recent file with 3 columns).

nawk -f fcol.awk myFile.txt

#-------------------- fcol.awk
{
    cols[$2];              # remember every time point
    ids[$1];               # remember every ID
    vals[$1,$2] = $3;      # value for this (ID, time) pair
}

END {
    # header row: the time points (for-in order is unspecified)
    for (ci in cols)
        printf("%s ", ci);
    printf("\n");

    # one row per ID; look each (ID, time) cell up directly
    for (idsi in ids) {
        for (ci in cols)
            if ((idsi, ci) in vals)
                printf("%s ", vals[idsi, ci]);
        printf("\n");
    }
}


vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
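
The "NOT ordered" caveat above comes from for (x in array), whose traversal order awk leaves unspecified. A sketch of one way to get a stable layout, assuming three-column "ID TIME HRATE" input with an optional header line: record IDs and time points in the order they first appear, then look each cell up by its (ID, time) key. The name pivot_by_id, the leading "ID" label and the "NA" filler are assumptions made for illustration.

```shell
# Hypothetical helper: one row per ID, one column per time point.
pivot_by_id() {
    awk '
    NR == 1 && $1 == "ID" { next }   # skip the header line if present
    {
        # keep first-seen order for both IDs and time points
        if (!($1 in seen_id))   { seen_id[$1];   ids[++nid] = $1 }
        if (!($2 in seen_time)) { seen_time[$2]; times[++nt] = $2 }
        val[$1, $2] = $3
    }
    END {
        line = "ID"
        for (j = 1; j <= nt; j++) line = line " " times[j]
        print line
        for (i = 1; i <= nid; i++) {
            line = ids[i]
            for (j = 1; j <= nt; j++) {
                k1 = ids[i]; k2 = times[j]
                # "NA" marks a missing measurement (assumed filler)
                cellv = ((k1, k2) in val) ? val[k1, k2] : "NA"
                line = line " " cellv
            }
            print line
        }
    }'
}

printf 'ID TIME HRATE\n1 0 99\n1 10 87\n1 20 53\n1 30 64\n1 40 82\n2 0 105\n2 10 92\n2 20 84\n2 30 78\n2 40 89\n' | pivot_by_id
# prints:
# ID 0 10 20 30 40
# 1 99 87 53 64 82
# 2 105 92 84 78 89
```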
 
For the three column example...
[tt]
#!/usr/bin/awk -f
{
    if ($2 in col) {
        cnt = ++col[$2];
    } else {
        col[$2] = 1;
        row[0] = row[0] " " $2;
        cnt = 1;
    }
    row[cnt] = row[cnt] " " $3;
    if (cnt > max)
        max = cnt;
}
END {
    for (x = 0; x <= max; x++)
        print row[x]
}
[/tt]
Tested...
[tt]
0 10 20 30 40
99 87 53 64 82
105 92 84 78 89
[/tt]
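
For readers who would rather skip the spreadsheet step, the averaging described earlier in the thread can also be done in awk directly. A rough sketch, assuming the same three-column "ID TIME HRATE" input with an optional header line; the function name mean_by_time and the one-decimal output format are arbitrary choices.

```shell
# Hypothetical helper: mean HRATE per TIME across all IDs.
mean_by_time() {
    awk '
    NR == 1 && $1 == "ID" { next }   # skip the header line if present
    {
        # remember time points in first-seen order
        if (!($2 in n)) order[++nt] = $2
        sum[$2] += $3
        n[$2]++
    }
    END {
        for (i = 1; i <= nt; i++) {
            t = order[i]
            printf "%s %.1f\n", t, sum[t] / n[t]
        }
    }'
}

printf 'ID TIME HRATE\n1 0 99\n1 10 87\n1 20 53\n1 30 64\n1 40 82\n2 0 105\n2 10 92\n2 20 84\n2 30 78\n2 40 89\n' | mean_by_time
# prints:
# 0 102.0
# 10 89.5
# 20 68.5
# 30 71.0
# 40 85.5
```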
 