Dear users, 1 2 2 3 1 2 3

fcolassie · Dec 13, 2003

Dear users,

1 2
2 3
1 2
3 4
5 4
1 2
3 4
5 2
2 7

How can I make subsets of columns in awk? Like this: if column 1 equals 1, then column 2 should be printed in a new column; if column 1 equals 2, column 2 should be printed in another new column, and so on.

Thanks in advance

Salem · Dec 13, 2003

How about this

Code:

#!/bin/awk -f

BEGIN {
    commas = ",,,,,,,,,,,,,,,,,,,"
}
{
  print substr(commas,1,$1) $2
}

Produced this output

Code:

,2
,,3
,2
,,,4
,,,,,4
,2
,,,4
,,,,,2
,,7

--

fcolassie · Dec 13, 2003

Dear Salem,
I was more interested in an output like this:
2 3 4 4
2 7 4 2
2

Column 1 is not printed. When column 1 equals 1, column 2 is printed in the first new column; when column 1 equals 2, column 2 is printed in the second new column, and so on.

Thanks in advance

Salem · Dec 13, 2003

There are too many 2s, 3s and 4s in your example.
Can you post a better one, where all the numbers are unique, so we can more easily see where each is coming from?

--

fcolassie · Dec 13, 2003

Suppose I have the following data set with two columns, in which column one is the independent variable and column two the dependent variable:
1 2
1 3
2 4
2 5
3 6
3 7

Now assume the following command: if column 1 equals 1, print column two, and so on. The output would look like this (column one, see above, is not printed):

2 4 6
3 5 7

vgersh99 · Dec 13, 2003

something like that:

nawk -f fcol.awk myFile.txt

#------------------------ fcol.awk

NR % 2 { r1 = (NR == 1) ? $2 : (r1 " " $2)}
!(NR % 2) { r2 = (NR == 2) ? $2 : (r2 " " $2)}
END {
print r1 "\n" r2;
}

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+

fcolassie · Dec 14, 2003

Vlad,

Not exactly what I want. I am doing a lot of data analysis. Most of the time those data sets consist of more than 100000 records. For that purpose, Excel cannot be used. Now suppose I have the following data set:
1 2
2 4
3 5
4 2
5 4
1 5
2 6
3 3
4 5
5 3

The output should be as follows:
2 4 5 2 4
5 6 3 5 3

In the first column, all values are printed for which column 1 equals 1; in the second column, values are printed for which column 1 equals 2, and so on.

Salem · Dec 14, 2003

So is the first column always sorted, in that it runs 1 to 5, 1 to 5 and so on?

Because if it is, then it's simply

Code:

{ printf "%s ", $2; if ( $1 == 5 ) printf "\n" }

--

fcolassie · Dec 14, 2003

No, it is not that simple. Let me give you an example of a unique data set, with an overview of my data. Currently I am doing a lot of data analysis in the field of medicines. Now suppose I have two patients (with identification numbers 1 and 2) and I have measured heart rate at different time points (TIME). ID is presented in column 1, time points in column 2 and heart rate (HRATE) in column 3.

The data set would look like this:
ID TIME HRATE
1 0 99
1 10 87
1 20 53
1 30 64
1 40 82
2 0 105
2 10 92
2 20 84
2 30 78
2 40 89

Now I would like to average the heart rate of patients 1 and 2 (ID 1 and 2) at each time point. So at time point 10 I want to average 87 and 92, and the same at time point 20, etc. (This is just a simple example; in reality I have a data set covering 1000 patients.)

I am aware that I can perform such tasks with awk (I am doing that). However, there are other people in my department who are not experienced in using awk, so they have to use Excel for their data analysis. Therefore I would like to output the measurements for each time point in separate columns, which looks like this:
0 10 20 30 40 (time points)
99 87 53 64 82 (measurements of ID 1)
105 92 84 78 89 (measurement of ID 2)

Now they can use Excel to average columns 1, 2, 3, 4 and 5.
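
For reference, the per-time-point averaging described above could be sketched in awk roughly as follows. This is only a sketch under assumptions: the function name avg_by_time is made up for illustration, and the input is assumed to be whitespace-separated ID TIME HRATE lines with an optional header line.

```shell
# Hypothetical helper (name is illustrative): average column 3 (HRATE)
# for each distinct value of column 2 (TIME), read from standard input.
avg_by_time() {
    awk '
    NR == 1 && $1 == "ID" { next }          # skip the header line if present
    {
        if (!($2 in n)) order[++k] = $2     # remember time points in first-seen order
        sum[$2] += $3                       # running total per time point
        n[$2]++                             # number of measurements per time point
    }
    END {
        for (i = 1; i <= k; i++) {
            t = order[i]
            print t, sum[t] / n[t]          # time point followed by its mean
        }
    }'
}
```

It would be called as avg_by_time < myFile.txt and prints one "TIME mean" pair per line, in the order the time points first appear.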

Ygor · Dec 15, 2003

I think this is the same question as another recent post, just re-phrased. Here's my solution....
[tt]
#!/usr/bin/awk -f
{
    if ($1 in col) {
        cnt=++col[$1];
    } else {
        col[$1]=1;
        row[0]=row[0] " " $1;
        cnt=1;
    };
    row[cnt]=row[cnt] " " $2;
    if (cnt>max)
        max=cnt;
} END {
    for(x=1;x<=max;x++)
        print row[x]
}
[/tt]
Tested...
[tt]
2 4 5 2 4
5 6 3 5 3
[/tt]

vgersh99 · Dec 15, 2003

OK, somewhat 'brute-force' and NOT ordered output, but should give you a start (based on the most recent file with 3 columns).

nawk -f fcol.awk myFile.txt

#-------------------- fcol.awk
{
    cols[$2];
    ids[$1];
    vals[$1,$2] = $3;
}

END {
    # header row: the distinct time points (in awk's arbitrary for-in order)
    for (ci in cols)
        printf("%s ", ci);
    printf("\n");

    # one row per id, scanning all stored values for each (id, time) pair
    for (idsi in ids) {
        for (ci in cols)
            for (vi in vals) {
                split(vi, viA, SUBSEP);
                if (viA[1] == idsi && viA[2] == ci) {
                    printf("%s ", vals[vi]);
                    delete vals[vi];
                    break;
                }
            }
        printf("\n");
    }
}

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+

Ygor · Dec 15, 2003

For the three column example...
[tt]
#!/usr/bin/awk -f
{
    if ($2 in col) {
        cnt=++col[$2];
    } else {
        col[$2]=1;
        row[0]=row[0] " " $2;
        cnt=1;
    };
    row[cnt]=row[cnt] " " $3;
    if (cnt>max)
        max=cnt;
} END {
    for(x=0;x<=max;x++)
        print row[x]
}
[/tt]
Tested...
[tt]
0 10 20 30 40
99 87 53 64 82
105 92 84 78 89
[/tt]
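
One caveat with counting occurrences per time point: if a patient is missing a measurement, later values shift up into the wrong row. A possible variant, sketched here under assumptions, keys each value on both ID and TIME and prints a placeholder for gaps. The function name pivot_by_id and the "NA" placeholder are made up for illustration; the input is assumed to be whitespace-separated ID TIME HRATE lines with an optional header line.

```shell
# Hypothetical helper (name is illustrative): one output row per ID,
# one column per TIME, both in first-seen order; gaps are shown as NA.
pivot_by_id() {
    awk '
    NR == 1 && $1 == "ID" { next }                      # skip the header line if present
    {
        if (!($1 in seen_id))   { seen_id[$1];   ids[++ni] = $1 }
        if (!($2 in seen_time)) { seen_time[$2]; times[++nt] = $2 }
        val[$1, $2] = $3                                # value keyed on (ID, TIME)
    }
    END {
        line = times[1]                                 # header row: the time points
        for (j = 2; j <= nt; j++) line = line " " times[j]
        print line
        for (i = 1; i <= ni; i++) {                     # one row per ID
            line = ""
            for (j = 1; j <= nt; j++) {
                cell = ((ids[i], times[j]) in val) ? val[ids[i], times[j]] : "NA"
                line = line (j > 1 ? " " : "") cell
            }
            print line
        }
    }'
}
```

It would be called as pivot_by_id < myFile.txt. Unlike the scripts above, it holds every value in memory until the end, which should still be manageable for a data set of about 1000 patients.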
