join columns from many files with the same structure using awk 2

Status
Not open for further replies.

babcia01

IS-IT--Management
Jul 11, 2002
37
US
I have x number of files which are named using a consistent naming convention, mmdd.file (where mm is the month number, dd is the day number, and "file" is a constant).

What is the best way to approach the following:

I will explain it by using this example:

0723.file contains the following tab delimited columns:

row1_col1_in_0723.file row1_col2_in_0723.file
row2_col1_in_0723.file row2_col2_in_0723.file
:

0724.file contains the following tab delimited columns:
row1_col1_in_0724.file row1_col2_in_0724.file
row2_col1_in_0724.file row2_col2_in_0724.file
:

In my output file, I want to have the following:
row1_col2_in_0723.file row1_col2_in_0724.file
row2_col2_in_0723.file row2_col2_in_0724.file
:
Thank you very much
 
Try this script
Code:
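{
  # Print column 2 of the current line of the file named on the command line
  printf "%s ", $2
  # Read the matching line from standard input (the second file)
  getline < "-"
  print $2
}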
Put it in a file, join.awk say, and enter

awk -f join.awk 0723.file < 0724.file > new.file

CaKiwi
 
Upon further review I see that you want to do this for many files, not just 2. The best way I can think of to do this is to use awk (or cut) to create a set of files with just the second column and then use paste to concatenate them.

CaKiwi
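A minimal sketch of that cut-and-paste approach, for illustration only. It assumes the files are tab delimited (as in the question) and that mktemp -d is available; the function name join_col2 and the temporary file names are made up for this example and are not from the thread.

```shell
#!/bin/sh
# join_col2 FILE...: print column 2 of every file side by side, tab-separated.
# join_col2 is a hypothetical helper name used only in this sketch.
join_col2() {
    # Scratch directory for the single-column files (assumes mktemp -d exists).
    tmpdir=$(mktemp -d) || return 1
    n=0
    for f in "$@"; do
        n=$((n + 1))
        # Zero-padded names make the glob below expand in argument order.
        cut -f2 "$f" > "$tmpdir/$(printf '%05d' "$n")"
    done
    # paste joins line N of every file with tabs.
    paste "$tmpdir"/*
    rm -rf "$tmpdir"
}
```

It could then be called as join_col2 *.file > new.file. The single-column files live in a scratch directory that is removed at the end, so nothing is left behind next to the data files.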
 
Thank you very much for your quick response.
Before I asked my original question, I developed something like this, which is working:

for i in *.file
do
  sed 1d "$i" | awk '{print $2}' > "$i.new"
done
paste *.new

I also needed to delete the first line of each file. I thought this was not very efficient or "nice" looking, and it creates unnecessary files.
I thought that maybe there was some other way using getline/nextfile.
I am glad that I arrived at the same solution you suggested, although I spent too much time before I found the useful "paste" command.
Thank you once again for your suggestions.
 
Just read all the files in awk and append field 2 of each line to an array element whose index is the line's number within its file (FNR = the record number within the current file). To get rid of the first line of each file, just do not print the array element whose index is 1. The substr call is used to remove the " " character added at the start of each line by the concatenation (string[FNR] " " $2).

Code:
awk '
{
  string[FNR] = string[FNR] " " $2;
  if (maxFNR < FNR) maxFNR = FNR;
}
END {
  for (i = 2; i <= maxFNR; i++)
    print substr(string[i], 2);
}' *.file
 
Thanks a lot for the 100 % correct and working solution.
What a great site!
 