Comparing fields and deleting columns

Status
Not open for further replies.

Jalr2003

Technical User
May 15, 2003
US
I have the following file

A B C D E F G
- - - - - - -
1 2 3 4 5 6 7
1 3 3 4 8 6 8

I want to compare the two data lines (lines 3 and 4 of the file) column by column, and print the complete column whenever those two values differ.

In this case, the intended output would be

B E G
- - -
2 5 7
3 8 8

Is there a way to do that using awk or nawk?

The input file will always have 4 lines, but can have as many as 30 columns. I just want to be able to show the columns where the values changed.

Thanks

Jalr2003

 
You could adapt something like this:
Code:
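{
    for (x = 1; x <= NF; x++) {
        # Report a change once a previous value exists for this column.
        # Testing (x in xold) also works when the previous value was 0.
        if (NR > 2 && (x in xold) && $x != xold[x]) {
            printf("At %d -- %d : %s ! %s\n", NR, x, xold[x], $x)
        }
        # Remember only purely numeric fields, so the header and dashes are skipped.
        if ($x ~ /^[0-9]+$/) { xold[x] = $x }
    }
}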


Test:
File = {
A B C D E F G
- - - - - - -
1 2 3 4 5 6 7
1 3 3 4 8 6 8
5 2 1 5 9 2 3
}

Output:
At 4 -- 2 : 2 ! 3
At 4 -- 5 : 5 ! 8
At 4 -- 7 : 7 ! 8
At 5 -- 1 : 1 ! 5
At 5 -- 2 : 3 ! 2
At 5 -- 3 : 3 ! 1
At 5 -- 4 : 4 ! 5
At 5 -- 5 : 8 ! 9
At 5 -- 6 : 6 ! 2
At 5 -- 7 : 8 ! 3
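
To get from this change report to the column layout asked for in the question, one possible adaptation is to store every field and print only the differing columns at the end. This is a rough sketch, not a tested production script: the shell function name diff_columns is made up for illustration, and it assumes the two data rows are always lines 3 and 4.

```shell
# Sketch: print only the columns whose values differ between lines 3 and 4.
# diff_columns is a hypothetical name; input is read from standard input.
diff_columns() {
    awk '
    {
        # Keep every field of every line, indexed by (line, column).
        for (x = 1; x <= NF; x++) cell[NR, x] = $x
        if (NF > ncols) ncols = NF
    }
    END {
        # Rebuild each line from the columns where line 3 and line 4 differ.
        for (r = 1; r <= NR; r++) {
            line = ""; sep = ""
            for (x = 1; x <= ncols; x++)
                if (cell[3, x] != cell[4, x]) {
                    line = line sep cell[r, x]
                    sep = " "
                }
            print line
        }
    }'
}

# Sample run with the file from the question.
printf '%s\n' 'A B C D E F G' '- - - - - - -' '1 2 3 4 5 6 7' '1 3 3 4 8 6 8' | diff_columns
```

Because everything is held in memory until END, this only suits small files like the 4-line one described here.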
 
Or something like this?

BEGIN {
    getline                     # header line
    n1 = split($0, a1)
    getline                     # dashes line (not needed)
    getline                     # first data line
    n2 = split($0, a2)
    getline                     # second data line
    n3 = split($0, a3)
    # Print the four rows, keeping only the columns that differ.
    # "%s " keeps a % in the data from being read as a format.
    for(j=1;j<=n3;j++) if (a2[j]!=a3[j]) printf "%s ", a1[j]
    print ""
    for(j=1;j<=n3;j++) if (a2[j]!=a3[j]) printf "- "
    print ""
    for(j=1;j<=n3;j++) if (a2[j]!=a3[j]) printf "%s ", a2[j]
    print ""
    for(j=1;j<=n3;j++) if (a2[j]!=a3[j]) printf "%s ", a3[j]
    print ""
    exit
}

You might need to check whether n1, n2 and n3 are equal and format the output better.
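
A minimal sketch of that field-count check, assuming every line should have as many whitespace-separated fields as the header. The function name check_columns is hypothetical, and writing the messages to standard output is just a choice made for this example.

```shell
# Sketch: report lines whose field count differs from the header's.
# check_columns is a made-up name; it returns non-zero on a mismatch.
check_columns() {
    awk '
    NR == 1 { n = NF; next }        # remember the header field count
    NF != n {
        printf "line %d has %d fields, expected %d\n", NR, NF, n
        bad = 1
    }
    END { exit bad + 0 }'
}

# Sample run: the header has 8 fields but the data lines only 6.
printf '%s\n' 'A B C D E F G H' '- - - - - - - -' '1 2 3 4 6 6' '2 3 2 4 6 7' |
    check_columns || echo "field counts differ"
```

Running this before the comparison would catch a missing line or a short row before it produces misleading output.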

CaKiwi

"I love mankind, it's people I can't stand" - Linus Van Pelt
 
I tried CaKiwi's suggestion, but I got blank lines in the output.

Any ideas why this happens?

Thanks for your answers,

Jalr2003
 
Put some debug print statements after the splits

printf "%d", n1
for (j=1;j<=n1;j++) printf " %s", a1[j]
print ""

CaKiwi

 
Never mind,

I was missing the second line in my input file.

Thanks,

Jalr2003
 
One last question,

The fields in my input file do not all have the same length. What can I do to get better-organized output?

I have for input something like this:

SYS_ID LIMIT CODE PART_NUM
------ ---- ---- --------
16447 15 1NB 4A0001
19631 14 1NB 4A0001

I changed one of your lines from:

for(j=1;j<=n3;j++) if (a2[j]!=a3[j]) printf "- "

to:

for(j=1;j<=n3;j++) if (a2[j]!=a3[j]) printf a2[j] " "

I get the desired output, but it is disorganized.

Thanks again,

Jalr2003

 
If the fields should all be padded to the same width (8 characters here), use something like

for(j=1;j<=n3;j++) if (a2[j]!=a3[j]) printf "%s%s", a2[j], substr("        ",1,8-length(a2[j]))

where the quoted string holds at least 8 spaces. If they are not the same width, create an array, len say, containing the width of each column, and replace the above substr by

substr("        ",1,len[j]-length(a2[j]))

using a string of spaces at least as long as the widest column.
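
The same alignment can also be had from printf's own field width instead of padding by hand. The sketch below builds a "%-Ns" format from a len array; the function name aligned_diff and the widths 8,6,6,10 are assumptions picked for the SYS_ID sample above, not values from the thread.

```shell
# Sketch: left-justify each kept value in a cell of len[j] characters.
# aligned_diff and the widths in len are made up for this sample.
aligned_diff() {
    awk '
    BEGIN { split("8,6,6,10", len, ",") }
    {
        for (j = 1; j <= NF; j++) cell[NR, j] = $j
        if (NF > n) n = NF
    }
    END {
        for (r = 1; r <= NR; r++) {
            line = ""
            for (j = 1; j <= n; j++)
                if (cell[3, j] != cell[4, j])
                    line = line sprintf("%-" len[j] "s", cell[r, j])
            sub(/ +$/, "", line)    # drop the padding after the last column
            print line
        }
    }'
}

# Sample run with the SYS_ID input shown earlier in the thread.
printf '%s\n' 'SYS_ID LIMIT CODE PART_NUM' '------ ---- ---- --------' \
    '16447 15 1NB 4A0001' '19631 14 1NB 4A0001' | aligned_diff
```

Building the format string by concatenation avoids relying on the "*" width specifier, which not every awk may support.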


CaKiwi

 
Thanks for your help,

I am just starting to learn awk, and I am not sure how to create the array in this case.

Could you help me with that?

Thanks,

Jalr2003
 
The simplest way would be

len[1]=6;len[2]=6;len[3]=6;....

or if you prefer

split("6,6,6,4,8,6,8",len,",")
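
If typing the widths by hand gets tedious, they could also be measured from the input. This is only a sketch under the assumption that the fields are whitespace-separated and never blank; the function name auto_widths is hypothetical. It prints the longest value seen in each column as a comma-separated list, in the same shape as the string passed to split above.

```shell
# Sketch: measure the widest value in each column.
# auto_widths is a made-up name; input is read from standard input.
auto_widths() {
    awk '
    {
        for (j = 1; j <= NF; j++)
            if (length($j) > len[j]) len[j] = length($j)
        if (NF > n) n = NF
    }
    END {
        # Emit the widths as a comma-separated list.
        for (j = 1; j <= n; j++) printf "%s%d", (j > 1 ? "," : ""), len[j]
        print ""
    }'
}

# Sample run with the SYS_ID input shown earlier in the thread.
printf '%s\n' 'SYS_ID LIMIT CODE PART_NUM' '------ ---- ---- --------' \
    '16447 15 1NB 4A0001' '19631 14 1NB 4A0001' | auto_widths
```

Add one or two to each width if you want a gap between columns.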

CaKiwi

 
Thanks CaKiwi

It certainly looks better!

However, I just noticed something.

If I have an input like this

A B C D E F G H
- - - - - - - -
1 2 3 4 6 6
2 3 2 4 6 7

my output is

A B C F
- - - -
1 2 3 6
2 3 2 7

As you can see, column B gets the values of C, because there are fewer values in rows 2 and 3.

Is there a way to eliminate columns B and F, before doing what you suggested?

Thanks again,

Jalr2003
 
If your data can have blank fields, you will need to use substr instead of split to break the lines up.

BEGIN {
    # Column widths; adjust these to the fixed-width layout of your input.
    split("4,4,4,4,4,4,4",len,",")
    getline                     # header line
    n1 = split($0,a1)
    getline                     # dashes line (not needed)
    getline                     # first data line
    # Track the start position so that unequal widths also work.
    pos = 1
    for (j=1;j<=n1;j++) { a2[j] = substr($0,pos,len[j]); pos += len[j] }
    getline                     # second data line
    pos = 1
    for (j=1;j<=n1;j++) { a3[j] = substr($0,pos,len[j]); pos += len[j] }
    for(j=1;j<=n1;j++) if (a2[j]!=a3[j]) printf "%s ", a1[j]
    print ""
    for(j=1;j<=n1;j++) if (a2[j]!=a3[j]) printf "- "
    print ""
    for(j=1;j<=n1;j++) if (a2[j]!=a3[j]) printf "%s ", a2[j]
    print ""
    for(j=1;j<=n1;j++) if (a2[j]!=a3[j]) printf "%s ", a3[j]
    print ""
    exit
}
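
For a quick check of the fixed-width idea on a file with blank fields, here is a small self-contained sketch. The function name fixed_width_diff, the widths argument and the 4-column sample are all invented for illustration; it assumes every line, including the header, follows the same fixed-width layout.

```shell
# Sketch: slice each line by fixed widths so blank fields keep their column.
# fixed_width_diff is a made-up name; $1 is a comma-separated width list.
fixed_width_diff() {
    awk -v widths="$1" '
    BEGIN { n = split(widths, len, ",") }
    {
        pos = 1
        for (j = 1; j <= n; j++) {
            v = substr($0, pos, len[j])
            gsub(/^ +| +$/, "", v)      # trim the padding around the value
            cell[NR, j] = v
            pos += len[j]
        }
    }
    END {
        # Keep the columns where lines 3 and 4 differ (string comparison).
        for (r = 1; r <= NR; r++) {
            line = ""; sep = ""
            for (j = 1; j <= n; j++)
                if (cell[3, j] != cell[4, j]) { line = line sep cell[r, j]; sep = " " }
            print line
        }
    }'
}

# Sample run: column B is blank in both data lines, columns A and D differ.
printf '%s\n' 'A  B  C  D' '-  -  -  -' '1     3  4' '2     3  5' | fixed_width_diff 3,3,3,3
```

With whitespace splitting, the blank B field would have shifted C and D one column to the left; slicing by position keeps them under the right header.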

CaKiwi

 