awk command to delete specific columns

Status
Not open for further replies.

Baika

Technical User
Dec 19, 2013
Hi All:
I have a table that has more than 30,000 columns. Below is a small snapshot of the table. I am looking for a script to delete columns that contain only the number "1" from top to bottom.

ST L1 L2 L3 L4 L5
ST2 1 1 1 1 1
ST2 1 0 1 0 1
ST3 1 0 1 0 1
ST3 0 0 1 1 1
ST4 1 0 1 0 1
ST5 1 0 1 0 1
ST6 1 0 1 0 1
ST7 0 0 1 1 1
ST8 0 0 1 0 1
ST9 1 0 1 0 1

Basically, I am looking for an output like this:
ST L1 L2 L4
ST2 1 1 1
ST2 1 0 0
ST3 1 0 0
ST3 0 0 1
ST4 1 0 0
ST5 1 0 0
ST6 1 0 0
ST7 0 0 1
ST8 0 0 0
ST9 1 0 0

Any help would be greatly appreciated.

Thanks in advance,

baika
 
What have you tried so far, and where in your code are you stuck?

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
The following awk script should do it:

Code:
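BEGIN{
 COLS=6                      # total number of columns, including the ID column
 for(i=2; i<=COLS; ++i) {
  j=(i-1)
  P[j]=1                     # 1 = column has held only 1s so far
  }
}
{
 ID[NR]=$1                   # first field is the row label
 if(NR>1) {
  # data row: store each value and update the column state
  for(i=2; i<=COLS; ++i) {
   j=(i-1)
   C[j]=$i
   if(P[j]==1) {
    if(C[j]==0) P[j]=0       # a 0 was seen, so the column is kept
    }
   ROW[NR,j]=$i
   }
  }
 else {
  # header row: store the column names only
  for(i=2; i<=COLS; ++i) {
   j=(i-1)
   ROW[NR,j]=$i
   }
  }
}
END {
 for(z=1; z<=NR; ++z) {
  printf "%s", ID[z]         # explicit format, so a "%" in the data is printed as-is
  for(i=2; i<=COLS; ++i) {
   j=(i-1);
   if(P[j]==0) printf " %s", ROW[z,j]
   }
  printf "\n"
  }
}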

To explain what I have tried to do here:
The BEGIN part just sets up:
The variable "COLS", which holds the number of columns in the input data (6 in the sample). Change it to match your real table, or if you add more data in the future.
The array "P", which stores the state of each data column seen so far: 1 while every value has been 1, 0 once a 0 has appeared.
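
With more than 30,000 columns, counting them by hand to set COLS is not practical. One way to get the number is to read it off the header row. This is only a sketch: it assumes the table is whitespace-separated and that its first line is the header, and the function name count_cols is made up for illustration.

```shell
# count_cols FILE: print the number of whitespace-separated fields on line 1.
# Hypothetical helper name; assumes the first line of FILE is the header row.
count_cols() {
  awk 'NR == 1 { print NF; exit }' "$1"
}
```

For the sample table this prints 6, the value hard-coded in the BEGIN part.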

The Main part:
Saves every row and column, including the header row, in a pseudo-multidimensional array "ROW". It is assumed that the data always has a header.
Then, for each line that contains data, it loops through the columns in the row. If the state of a column is still 1, it checks whether the current value is 0. If it is, the column is not all 1s, so we set the P entry for that column to 0 and move on.

Then at the END part:
For each row of input data we saved, we check the P array column by column. If the entry is 0, the column contained at least one 0, so we print its value to the screen. Columns whose entry is still 1 contained only 1s and are skipped.
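
The script above keeps the whole table in memory until END, which may get heavy with 30,000 columns. A possible alternative is to read the file twice: the first pass only records which columns to keep, and the second pass prints them. The following is a sketch under a few assumptions rather than a drop-in replacement: the input is a regular file (not a pipe), line 1 is the header, column 1 is the row label, and fields are whitespace-separated. It also takes the column count from NF instead of COLS, and it keeps a column when any value differs from 1, which gives the same result as testing for 0 when the data is only 0s and 1s. The function name drop_all_ones is made up for illustration.

```shell
# drop_all_ones FILE: print FILE without the data columns that are 1 on every row.
# Hypothetical helper name; assumes a header on line 1 and labels in column 1.
drop_all_ones() {
  awk '
    # Pass 1 (first read of the file): mark every data column that holds
    # something other than 1 on at least one data row. The header is skipped.
    NR == FNR {
      if (FNR > 1)
        for (i = 2; i <= NF; i++)
          if ($i != 1) keep[i] = 1
      next
    }
    # Pass 2 (second read): print the label followed by the marked columns.
    {
      line = $1
      for (i = 2; i <= NF; i++)
        if (i in keep) line = line " " $i
      print line
    }
  ' "$1" "$1"
}
```

Called as drop_all_ones table.txt, where table.txt stands for your own file name, it should print the ST, L1, L2 and L4 columns for the sample table in the question.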
 