First field repeat in all records

azekry · Jul 30, 2015

Hi,
I have a file that looks like this:

198606010000 1139.0000 403.3000
test1 1514.0000 487.7000
1814.0000 540.0000
2114.0000 611.5000
198606270000 717.0000 0.0000
test2 1654.0000 287.0000
2727.0000 558.0000
3012.0000 610.0000
4209.0000 832.0000
4365.0000 858.0000
4683.0000 905.0000
4868.0000 941.0000
I need to repeat the number in column 1 (198606010000) as the first field of every following record, overwriting the text test1 where it appears, until I reach the next such number (198606270000), and so on, so that the file looks like this:
198606010000 1139.0000 403.3000
198606010000 1514.0000 487.7000
198606010000 1814.0000 540.0000
198606010000 2114.0000 611.5000
198606270000 717.0000 0.0000
198606270000 1654.0000 287.0000
198606270000 2727.0000 558.0000
198606270000 3012.0000 610.0000

I can probably do this in Excel, but the file is huge, so I was wondering whether this is possible in AWK.

mikrom · Jul 30, 2015

I tried this:

azekry1.awk

Code:

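# Run:
# awk -f azekry1.awk azekry1.txt
{
  if ($1 ~ /1986/) {
    # line starts with the long number: remember it
    col1 = $1
  } else if ($1 ~ /test/) {
    # line starts with a text label: overwrite the label
    $1 = col1
  } else {
    # line has no first column: prepend the remembered number
    $0 = col1 " " $0
  }
  print $0
}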

I have the input file
azekry1.txt

Code:

198606010000 1139.0000 403.3000 
test1 1514.0000 487.7000 
1814.0000 540.0000 
2114.0000 611.5000 
198606270000 717.0000 0.0000 
test2 1654.0000 287.0000 
2727.0000 558.0000 
3012.0000 610.0000 
4209.0000 832.0000 
4365.0000 858.0000 
4683.0000 905.0000 
4868.0000 941.0000

When I run it, I get this output:

Code:

$ awk -f azekry1.awk azekry1.tx
198606010000 1139.0000 403.3000
198606010000 1514.0000 487.7000
198606010000 1814.0000 540.0000
198606010000 2114.0000 611.5000
198606270000 717.0000 0.0000
198606270000 1654.0000 287.0000
198606270000 2727.0000 558.0000
198606270000 3012.0000 610.0000
198606270000 4209.0000 832.0000
198606270000 4365.0000 858.0000
198606270000 4683.0000 905.0000
198606270000 4868.0000 941.0000

Is that what you wanted?

mikrom · Jul 30, 2015

Hi azekry,
You seem to be new to this forum, so next time please post your example data between the tags
[pre]...[/pre]

azekry · Jul 31, 2015

Yes, thanks a lot, that is what I want. However, the first column does not always start with 1986. Can I change the pattern to something like /[0-9]/ to match any number in the first column?

mikrom · Jul 31, 2015

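You need to know what distinguishes the first column of the lines you want to match from the first column of the other lines. A bare /[0-9]/ would not be enough, because values such as test1 and 1514.0000 contain digits too.
For example, if you know that it starts with at least 8 digits (maybe a date in yyyymmdd format), you could try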

Code:

# Run:
# awk -f azekry1.awk azekry1.txt
{
  if ($1 ~ /[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/) {
    col1 = $1
  } else if ($1 ~ /test/) {
    $1=col1
  } else {
    $0 = col1 " " $0
  }
  print $0  
}
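
A quick way to compare candidate patterns against the sample data, as a minimal sketch. The helper name classify_first_field and the labels loose/strict are made up for this illustration. It also assumes you want the 8 digits anchored to the start of the field with ^, which the script above does not do (there the run of digits may appear anywhere in the first field).

```shell
# classify_first_field is a hypothetical helper name, used only in this sketch.
# For every input line it reports whether the first field matches
#   loose : /[0-9]/              (any digit anywhere in the field)
#   strict: eight digits anchored at the start of the field (assumed rule)
classify_first_field() {
  awk '{
    loose  = ($1 ~ /[0-9]/) ? "yes" : "no"
    strict = ($1 ~ /^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/) ? "yes" : "no"
    print $1, "loose=" loose, "strict=" strict
  }' "$@"
}

# Usage with three lines taken from the sample file:
printf '%s\n' \
  '198606010000 1139.0000 403.3000' \
  'test1 1514.0000 487.7000' \
  '1814.0000 540.0000' | classify_first_field
```

All three lines are reported as loose=yes, while only the line starting with 198606010000 is reported as strict=yes, which is why the pattern has to be more specific than a single digit.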

mikrom · Jul 31, 2015

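If I use gawk (I have version 3.1.7) with the --posix switch, I can write the regex more compactly with an interval expression, here as a run of 12 digits, which is the length of the numbers in the sample: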

Code:

# Run:
# gawk --posix -f azekry1.awk azekry1.txt
{
  if ($1 ~ /[0-9]{12}/) {
    col1 = $1
  } else if ($1 ~ /test/) {
    $1=col1
  } else {
    $0 = col1 " " $0
  }
  print $0  
}
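
If the awk at hand does not accept interval expressions such as {12}, the same idea can be expressed without them. The following is only a sketch under two assumptions: the marker in the first field is exactly 12 characters long, and it consists of digits only. That is slightly stricter than /[0-9]{12}/, which accepts any first field containing a run of 12 digits. The function name fill_first_field_portable is hypothetical.

```shell
# fill_first_field_portable is a hypothetical helper name for this sketch.
# A first field counts as a marker when it is exactly 12 characters long
# and contains no character other than a digit (assumed rule).
fill_first_field_portable() {
  awk '{
    if (length($1) == 12 && $1 !~ /[^0-9]/) {
      col1 = $1               # marker line: remember the number
    } else if ($1 ~ /test/) {
      $1 = col1               # label line: overwrite the label
    } else {
      $0 = col1 " " $0        # short line: prepend the number
    }
    print
  }' "$@"
}

# Usage:
# fill_first_field_portable azekry1.txt > azekry1_filled.txt
```

The output file name azekry1_filled.txt is just an example. Because awk processes one record at a time, this should also be workable for a large input file.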

azekry · Jul 31, 2015

That worked for me. Thank you so much for your help.

azekry · Aug 2, 2015

I need to ask another question, please. I have another file that looks like this, with variable text in the fourth column:

Code:

200004080000            5517.9888      941.1580       car_2_f
                        5544.0151      945.1580   
                        5569.9360      949.1580   
196511290000            328.0000       0.0000         TZTho           
                        4917.8857      1296.0000  
                        14100.9385     2650.0000

I need to format it like this:

Code:

200004080000            5517.9888      941.1580       car_2_f
200004080000            5544.0151      945.1580       car_2_f
200004080000            5569.9360      949.1580       car_2_f
196511290000            328.0000       0.0000         TZTho           
196511290000            4917.8857      1296.0000      TZTho           
196511290000            14100.9385     2650.0000      TZTho

mikrom · Aug 2, 2015

I tried this

Code:

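# Run:
# awk -f azekry2.awk azekry2.txt
{
  if ($1 ~ /[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/) {
    # full line: remember the first and the fourth column
    col1=$1
    col4=$4
  } else {
    # short line: shift the two values right, then fill in columns 1 and 4
    $3=$2
    $2=$1
    $1=col1
    $4=col4
  }
  printf "%12d\t%10.4f\t%10.4f\t%s\n", $1, $2, $3, $4
}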

and got this output

Code:

$ awk -f azekry2.awk azekry2.txt
200004080000     5517.9888        941.1580      car_2_f
200004080000     5544.0151        945.1580      car_2_f
200004080000     5569.9360        949.1580      car_2_f
196511290000      328.0000          0.0000      TZTho
196511290000     4917.8857       1296.0000      TZTho
196511290000    14100.9385       2650.0000      TZTho

If you are sure that the first line of each group always has exactly 4 columns, you could replace this condition in the if statement
[pre]$1 ~ /[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/[/pre]
with this condition
[pre](NF==4)[/pre]
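
To see why the field count can tell the two kinds of lines apart, it may help to print NF for each line. With awk's default field splitting, leading blanks are ignored, so a line that has only the two numbers yields two fields and $1 is the first number. A minimal sketch follows; the helper name show_field_counts is made up for this illustration.

```shell
# show_field_counts is a hypothetical helper name, used only in this sketch.
# It prints the number of fields and the first field of every line.
show_field_counts() {
  awk '{ print NF ": " $1 }' "$@"
}

# Usage with two lines from the second sample file:
printf '%s\n' \
  '200004080000            5517.9888      941.1580       car_2_f' \
  '                        5544.0151      945.1580   ' | show_field_counts
```

The full line is reported with 4 fields and the indented line with 2, so NF==4 selects exactly the lines that carry the number and the text to be repeated.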

azekry · Aug 3, 2015

Yes, there are 4 columns for sure. But I am a little confused. I would really appreciate it if you could post the full script when you have time.

mikrom · Aug 3, 2015

Code:

# Run:
# awk -f azekry2.awk azekry2.txt
{
  if (NF==4) {
    col1=$1
    col4=$4
  } else {
    $3=$2
    $2=$1
    $1=col1
    $4=col4
  }
  printf "%12d\t%10.4f\t%10.4f\t%s\n", $1, $2, $3, $4  
}
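
The script above prints right-aligned, tab-separated columns. If the left-aligned layout from the request is wanted instead, one possible variant is sketched below. The column widths 24, 15 and 15 are assumptions read off the sample in the question, and the function name fill_columns is hypothetical. The fields are printed with %s, so the values are copied exactly as they appear in the input (no rounding to four decimals), which also sidesteps any question of how a given awk handles %d for a 12-digit number.

```shell
# fill_columns is a hypothetical helper name, used only in this sketch.
# Widths 24/15/15 are assumed from the sample layout in the question.
fill_columns() {
  awk '{
    if (NF == 4) {
      col1 = $1               # full line: remember number and text
      col4 = $4
    } else {
      $3 = $2                 # short line: shift the two values right
      $2 = $1
      $1 = col1               # then fill in the remembered columns
      $4 = col4
    }
    printf "%-24s%-15s%-15s%s\n", $1, $2, $3, $4
  }' "$@"
}

# Usage:
# fill_columns azekry2.txt
```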

azekry · Aug 3, 2015

Thanks so much for all your great help
