
First field repeat in all records

Status
Not open for further replies.

azekry

Technical User
Jul 28, 2015
8
EG
Hi,
I have a file that looks like this:

198606010000 1139.0000 403.3000
test1 1514.0000 487.7000
1814.0000 540.0000
2114.0000 611.5000
198606270000 717.0000 0.0000
test2 1654.0000 287.0000
2727.0000 558.0000
3012.0000 610.0000
4209.0000 832.0000
4365.0000 858.0000
4683.0000 905.0000
4868.0000 941.0000
I need to repeat the number in column 1 (198606010000) as the first field of every following record, overwriting the text test1 where it is present, until I reach the next number (198606270000), and so on, so that the file looks like this:
198606010000 1139.0000 403.3000
198606010000 1514.0000 487.7000
198606010000 1814.0000 540.0000
198606010000 2114.0000 611.5000
198606270000 717.0000 0.0000
198606270000 1654.0000 287.0000
198606270000 2727.0000 558.0000
198606270000 3012.0000 610.0000

I can probably do this in Excel, but the file is huge, so I was wondering if this is possible in AWK.

 
I tried this:

azekry1.awk
Code:
# Run:
# awk -f azekry1.awk azekry1.txt
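{
  if ($1 ~ /1986/) {
    col1 = $1             # key line: remember the first field
  } else if ($1 ~ /test/) {
    $1 = col1             # label line: overwrite the label with the key
  } else {
    $0 = col1 " " $0      # two-field line: prepend the key
  }
  print $0
}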

I have the input file
azekry1.txt
Code:
198606010000 1139.0000 403.3000 
test1 1514.0000 487.7000 
1814.0000 540.0000 
2114.0000 611.5000 
198606270000 717.0000 0.0000 
test2 1654.0000 287.0000 
2727.0000 558.0000 
3012.0000 610.0000 
4209.0000 832.0000 
4365.0000 858.0000 
4683.0000 905.0000 
4868.0000 941.0000

When I run it, I get the output
Code:
$ awk -f azekry1.awk azekry1.txt
198606010000 1139.0000 403.3000
198606010000 1514.0000 487.7000
198606010000 1814.0000 540.0000
198606010000 2114.0000 611.5000
198606270000 717.0000 0.0000
198606270000 1654.0000 287.0000
198606270000 2727.0000 558.0000
198606270000 3012.0000 610.0000
198606270000 4209.0000 832.0000
198606270000 4365.0000 858.0000
198606270000 4683.0000 905.0000
198606270000 4868.0000 941.0000

Is that what you wanted?
 
Hi azekry,
You seem to be new to this forum, so next time please post your example data between the code tags:
[pre][code]...[/code][/pre]
 
Yes, thanks a lot, that is what I want. However, the first column does not always start with /1986/. Can I change it to something like /[0-9]/ to match any numeric value in the first column?
 
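You need to know what distinguishes the first column of the key lines from the first column of the other lines. A bare /[0-9]/ will not work, because it matches any field that contains a digit, including test1 and 1514.0000.
For example, if you know that the key contains at least 8 consecutive digits (maybe a date in yyyymmdd format), you could try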
Code:
# Run:
# awk -f azekry1.awk azekry1.txt
{
  if ($1 ~ /[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/) {
    col1 = $1
  } else if ($1 ~ /test/) {
    $1=col1
  } else {
    $0 = col1 " " $0
  }
  print $0  
}
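Another way to avoid depending on a particular prefix such as 1986, or on the label text test, might be to anchor the pattern so that it only matches a first field made up entirely of digits. This is only a sketch: it assumes that key values never contain a decimal point, that data values always do, and that label lines always have exactly 3 fields. The function name fill_key is made up for this illustration.

```shell
# fill_key is a hypothetical helper name, not something from the original scripts.
# Assumption: keys are digits only; data values always contain a decimal point.
fill_key() {
  awk '
    $1 ~ /^[0-9]+$/ { key = $1; print; next }   # key line: first field is digits only
    NF == 3         { $1 = key; print; next }   # label line (e.g. test1): overwrite the label
                    { print key " " $0 }        # two-field line: prepend the remembered key
  ' "$@"
}
```

It can be called as fill_key azekry1.txt, or fed from a pipe when no file name is given.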
 
With gawk (I have version 3.1.7) and the --posix switch, I could write the regex above more compactly using an interval expression (here matching 12 digits, the full length of the key):
Code:
# Run:
# gawk --posix -f azekry1.awk azekry1.txt
{
  if ($1 ~ /[0-9]{12}/) {
    col1 = $1
  } else if ($1 ~ /test/) {
    $1=col1
  } else {
    $0 = col1 " " $0
  }
  print $0  
}
 
That worked for me. Thank you so much for your help.
 
I need to ask another question, please. I have another file that looks like this, with variable text in the fourth column:
Code:
200004080000            5517.9888      941.1580       car_2_f
                        5544.0151      945.1580   
                        5569.9360      949.1580   
196511290000            328.0000       0.0000         TZTho           
                        4917.8857      1296.0000  
                        14100.9385     2650.0000

I need to format it like this:
Code:
200004080000            5517.9888      941.1580       car_2_f
200004080000            5544.0151      945.1580       car_2_f
200004080000            5569.9360      949.1580       car_2_f
196511290000            328.0000       0.0000         TZTho           
196511290000            4917.8857      1296.0000      TZTho           
196511290000            14100.9385     2650.0000      TZTho
 
I tried this
Code:
# Run:
# awk -f azekry2.awk azekry2.txt
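{
  if ($1 ~ /[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/) {
    # full line: remember the key and the text column
    col1 = $1
    col4 = $4
  } else {
    # short line: shift the two values right, then fill in key and text
    $3 = $2
    $2 = $1
    $1 = col1
    $4 = col4
  }
  printf "%12d\t%10.4f\t%10.4f\t%s\n", $1, $2, $3, $4
}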
and got this output
Code:
$ awk -f azekry2.awk azekry2.txt
200004080000     5517.9888        941.1580      car_2_f
200004080000     5544.0151        945.1580      car_2_f
200004080000     5569.9360        949.1580      car_2_f
196511290000      328.0000          0.0000      TZTho
196511290000     4917.8857       1296.0000      TZTho
196511290000    14100.9385       2650.0000      TZTho

If you are sure that the first line of each group always has exactly 4 columns, you could replace this condition in the if-statement
[pre]$1 ~ /[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/[/pre]
with this one
[pre](NF==4)[/pre]
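The NF test relies on awk's default field splitting, which skips leading blanks, so the indented continuation lines should count as only 2 fields. A quick way to check this on your own data before relying on it could look like the sketch below (show_nf is a hypothetical helper name used only here):

```shell
# show_nf is a made-up name for this illustration.
show_nf() {
  # Print the number of fields in front of every input line.
  awk '{ print NF ": " $0 }' "$@"
}
```

Running show_nf azekry2.txt should then show 4 in front of every line that carries the key and the text, and 2 in front of the others. If any other count shows up, the NF==4 condition would need to be reconsidered.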
 
Yes, there are 4 columns for sure, but I am a little confused. I would really appreciate it if you could post the full script when you have time.
 
Code:
# Run:
# awk -f azekry2.awk azekry2.txt
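{
  if (NF == 4) {
    # full line: remember the key and the text column
    col1 = $1
    col4 = $4
  } else {
    # short line: shift the two values right, then fill in key and text
    $3 = $2
    $2 = $1
    $1 = col1
    $4 = col4
  }
  printf "%12d\t%10.4f\t%10.4f\t%s\n", $1, $2, $3, $4
}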
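A possible variant, in case you would rather not have the numbers reformatted: %10.4f rounds every value to 4 decimals, and some awk implementations may not print integers as large as 200004080000 correctly with %d. The sketch below treats all fields as text and only pads them. It assumes every non-empty line has either 4 or 2 fields; the name fill_key_and_label is made up for this illustration.

```shell
# fill_key_and_label is a hypothetical helper name.
# Assumption: every non-empty line has either 4 fields (full) or 2 fields (short).
fill_key_and_label() {
  awk '
    NF == 0 { next }                                      # skip empty lines
    NF == 4 { key = $1; label = $4 }                      # full line: remember key and text
    NF == 2 { $3 = $2; $2 = $1; $1 = key; $4 = label }    # short line: shift values, fill in the rest
    { printf "%s\t%10s\t%10s\t%s\n", $1, $2, $3, $4 }     # print fields as text, values right-aligned
  ' "$@"
}
```

It would be called as fill_key_and_label azekry2.txt. The values come out exactly as they appear in the input, only padded to a width of 10.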
 
Thanks so much for all your great help
 