Use one awk '{... }' instead of three in a bash script

Status
Not open for further replies.

FedoEx

Technical User
Oct 7, 2008
49
US
I have a data file that looks like this:

Code:
#S 1 …............
#D Tue ….......
…....................
…....................
#S 350  a2scan  oh -4.42031 -3.62031  or 0.298354 0.381046  28 -600000
#D Tue Feb 17 05:02:03 2009                                           
#P7 S4L      S4R      S5L      S5R      S6_GAP   S6_CEN                              
#    1 1 12 12 4.99987 2.89827e-05                                                   
#W   0.413280                                                              
#A  0                                                                        
#N 16                                                                                
#L 
-4.4201082 0.29839999 0 6.05517e-05 0.169237     200205 3.253e+06 2.43859e+06 0 1 0 0 1 29.7685 600000 307
-4.3921682 0.30119999 0 5.66641e-05 0.16998     200236 3.24289e+06 2.43264e+06 0 1 0 0 1 29.6072 600000 326
-3.6494722 0.37829999 0 -6.46472e-05 0.190438     201044 3.22599e+06 2.42768e+06 0 1 0 0 1 29.6273 600000 262
-3.6202622 0.38109999 0 -6.95709e-05 0.191181     201075 3.23008e+06 2.43027e+06 0 1 0 0 1 29.6994 600000 288

Peak at -4.13461 is 442   COM at -4.02685   FWHM is 0 at -4.13461
Sum = 10097  Ave.Mon./Time = 29.7028  Ave.Temp. = 0C             
                                    
#S 351  a2scan  oh -4.42031 -3.62031  or 0.298654 0.381346  28 -600000
#D Tue Feb 17 05:22:03 2009 
…...........
…...........

I've been using the following code to extract the numerical data for, let's say, entry 350 from the file called data:

Code:
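#!/bin/bash
SN1="#S $2 "
# keep only the block for the requested scan, then drop header, text and empty lines
awk "/$SN1/,/Peak/{print}" "$1" | awk '!/[f-z]|#|^$/' >/tmp/data.tmp
# print column 1, the ratio of the last two columns, and that ratio times sqrt(1/last + 1/second-to-last)
awk '{printf("%-2.5e %-1.5e %1.5e \n", $1 ,$NF/$(NF-1), $NF/$(NF-1)*sqrt(1/($NF) + 1/($(NF-1))) )}' /tmp/data.tmp >/tmp/datama
cat /tmp/datama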

I put it in my ~/bin and run it like this:

Code:
$ executable.sh data 350
and it prints the desired formatted columns of data for the given entry.

Is there a way to merge the three awk '{...}' blocks into one, without the need to write those tmp files to disk?
I have this kind of problem in a number of my scripts, since I am not very fluent in awk and I usually write them in a hurry to get things done.
I am trying to put some more time into learning awk, and it sounds like learning Perl will be quite beneficial for my kind of work in the long run.
 
awk '/#S '$2'/,/Peak/{if($0!~/[f-z]|#|^$/)printf "%-2.5e %-1.5e %1.5e \n",$1,$NF/$(NF-1),$NF/$(NF-1)*sqrt(1/($NF)+1/$(NF-1))}' $1

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
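
A variation on the one-liner above, as a minimal sketch only: the scan number is handed to awk with -v instead of being spliced in between quotes, and the pattern keeps the trailing space from the original script so that scan 35 does not also match scan 350. The sample file and the function name extract are made up for the demonstration, and the output is simplified to two columns.

```shell
#!/bin/bash
# Hypothetical sample file with two scans, made up for this sketch.
data=$(mktemp)
cat > "$data" <<'EOF'
#S 35  a2scan
1.0 10 20
Peak at 1.0 is 20
#S 350  a2scan
-4.42 100 400
-3.62 200 800
Peak at -4.13 is 442
EOF

# extract FILE SCAN: the scan number arrives in the awk variable "scan",
# so the awk program can stay inside one pair of single quotes.
extract() {
  awk -v scan="$2" '
    $0 ~ ("^#S " scan " "), /Peak/ {
      # skip header lines, text lines and empty lines, as in the thread
      if ($0 !~ /[f-z]|#|^$/)
        printf "%s %g\n", $1, $NF / $(NF-1)
    }' "$1"
}

out=$(extract "$data" 350)
echo "$out"
rm -f "$data"
```

With this sample only the two numeric rows of scan 350 are printed, each followed by the ratio of its last two columns.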
 
Hi

There is no need for [tt]bash[/tt]. Make it an [tt]awk[/tt] script :
Code:
#!/usr/bin/mawk -f

BEGIN {
  ARGC--
}

$0~("#S " ARGV[2] " "),/Peak/ {
  if (!/[f-z]|#|^$/)
    printf("%-2.5e %-1.5e %1.5e \n", $1 ,$NF/$(NF-1), $NF/$(NF-1)*sqrt(1/($NF) + 1/($(NF-1))) )
}
Tested with [tt]gawk[/tt] and [tt]mawk[/tt].

Feherke.
 
Thanks guys.
I've never looked into [tt]mawk[/tt] or [tt]gawk[/tt].
Do you know if [tt]mawk[/tt] can call external [tt]bash[/tt] commands?
For example, in a [tt]bash[/tt] script I can call [tt]gnuplot[/tt] to plot that data:

Code:
#!/bin/bash
awk  ' /#S  '$2'  / {…..} ' $1 >/tmp/data.tmp

gnuplot -persist << EOF
plot  '/tmp/data.tmp'
EOF

Maybe you can help me merge another two [tt]awk ' {...} '[/tt] blocks into the code above.
The number of lines in the generated data is always odd. Ignore the example above, where I put four lines.

In my original code, in my last
Code:
awk '{printf ("%-2.5e..... ",  $1,  $NF/$(NF-1)   )} '  data.tmp
I need a fourth column, which is the index of the line, starting from zero. So I just put [tt]FNR-1[/tt] inside the
printf to get output like this:

0 -2.37430 2.39556e-02 7.60878e-04
1 -2.36410 8.77916e-03 4.56558e-04
2 -2.35434 1.92326e-03 2.13901e-04
3 -2.34406 1.26115e-03 1.73342e-04
4 -2.33426 7.59121e-04 1.34246e-04


However, if I add it to the code with a single [tt]awk ' {...} '[/tt], it prints the
index of the line with respect to the original data file. Do you know a fix for that?
Another [tt]awk ' {...} '[/tt] block that I have extracts the middle line from that data.
So, for example, for the 5-line sample above

Code:
awk '{
                 for (x = 1; x <= NF; x++)
                  vector[x, NR] = $x
                  }
                  END {
                       
                       printf("%s \n", vector[2,(NR-1)/2+1 ])

                       }' sample.dat

will return

-2.35434
and I assign this to a bash variable so I can use it later in another [tt]awk ' {...} '[/tt] block to do
some additional arithmetic operations.
Can you give me a few hints on [tt]awk[/tt] variables and how to manipulate them?
Like I said, I am currently studying the "GNU Awk User's Guide" and will probably get there at some point.
Thanks a lot for your great help.
 
Hi

FedoEx said:
I've never looked into mawk or gawk.
They are just different implementations. [tt]gawk[/tt] adds quite a few extensions of its own, while [tt]mawk[/tt] stays close to the standard.
FedoEx said:
Do you know if mawk can call external bash commands?
As far as I know, all [tt]awk[/tt] implementations can.
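
A minimal sketch of the two usual ways, assuming any POSIX-style awk: piping print output into a command, and running a command with system(). Here sort -n and true are only stand-ins chosen for the demonstration; the same pipe syntax should also be able to feed a plotting program, which is not shown.

```shell
#!/bin/bash
# Way 1: every print goes into one "sort -n" process started by awk.
sorted=$(printf '3 c\n1 a\n2 b\n' | awk '
  { print $1, $2 | "sort -n" }
  END { close("sort -n") }      # flush the pipe and wait for sort to finish
')
echo "$sorted"

# Way 2: system() runs a command and returns its exit status.
status=$(awk 'BEGIN { rc = system("true"); print rc }')
echo "exit status: $status"
```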
FedoEx said:
However, if I add it to the code with a single awk ' {...} ', it prints the
index of the line with respect to the original data file.
Then simply output a counter :
Code:
printf("%d %-2.5e %-1.5e %1.5e \n", i++, $1 ,$NF/$(NF-1), $NF/$(NF-1)*sqrt(1/($NF) + 1/($(NF-1))) )
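
To see the difference between such a counter and FNR, here is a small sketch on made-up input (two tiny scans, not the real data file). The counter i only advances on lines that survive the filter, while FNR keeps counting every line of the input.

```shell
#!/bin/bash
# Made-up input: scan 1 has two data lines, scan 2 has three.
idx=$(printf '#S 1\n10\n20\nPeak\n#S 2\n30\n40\n50\nPeak\n' | awk '
  /^#S 2/,/Peak/ {
    if ($0 !~ /[f-z]|#|^$/)
      print i++, FNR, $1    # i: index within the selection, FNR: line number in the input
  }')
echo "$idx"
```

The first column starts at 0 for the selected scan; the second column shows the input line numbers 6, 7 and 8.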
FedoEx said:
Can you give me a few hints on awk variables and how to manipulate them?
Read the man page : [tt]man awk[/tt]. If you are on Linux, you may have info too : [tt]info awk[/tt].

Feherke.
 
Hi

I still do not understand the last part of your previous post. However, I would put an [tt]int()[/tt] there, to ensure the formula yields an integer value :
Code:
printf("%s \n", int(vector[2,(NR-1)/2+1) ])
FedoEx said:
and I assign this to a bash variable so I can use it later in another awk ' {...} ' block to do
some additional arithmetic operations.
This does not sound viable.

If you want to execute [tt]bash[/tt] commands from [tt]awk[/tt] to set environment variables for later use, then forget it. That [tt]bash[/tt] will be a sub-process, and anything you set in it will vanish as soon as that process terminates.
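
A minimal sketch of that effect, using a made-up variable name FOO: the assignment happens in the child shell that system() starts, so the calling shell never sees it.

```shell
#!/bin/bash
unset FOO
# system() starts a child shell; the assignment lives and dies inside it.
awk 'BEGIN { system("FOO=from_awk; export FOO") }'
# Back in the calling shell FOO is still not set.
result="${FOO:-unset}"
echo "FOO in the calling shell: $result"
```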

But getting a value back from [tt]awk[/tt] into the shell is not impossible.
Code:
master # awk 'BEGIN{ print sin(66) }'
-0.0265512

master # ret="$( awk 'BEGIN{ print sin(66) }' )"

master # echo "$ret"
-0.0265512
Or :
Code:
master # read ret <<EOP
> $( awk 'BEGIN{ print sin(66) }' )
> EOP
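
The opposite direction, sketched under the assumption that the value is needed in a later awk run: capture it in a shell variable as shown, then hand it back with -v. The five numbers and the peak position are taken from the examples in this thread; the variable names mid and diff are made up.

```shell
#!/bin/bash
# Capture the middle value of an odd-length column from one awk run...
mid=$(printf '%s\n' -2.31 -2.32 -2.33 -2.34 -2.35 |
      awk '{ v[NR] = $1 } END { print v[int((NR - 1) / 2 + 1)] }')

# ...and pass it to a later awk run as an ordinary awk variable.
diff=$(awk -v a="$mid" -v b="-4.13461" 'BEGIN { printf "%.5f\n", a - b }')
echo "$mid $diff"
```

This keeps the arithmetic inside awk, so no extra calculator call is needed for the subtraction.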


Feherke.
 
Hi

Ouch. I just noticed how stupidly I misplaced the [tt]int()[/tt] call in my previous post. It should be :
Code:
printf("%s \n", vector[2, int((NR-1)/2+1) ])

Feherke.
 
Hello Feherke,
Here is what I meant at the end of my last post.
Let's take your [tt]mawk[/tt] code. It always generates 3 columns of numbers (with an odd number of rows), for example this (3 columns) x (5 rows) array:
-2.31 2.3e-02 7.8e-04
-2.32 8.6e-03 4.8e-04
-2.33 1.9e-03 2.1e-04
-2.34 1.2e-03 1.2e-04
-2.35 7.5e-04 1.6e-04

I have my [tt]awk'{...}'[/tt] block, which can extract the first number of the row that is right in the middle. In the example above that would be -2.33.

Let's take another [tt]awk'{...}'[/tt] block that extracts some other number from the original input data file (the one at the very beginning of this thread):
Code:
  awk "/$SN1/,/Peak/ {print}" "$1"|awk 'END{print $3}'

So the way I've been doing things is to glue the different [tt]awk'{...}'[/tt] blocks together with [tt]bash[/tt].
Here is the simplest example I could come up with.

Code:
#!/bin/bash
#!/usr/bin/mawk -f
BEGIN{......and the rest of your code above.....
…...............}> /tmp/sample.dat
#that will create the example (3 columns) x (5 rows) array
#then I define the bash variable
VAR1=`awk '{     for (x = 1; x <= NF; x++)
                  vector[x, NR] = $x
                  }
                  END {
                   printf("%s\n" , vector[1, int((NR-1)/2+1)])
             }' /tmp/sample.dat `
#that will assign to $VAR1 the value -2.33.

#then I have the third block 
SN1="#S $2 "
VAR2=`awk "/$SN1/,/Peak/ {print}" "$1"|awk 'END{print $3}'`
#this one will extract the number -4.13461 from the original input data file 
# Peak at -4.13461 is 442 COM at ….. 

#then I display the extracted array and variables
cat /tmp/sample.dat
#subtract the two numbers (the parentheses keep bc from seeing "--" when $VAR2 is negative)
echo "$VAR1 - ($VAR2)"|bc -l 
# or do some other manipulations with $VAR1 $VAR2 and sample.dat
I have quite a few of these "kindergarten" scripts.
I need to get rid of my [tt]bash[/tt] glue and start using pure [tt]awk[/tt].
Can you show me how to combine these three [tt]awk ' {}'[/tt] blocks into one without [tt]bash[/tt]?
That would open a wide new horizon for me.
Thanks.
 
Hi

FedoEx said:
Let's take your mawk code.
Let us just call it [tt]awk[/tt] code. [tt]mawk[/tt] was just the last interpreter I tested the code with, and I forgot to remove the "m" before posting. When you say [tt]mawk[/tt] code, people tend to think of [tt]mawk[/tt]-specific code. However, that is just standard [tt]awk[/tt].

As far as I understand :
Code:
#!/usr/bin/awk -f

BEGIN {
  ARGC--
}

$0~("#S " ARGV[2] " "),/Peak/ {
  if (!/[f-z]|#|^$/) {
    vector[++nr,1]=sprintf("%-2.5e", $1)
    vector[nr,2]=sprintf("%-1.5e", $NF/$(NF-1))
    vector[nr,3]=sprintf("%1.5e", $NF/$(NF-1)*sqrt(1/($NF) + 1/($(NF-1))))
  }
  if ($1=="Peak") peak=$3
}

END {
  printf("var1 : %s\n", mistery=vector[1, int((nr-1)/2+1)])
  print "var2 :",peak
  print "var1-var2 :", mistery-peak
}

Feherke.
 
Great.
That will get me started.
Thanks.
 