More Help, New to awk!!!!

Status
Not open for further replies.

demis001

Programmer
Aug 18, 2008
94
US
PHV wrote the following script to process the input file shown at the end of this post. It works for the single-line fields; the problem I have is extracting the last part.

awk '
BEGIN{print "loop_beg\tmature_arm\tpri_id"}
$1~/^(loop_beg|mature_arm)$/{a[$1]=$2}
$1=="pri_id"{print a["loop_beg"]"\t"a["mature_arm"]"\t"$2}
' /path/to/input
--------------------------------------------------------
input file

nucleus -0.6
star -1.3
score_randfold 1.6
score_mfe 3.1
score_freq 0
score 3.1
flank_first_end 41
flank_first_seq kkkkkk
flank_first_struct ........
flank_second_beg 107
flank_second_seq fffffffffffff
flank_second_struct ..................
freq 62
loop_beg 60
loop_end 88
loop_seq llllll
loop_struct ccccccccccccccc
mature_arm second
mature_beg 89
mature_end 106
mature_query GC40_484005_x31
mature_seq CAGAGCTGGCTGAAGGGC
mature_strand +
mature_struct ........
pre_seq ttttt
pre_struct ....................
pri_beg 1
pri_end 140
pri_id chr11_1559
pri_mfe -84.14
pri_seq ...........
pri_struct kkkkkkkkkk
star_arm first
star_beg 42
star_end 59
star_seq CCTTCAGCCAGAGCTGGC
star_struct fffff
GC40_1014_x 18 20
GC40_1014_x 18 21
GC40_484005_x 18 22

-----second record continues----

I want to print $1 and $3 of the last part (the GC40... lines). Each record has more than one such entry.

Output should be something like this
header1 header2 header3 header4
$2 $2 $1 $3
$1 $3
$1 $3
$2 $2 $1 $3

Help please!!!
 
It seems like PHV has given you a good start. Have you tried modifying the script to do that? If so, where are you stuck?

Annihilannic.
 
Many thanks,
I have tried to insert a sort of if statement and it really doesn't work. Here is what I have tried:

awk '
BEGIN{print "loop_beg\tmature_arm\tpri_id\tquery"}
{if($1~/^(loop_beg|mature_arm)$/a[$1]=$2})
{if($1~/^(GC40)/a[$1]=$1}
$1=="GC40"{print a["loop_beg"]"\t"a["mature_arm"]"\t"$2\t a["GC40"]"\t"$1}
' /path/to/input
 
You don't need the if statements, because each awk rule is effectively an if statement already, in the form pattern { action }. Whenever the pattern is true, the action is executed.
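
To illustrate the point, here is a minimal sketch showing that a condition placed in front of a block behaves like an explicit if inside an unconditional block. The two-line sample input is made up for the example.

```shell
# Hypothetical sample input, trimmed from the format used in this thread.
sample='loop_beg 60
score 3.1'

# Pattern form: the block runs only for lines where the condition is true.
with_pattern=$(printf '%s\n' "$sample" | awk '$1 == "loop_beg" { print $2 }')

# Explicit form: the block runs for every line and tests the condition itself.
with_if=$(printf '%s\n' "$sample" | awk '{ if ($1 == "loop_beg") print $2 }')

echo "$with_pattern"
echo "$with_if"
```

Both commands print the same value, so wrapping the condition in an extra if adds nothing.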

Try this slight variation:

Code:
awk '
    BEGIN {print "loop_beg\tmature_arm\tpri_id\tquery"}
    $1~/^(loop_beg|mature_arm)$/ { a[$1]=$2 }
    $1~/^GC40/ {
        print a["loop_beg"]"\t"a["mature_arm"]"\t"$2"\t"$1"\t"$3
        a["loop_beg"]=""
        a["mature_arm"]=""
    }
'  /path/to/input
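
Note that the header above names four columns while the print statement emits five fields, and the $2 printed there is the second field of the GC40 line, not the pri_id value. If the intent is to show pri_id next to each GC40 entry, one possible sketch follows. It assumes pri_id always appears before the GC40 lines of its record, as in the sample input. The header names query and count, and the second record in the sample, are placeholders that do not come from the original post.

```shell
# Trimmed, hypothetical sample in the input format from this thread (two records).
sample='loop_beg 60
mature_arm second
pri_id chr11_1559
GC40_1014_x 18 20
GC40_1014_x 18 21
loop_beg 12
mature_arm first
pri_id chr2_7
GC40_9_x 18 5'

result=$(printf '%s\n' "$sample" | awk '
    BEGIN { OFS = "\t"; print "loop_beg", "mature_arm", "pri_id", "query", "count" }
    # Remember the per-record fields as they go by.
    $1 ~ /^(loop_beg|mature_arm|pri_id)$/ { a[$1] = $2 }
    $1 ~ /^GC40/ {
        print a["loop_beg"], a["mature_arm"], a["pri_id"], $1, $3
        # Blank the record-level values so only the first GC40 line shows them.
        a["loop_beg"] = ""; a["mature_arm"] = ""; a["pri_id"] = ""
    }
')
printf '%s\n' "$result"
```

Setting OFS to a tab lets print join its comma-separated arguments with tabs, which avoids the long chain of "\t" concatenations.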

Annihilannic.
 
Many, many thanks! You guys taught me and guided me a lot!
 
Hi

Here on Tek-Tips we usually say thanks for help received by giving stars. Please click the

* Thank Annihilannic for this valuable post!

link at the bottom of Annihilannic's post. That way you both show your gratitude and mark this thread as helpful.

Feherke.
 