PHV wrote the following script to process the input file shown at the end of this post. It works for the single-line fields; the problem I have is extracting the last part.
awk '
BEGIN{print "loop_beg\tmature_arm\tpri_id"}
$1~/^(loop_beg|mature_arm)$/{a[$1]=$2}
$1=="pri_id"{print a["loop_beg"]"\t"a["mature_arm"]"\t"$2}
' /path/to/input
--------------------------------------------------------
input file
nucleus -0.6
star -1.3
score_randfold 1.6
score_mfe 3.1
score_freq 0
score 3.1
flank_first_end 41
flank_first_seq kkkkkk
flank_first_struct ........
flank_second_beg 107
flank_second_seq fffffffffffff
flank_second_struct ..................
freq 62
loop_beg 60
loop_end 88
loop_seq llllll
loop_struct ccccccccccccccc
mature_arm second
mature_beg 89
mature_end 106
mature_query GC40_484005_x31
mature_seq CAGAGCTGGCTGAAGGGC
mature_strand +
mature_struct ........
pre_seq ttttt
pre_struct ....................
pri_beg 1
pri_end 140
pri_id chr11_1559
pri_mfe -84.14
pri_seq ...........
pri_struct kkkkkkkkkk
star_arm first
star_beg 42
star_end 59
star_seq CCTTCAGCCAGAGCTGGC
star_struct fffff
GC40_1014_x 18 20
GC40_1014_x 18 21
GC40_484005_x 18 22
-----second record continues----
I want to print $1 and $3 of the last part (the lines starting with GC40). Each record has more than one such entry.
Output should be something like this
header1 header2 header3 header4
$2 $2 $1 $3
$1 $3
$1 $3
$2 $2 $1 $3
Help please!!!
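One possible approach, offered as a sketch rather than a tested solution for the real data: keep storing loop_beg and mature_arm as the script above does, and print on the trailing read lines instead of on pri_id. It assumes that every read line has exactly three fields and starts with GC40_, and that loop_beg and mature_arm always appear before the reads of the same record. The function name extract_reads and the column labels read_id and read_col3 are made up for illustration.

```shell
# Hypothetical helper: reads the named files, or stdin when none are given.
extract_reads() {
    awk '
    BEGIN { OFS = "\t"; print "loop_beg", "mature_arm", "read_id", "read_col3" }
    # Remember the per-record values; seeing one marks the start of a new record.
    $1 == "loop_beg" || $1 == "mature_arm" { a[$1] = $2; first = 1; next }
    # Assumption: read lines have three fields and start with GC40_.
    NF == 3 && $1 ~ /^GC40_/ {
        if (first) print a["loop_beg"], a["mature_arm"], $1, $3
        else       print "", "", $1, $3   # later reads: leave the first two columns empty
        first = 0
    }
    ' "$@"
}
```

Call it as extract_reads /path/to/input. If the read IDs do not all begin with GC40_, the pattern in the third rule would need to be adjusted.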