Copy and insert header into respective lines

GiovanniC

Technical User
Nov 5, 2001
15
AU
Hi all,

I have the following input file:

START
Query= TaBa0-000001
>gnl|UG|Ta#S13248438 ug=Ta.24021
>gnl|UG|Ta#S17880947 ug=Ta.8115
>gnl|UG|Ta#S17984398 ug=Ta.29549
Query= TaBa0-000002
>gnl|UG|Ta#S17988614 ug=Ta.30772
>gnl|UG|Ta#S16202639 ug=Ta.25603
>gnl|UG|Ta#S13134499 ug=Ta.16563
>gnl|UG|Ta#S16238874 ug=Ta.10196
Query= TaBa0-000003
>gnl|UG|Ta#S17987015 ug=Ta.28640
>gnl|UG|Ta#S17897903 ug=Ta.27673
END

What I'd like to do is take each Query= header and place it on the same line as each of the >gnl lines under it, to get the following output format:

START
Query= TaBa0-000001 >gnl|UG|Ta#S13248438 ug=Ta.24021
Query= TaBa0-000001 >gnl|UG|Ta#S17880947 ug=Ta.8115
Query= TaBa0-000001 >gnl|UG|Ta#S17984398 ug=Ta.29549
Query= TaBa0-000002 >gnl|UG|Ta#S17988614 ug=Ta.30772
Query= TaBa0-000002 >gnl|UG|Ta#S16202639 ug=Ta.25603
Query= TaBa0-000002 >gnl|UG|Ta#S13134499 ug=Ta.16563
Query= TaBa0-000002 >gnl|UG|Ta#S16238874 ug=Ta.10196
Query= TaBa0-000003 >gnl|UG|Ta#S17987015 ug=Ta.28640
Query= TaBa0-000003 >gnl|UG|Ta#S17897903 ug=Ta.27673
END

Would anyone have an idea how I might be able to do this?

Thanks for any help.

Cheers,

Giovanni
 
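Something like this?
awk '
# Remember the current Query= header line; do not print it by itself
/^Query=/ {q = $0; next}
# Pass the START and END markers through unchanged
/^(START|END)$/ {print; next}
# Prefix every other line with the remembered header
{print q, $0}
' /path/to/input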

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
Hi PHV,

Thanks for the suggestion. It mostly worked well, but I just noticed one important thing (for me, anyway). When a Query= line is not followed by a line starting with >gnl, that Query= header is not printed. For example, if I have this situation:

Query= Taba0-00001
Query= Taba0-00002
>gnl|UG|Ta#S13248438 ug=Ta.24021
>gnl|UG|Ta#S17880947 ug=Ta.8115
>gnl|UG|Ta#S17984398 ug=Ta.29549
Query= Taba0-00003

The output does not include Taba0-00001 and Taba0-00003. How do I go about modifying the script so that they don't get left out?

Giovanni
 
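You may try this:
awk '
# t is 1 while the stored header q has not yet been printed with a hit
/^Query=/ {if (t) print q; q = $0; t = 1; next}
# START/END markers: flush a pending header first, then print the marker
/^(START|END)$/ {if (t) print q; t = 0; print; next}
# Hit line: print it prefixed with the current header
{t = 0; print q, $0}
# Flush a header left pending if the file has no END marker
END {if (t) print q}
' /path/to/input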

Hope This Helps, PH.
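
For anyone wanting to try the idea on the edge case from the follow-up question, here is a minimal sketch wrapped in a shell function that reads from standard input. The function name prefix_hits and the variable name pending are made up for illustration and do not come from the thread. The sketch assumes hit lines always start with ">" and treats every other non-header line (START, END, blank lines) as a marker to pass through.

```shell
#!/bin/sh
# prefix_hits is a hypothetical helper name, not part of the thread.
# It reads records on stdin and writes the flattened form to stdout.
prefix_hits() {
  awk '
    # New header: flush the previous one if it never received a hit
    /^Query=/ { if (pending) print q; q = $0; pending = 1; next }
    # Hit line (assumed to start with ">"): print it after the current header
    /^>/      { pending = 0; print q, $0; next }
    # Anything else (START, END, blank lines): flush, then pass through
    { if (pending) print q; pending = 0; print }
    # Flush a header that is still pending at end of input
    END       { if (pending) print q }
  '
}

# Edge case from the thread: headers with no hit lines under them
prefix_hits <<'EOF'
Query= Taba0-00001
Query= Taba0-00002
>gnl|UG|Ta#S13248438 ug=Ta.24021
>gnl|UG|Ta#S17880947 ug=Ta.8115
Query= Taba0-00003
EOF
```

With that input, Taba0-00001 and Taba0-00003 should each come out on a line of their own, while Taba0-00002 is repeated in front of both of its >gnl lines.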
 