Joining multiple lines

GoBillsBN · Jul 3, 2001

Can anybody help me with a program to join lines together? Here's kinda what I have:

>(14-3-3)
CCACGCGTCCGCCTTGG
GCTGTCTTTGTATGACT
CTGGTCCACAATCCCTT
>(6CKine)
CTTGTCCTGGTCCTGGC
TCAGGTACAGCCGAAAG
TTGCGCTATGCCAGCTA

....and so on. Here's what I want it to look like:

>(14-3-3)
CCACGCGTCCGCCTTGGGCTGTCTTTGTATGACTCTGGTCCACAATCCCTT
>(6CKine)
CTTGTCCTGGTCCTGGCTCAGGTACAGCCGAAAGTTGCGCTATGCCAGCTA

Basically I want to join all the lines with the DNA sequence into one line.

I couldn't think of any way to accomplish this, but I was just introduced to awk this week, so let me know if anyone has any ideas.

Thanks!

marsd · Jul 3, 2001

Maybe some of the other guys can help you more, but this is very basic.
Assuming that each header line starts with either
a ">" or a parenthesis "(", you could
write this:

awk '{
    if ($0 ~ /^[>(]/) {            # header line such as ">(14-3-3)"
        if (s_line++) printf "\n"  # finish the previous sequence first
        hdr[s_line] = $0
        print hdr[s_line]
    } else {                       # sequence line
        arr[++code] = $0
        printf "%s", arr[code]     # for now, just print it with no newline
    }
}
END { if (code) printf "\n" }' file

(The reason I made these array variables is that I was thinking you could manipulate the data more easily this way later, but you don't have to.)

This will give you the () header followed by its code on one line, then the next () header and its code.
(Hopefully; it did for me.)
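As a rough illustration of the "manipulate the data later" idea, here is a minimal sketch that keeps each joined sequence in an array keyed by its header and then reports its length. The names seq_lengths, seqs and order are made up for this example, and it assumes every header line is unique.

```shell
# seq_lengths is a hypothetical helper; it reads files given as arguments, or stdin.
seq_lengths() {
    awk '
    /^>/ { name = $0; order[++n] = name; next }   # remember headers in input order
         { seqs[name] = seqs[name] $0 }           # append sequence text to its header
    END  { for (i = 1; i <= n; i++) print order[i], length(seqs[order[i]]) }
    ' "$@"
}

# Example call on a file named dnafile (assumed to exist):
#   seq_lengths dnafile
```

Because the whole sequence sits in seqs[name] by the time END runs, other per-record work (reversing, counting bases, filtering by length) could go in the same loop.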

marsd · Jul 3, 2001

What happened there?
That should read arr[code], the #2 var.

Krunek · Jul 4, 2001

Hello, GoBillsBN!

You can try this awk solution:

/^>/ {                  # finds line with > at the beginning of the line
    if (dnaRow != "")
    {
        print dnaRow
        dnaRow = ""
    }
    print               # prints line with >
    next                # skips to the next line
}

{ dnaRow = dnaRow $0 }

END { print dnaRow }
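One way to try a script like the one above is to save it in a file and run it with awk -f. The sketch below does that in a scratch directory with a cut-down sample; the file names join.awk, sample.txt and joined.txt are placeholders, and the extra test in END is a small addition of this sketch so that empty input produces no blank line.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# join.awk is a hypothetical file name; the rules restate the script above.
cat > "$workdir/join.awk" <<'EOF'
/^>/ {
    if (dnaRow != "") { print dnaRow; dnaRow = "" }
    print
    next
}
{ dnaRow = dnaRow $0 }
END { if (dnaRow != "") print dnaRow }
EOF

# A shortened sample in the same layout as the original question.
printf '%s\n' '>(14-3-3)' 'CCACGCGTCCGCCTTGG' 'GCTGTCTTTGTATGACT' \
              '>(6CKine)' 'CTTGTCCTGGTCCTGGC' 'TCAGGTACAGCCGAAAG' > "$workdir/sample.txt"

awk -f "$workdir/join.awk" "$workdir/sample.txt" > "$workdir/joined.txt"
cat "$workdir/joined.txt"
```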

Congratulations, GoBillsBN! Awk is a good choice.

Bye!

KP.

grega · Jul 4, 2001

You could try this, which is simple and makes use of nawk's pattern matching capability.

nawk '/^>/ {if (n++) printf("\n"); print; next} /[CGTA]$/ {printf("%s",$0)} END {printf("\n")}' dnafile
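For anyone new to awk, a one-liner like this relies on the pattern { action } form: every input line is tested against each pattern in turn, and the action runs for each pattern that matches. A small sketch of that idea follows; the helper name label_lines and the labels it prints are invented for illustration.

```shell
# label_lines is a hypothetical helper that shows which rule each line triggers.
label_lines() {
    awk '
    /^>/        { print "header:   " $0; next }   # next skips the remaining rules
    /^[ACGT]+$/ { print "sequence: " $0; next }
                { print "other:    " $0 }         # a rule with no pattern matches every line
    ' "$@"
}

printf '%s\n' '>(14-3-3)' 'CCACGCGTCCGCCTTGG' 'not dna' | label_lines
```

Running something like this on a real file first can show whether any lines fall outside the two patterns before the join is attempted.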

Greg.

GoBillsBN · Jul 4, 2001

Thanks for the help everyone, all of your suggestions seem to work great for me!

Awk was definitely a good choice for me for these kinds of uses. I'm still learning though, so can anyone suggest any helpful websites or books to learn awk from?

Krunek · Jul 4, 2001

Hi again, GoBillsBN!

Please see this answer:
good awk/sed reference book needed

Also, you can check O'Reilly's site. There is a good awk reference text, Chapter 11, "The awk Programming Language":

http://www.oreilly.com/catalog/unixnut3/chapter/ch11.html

My awk reference is an old overview article by the original authors of Awk, because I use classic awk; see this site:

http://www.crossmyt.com/hc/htmlchek/awk-perl.html

Awk is a beautiful and simple scripting language.

Bye!

KP.

flogrr · Jul 5, 2001

Hi GoBillsBN,

Here is my two cents' worth!

awk '

/^>\(/ {
    if ( flag ) printf ("\n")
    print
    flag = 1
}

/^[A-Z]+$/ { printf ("%s", $0 ) }

END { printf ("\n") }' inputfile > outputfile

flogrr
flogr@yahoo.com
