Extract records listed in one file from another file

Status
Not open for further replies.

jizosaves

Technical User
Apr 7, 2009
1
Hi, I have this dilemma.

I have an input file of the form:

gnl|ti|1302443149
gnl|ti|1302443148
gnl|ti|1302443142

Let's call it INPUT.

I need to retrieve these records from another file, let's call it REF, which has the form:

...
ATAAGCAATCAACATTCCAAGCTTTGTTCTTAACTTTACCCCATCCCCTTCCACCACCGATTTTCTTGTT
ACATGTGTAATGTATTACCTATCAACTATTTCCTGTATGTGCTCTCTACCTTTACTTTCATGTTTCGTTT
AGTAAAATAAGT
>gnl|ti|1302443140 1101127226940
GTAAGCTGCAGGATTTATTGTTGTTAATCATAGAATATTTTGAGTACTCACCTCCGATATCTATCGGTGG
CGCGTTCATGCCACACTGGCCGGTGCCGTAATGTGGGGCAAGAAACATTCCCAATGAGTTTAAAAACGTC
ATCCGCGGCATACAAGAGCACAAGACGCGCCCTAAGTCGGATACCCAGCACTTTGGAGCACCCAACTCGC
ACACCCAAGTGACCCAACGCGGTAGCGGCGCCCCAAAGAGCACGTCGCCGACATTTCTATCCACTCAGCA
TAGCCGCGGAACCTGATGCTTGCTAGATGGCGCCGCAGTGTTCAAAATGCCGCTACCCAAGATAAAATGG
TCTATACTCGTGGAGGGCTCGCTACCGCAGGTGCTGCCATGCAGCTCTTGCTCGTGGTGGCGCCACCGCC
ATCACCTGTGCTCCCTGTGATCCCAGTCGAAGTGGTGACAGGGATTCTGGGAAGGCTCGTTGCCTCACGT
GCTGCTACGGAGCCCTGGCTCGTTAGGAGTGCTGTTGCTACTGCCAAAGCTTGTGCTCTTTGTCGAAACG
GCGACTGGAATCCGGAGACGGTTCGCTGCCGCAGCCGGTCGGAAAGTTCCGCGCGTTCGGTCGTAGCCCT
TCTTGGGAATTCGTTCCCTCCATGTTGAGGCTCTAGCAGGAGAAATCCACTGCTGACGTGTTTGTTACTT
GTCACCATGGAACTCAATCCGCAGAGGGCCATGGCCACGTGATCACACTTCTGAAATGTTAGAACCCTAC
TGAAAAAATTAACAAGGCGGTAAGTTGGGCTAGTTGGTTTAACATCGTCAAACGTGAAACAGCGCAAAAA
TTGAACAAAGGACAGTGAAAGGAGTGCACGGACAAGCGCTGTTAAAGCTGGACACGCGCATCGAACACGC
GACACTTGCATACAATCGGGCATCGGTTATCGGGCAGCGTACGGCT
>gnl|ti|1302443141 1101127226941
AATAGATAACAGAGGTGCAGATATGATGGGGCAGAACGGTTGTCCGGTCGGCGAATCTCAACTGGACTAA
AGGCCGATCACGACTGCAGCAACTGCAGCATGGATGTTTGGGAGTCGGCTCGTTTTCCCCAAGTCCCTAG
GTAGGGAATTCGAAGCCGCAGTTGGAAACCAGCAAGCCCCGCCTCTGTTCCATTCGATACACACATATTC
GCTCCTGCAAAGCCGCGCGAAAGCTCTGCCGTCAATCGAAAAGTAAAGACGGCGCCGGGGAGACAAGGAG
TAGTGGGCGCCTTTCCTAAAATATGTCCCGCCACCCTAAGTTGAAACGGCATTGTATACAAATAAATGCC
TACGGCGTCGGCTTGAGGACCCCGTGTAAGCAGCCTCCGGCCCTTAGAGTGCTCCTACCGTTTATCTTTC
TTTTATTAGCTTCCCGCCATGAGAAGTCGTACCGCAGGGTATGCCCCT...

This is DNA shotgun sequencing raw trace data.

I need to extract the records listed in INPUT from REF and compile the DNA sequences in an OUTPUT file.

I am pretty new to this sort of stuff, but I managed to use grep to get my input file from some raw data.

I would be greatly indebted for any help with this.
Thanks
jizosaves

 
Hi

You have not defined what a "record" is. In the meantime, try this:
Code:
awk 'FNR==NR{d[$0]=1;next}substr($1,1,1)==">"{w=substr($1,2) in d}w' INPUT REF
Tested with gawk and mawk.
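
For readers new to awk, the one-liner above can be unpacked into a commented form. This is only a sketch of the same technique, under a few assumptions: a "record" is taken to mean a ">" header line plus the sequence lines that follow it, each line of INPUT is assumed to hold exactly one ID with no trailing spaces or carriage returns, and INPUT is assumed to be non-empty. The function name extract_records and the variable names wanted and keep are hypothetical, chosen for illustration.

```shell
# extract_records IDS_FILE FASTA_FILE
# Hypothetical helper: prints every record of FASTA_FILE whose ID is listed
# in IDS_FILE (header line included, followed by its sequence lines).
extract_records() {
    awk '
        # First file (the ID list): remember each whole line as a wanted ID.
        FNR == NR { wanted[$0] = 1; next }

        # Second file: on a header line, strip the leading ">" from the first
        # field and check whether that ID was listed.
        substr($1, 1, 1) == ">" { keep = (substr($1, 2) in wanted) }

        # Print the header and its sequence lines while the flag is set.
        keep
    ' "$1" "$2"
}
```

With the files from the question it would be called as extract_records INPUT REF > OUTPUT. Header lines are printed along with the sequences; if only the sequence lines are wanted, the output would need a further filter.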

Feherke.
 