Output generation with loop on the fly

Status
Not open for further replies.

JohnLucania

Programmer
Oct 10, 2005
96
US
Code:
while ( my @rids = $fac->each_rid ) {
   die "No record found\n" unless @rids;

  foreach my $rid ( @rids ) {

    my $result = $fac->retrieve_blast( $rid );

    if( ref( $result )) {
      my $output   = $result->next_result();
      my $filename = $output->query_name().".out";
      $filename =~ s/.*?\|(\w*).*/$1.out/;
      $fac->save_output( $filename );
      $fac->remove_rid( $rid );
    }
    elsif( $result < 0 ) {
      $fac->remove_rid( $rid );
    }
    else {
      sleep 5;
    }
  }
}

The line my $filename = $output->query_name().".out"; produces one big file. I would like to break it into multiple files.

>ref|XP_449030.1|
.......
>ref|XP_456080.1|
.......
>emb|CAG86374.1|
.......

Each record in the output starts with '>', followed by a word and then a | (pipe).

I want multiple files like:

XP_449030.out
XP_456080.out
CAG86374.out
etc

I tried $filename =~ s/.*?\|(\w*).*/$1.out/;
but it is not working.

How do you generate multiple files on the fly?

jl
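
For reference, the mapping from header line to file name described in this post can be sketched on its own, away from the BLAST code. This is only an illustration: header_to_filename is a hypothetical helper name, and it assumes every header looks like the three samples above ('>', a database tag, a pipe, then an accession with a version suffix).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): turn a header line such as
# ">ref|XP_449030.1|" into an output file name such as "XP_449030.out".
sub header_to_filename {
    my ($header) = @_;

    # Skip '>', the database tag and the first pipe, then capture the
    # accession. \w+ stops at the '.' that starts the version suffix.
    return unless $header =~ /^>\w+\|(\w+)/;
    return "$1.out";
}

# The three sample headers from the post.
for my $header ( '>ref|XP_449030.1|', '>ref|XP_456080.1|', '>emb|CAG86374.1|' ) {
    print header_to_filename($header), "\n";
}
# prints:
# XP_449030.out
# XP_456080.out
# CAG86374.out
```

The helper returns nothing for a line that does not look like a header, so the caller can tell headers from body lines.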
 
What is your $fac object?

Mike

I am not inscrutable. [orientalbow]

Want great answers to your Tek-Tips questions? Have a look at faq219-2884

 
ok, $fac is:

Code:
my $fac = Bio::Tools::Run::RemoteBlast->new( -prog       => 'blastp',
 -data       => 'nr'       ,
 -expect     => $evalue    ,
 -readmethod => 'SearchIO' );

my $gb_dbh = Bio::DB::GenBank->new(-format => 'fasta');

my $str = $gb_dbh->get_Seq_by_acc( $accession );

print "submitting BLASTs";

$fac->submit_blast( $str );
print '.';
print "done.\n";
 
Are you using "strict"? It almost sounds like a scoping problem, but I can't tell from the code you have posted.
 
Yes, I have it.
Code:
#! /usr/bin/perl -w

use strict;
use warnings;
$|++;

use Bio::Perl;
use Bio::Tools::Run::RemoteBlast;
use Bio::Search::Result::ResultI;
use Bio::Search::Hit::HitI;
use Bio::Search::HSP::HSPI;
use Bio::DB::RandomAccessI;
use Bio::DB::Query::GenBank;
use Bio::DB::GenBank;
use Bio::SeqIO;
use Bio::DB::SeqI;
use Bio::Seq::RichSeq;

print "Enter Accession Number:  ";
my $accession = <STDIN>;
print "\n";
print "Enter e-value:  ";
my $evalue = <STDIN>;

[pc1]
 
With use strict; commented out (#use strict;), I still get the one big file.
So I guess use strict; isn't the issue here.


[ponder]
 
My guess is the line
Code:
$fac->save_output( $filename );
is where your 'problem' is. It would appear that this method is designed to write all the results to one file.

There are a couple of options: 1) Parse the output file and split it into the additional files you're looking for. 2) See if there is another method that returns the data, or a reference to the data structure. You could then parse that and write the results any way you want, without the temporary file.
 
How about something like:
Code:
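# Split the combined output into one file per record.
# temp_output.txt stands in for whatever file save_output() wrote.
open my $in, '<', 'temp_output.txt' or die "Could not open temp_output.txt: $!";

my $out;    # handle for the record currently being written
while ( my $line = <$in> ) {
    # A header such as ">ref|XP_449030.1|" starts a new record;
    # capture the accession without its version suffix.
    if ( $line =~ /^>\w+\|([^.|]+)/ ) {
        close $out if $out;
        open $out, '>', "$1.out" or die "Could not open $1.out: $!";
    }
    print {$out} $line if $out;    # lines before the first header are skipped
}

close $out if $out;
close $in;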
 
I would rather use the existing modules than reinvent the wheel.

Code:
print "writing results";

our $filename;

while ( my @rids = $fac->each_rid ) {
   die "No record found\n" unless @rids;

  foreach my $rid ( @rids ) {

    my $result = $fac->retrieve_blast( $rid );

    if( ref( $result )) {
      my $output   = $result->next_result();
      $filename = $output->query_name().".out";
      $fac->save_output( $filename );
      $fac->remove_rid( $rid );

	my( $out ) = shift;
	my $in_seqio  = Bio::SeqIO->new( -file => $filename , -format => 'genbank' );
	my $out_seqio = Bio::SeqIO->new( -file => ">$out" , -format => 'fasta' );
        
	while( my $seq = $in_seqio->next_seq() ) {
	   $out_seqio->write_seq( $seq );
	}

    }
    elsif( $result < 0 ) {
      $fac->remove_rid( $rid );
    }
    else {
      sleep 5;
    }
  }
}

I have tried this and am getting:

Use of uninitialized value in concatenation (.) or string at ./blast.pl line 61.

------------- EXCEPTION -------------
MSG: Could not open >: No such file or directory
STACK Bio::Root::IO::_initialize_io /usr/local/lib/perl5/site_perl/5.8.4/Bio/Root/IO.pm:313
STACK Bio::SeqIO::_initialize /usr/local/lib/perl5/site_perl/5.8.4/Bio/SeqIO.pm:451
STACK Bio::SeqIO::fasta::_initialize /usr/local/lib/perl5/site_perl/5.8.4/Bio/SeqIO/fasta.pm:83
STACK Bio::SeqIO::new /usr/local/lib/perl5/site_perl/5.8.4/Bio/SeqIO.pm:354
STACK Bio::SeqIO::new /usr/local/lib/perl5/site_perl/5.8.4/Bio/SeqIO.pm:380
STACK toplevel ./blast.pl:61

Code:
print "writing results";

our $filename;

while ( my @rids = $fac->each_rid ) {
   die "No record found\n" unless @rids;

  foreach my $rid ( @rids ) {

    my $result = $fac->retrieve_blast( $rid );

    if( ref( $result )) {
      my $output   = $result->next_result();
      $filename = $output->query_name().".out";
      $fac->save_output( $filename );
      $fac->remove_rid( $rid );


    }
    elsif( $result < 0 ) {
      $fac->remove_rid( $rid );
    }
    else {
      sleep 5;
    }
  }
}



	my( $out ) = shift;
	my $in_seqio  = Bio::SeqIO->new( -file => $filename , -format => 'genbank' );
	my $out_seqio = Bio::SeqIO->new( -file => ">$out" , -format => 'fasta' );
        
	while( my $seq = $in_seqio->next_seq() ) {
	   $out_seqio->write_seq( $seq );
	}

I put the chunk outside of the while loop and am getting the same error.

Any idea why it is returning the error?

[ponder]
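
One possible reading of the error above: outside a subroutine, shift with no argument takes its value from @ARGV, so if the script is run without a command-line argument, $out is undefined and ">$out" becomes just ">". That would match both the "uninitialized value" warning and "Could not open >". The sketch below only illustrates that behaviour; output_name and the .fasta default are hypothetical, not something from the original script or from BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: take the output name from an argument list,
# or fall back to a name derived from the input file name.
sub output_name {
    my ( $args, $input ) = @_;
    my $out = shift @$args;
    return $out if defined $out && length $out;

    # Assumed default: same base name with a .fasta extension.
    ( my $default = $input ) =~ s/\.out\z/.fasta/;
    return $default;
}

local @ARGV = ();    # simulate running the script with no arguments
my $out = shift;     # at file level this shifts @ARGV, so $out is undef here
print defined $out ? "got '$out'\n" : "no output name given\n";

print output_name( [], 'XP_449030.out' ), "\n";
# prints:
# no output name given
# XP_449030.fasta
```

Checking the value before building ">$out" makes the missing argument visible at the point where it happens, rather than inside Bio::SeqIO->new.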
 