
Problems with printing out to a file


posdef (Technical User), Mar 11, 2010

Hi,

I am having problems with a script that should write out to a file. Weirdly enough, the file is generated but it's just empty. I have tried writing to the terminal and that works fine, and I have tried a much simpler script that writes to a file and that works fine too. I must be missing something small, but I just can't put my finger on it. I would really appreciate any help.

Below are the parts of the code that are relevant:

Code:
#!/usr/bin/env perl
use warnings;
use strict;
use Data::Dumper;

...   # elided: server setup (presumably declares $serv, $num_read and $num_written)

open my $ifh, '<', 'xref-unip_only.HUMAN.csv' or die "could not open input file $!";
open my $ofh, '>', 'ref_table.txt' or die "could not open output file $!";

while (<$ifh>) {
    $num_read++;
    my $line = $_;
    my $new_line;
    chomp($line);
    my ($up_id, $up_short, $ipi, $suppl, $gene) = split(/\t/, $line);
    $up_id = substr(substr($up_id, 1), 0, -1);   # drop the first and last
    $ipi   = substr(substr($ipi, 1), 0, -1);     # characters (surrounding quotes)

    my $kegg = '';   # default to empty so the ne test below is safe if the eval dies

    eval { $kegg = acquire_kegg($serv, $up_short); };   # try getting kegg_id
    if ($@) {...}

    if ($kegg ne '') {
        my ($up, $kegg_id, $comm) = split(/\t/, $kegg);
        $new_line = "$ipi\tup:$up_id\t$kegg_id";
        $num_written++;   # id converted to KEGG
    }
    else { $new_line = "$ipi\tup:$up_id\t$kegg"; }
    print $ofh "$new_line\n";
#   print "$new_line\n";   # debugging: echo to the terminal instead
}

close $ifh;
close $ofh;

sub acquire_kegg {...}
 
I also tried a simplified test and it seems to work fine.

Are you certain that part of the code only runs once? i.e. are there other loops around it which could cause the opens to happen again, thereby clobbering the file? You could try changing the >ref_table.txt to >>ref_table.txt to confirm or refute that theory.
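For example, changing just the open line:

Code:
# append instead of truncating; if a second open were clobbering the
# file, output from earlier passes would survive
open my $ofh, '>>', 'ref_table.txt' or die "could not open output file $!";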

Annihilannic.
 
Well, there are no other loops; the code I left out (the places with ...) simply refers to a specific server and has pretty much nothing to do with the printing part. Since the script prints just fine to the terminal, I presume there's nothing wrong with that part of the code. What could cause it to not write to a file when it can print just fine to the terminal?

Append (>>) doesn't work either, btw.

 
Quick update: I have checked the permissions on the file and tried adding a print line right after the open statement; still nothing.
Code:
open my $ofh, '>', 'output' or die "could not open output file $!";
print $ofh "IPI-Uniprot-KEGG Id. conversion table\n";

And finally, I have the following simple script, which does pretty much the same thing (reads from one file and writes to another), and the worst part is that it works just fine:
Code:
#!/usr/bin/env perl
use warnings;
use strict;
use Data::Dumper;

open my $input, '<', 'test.txt'   or die "could not open input file $!";
open my $fh,    '>', 'output.txt' or die "could not open output file $!";

while (<$input>) {
    my $line = $_;
    chomp($line);
    print $fh "a line of text\n";
    print $fh "$line\n";
}
close($fh);
This is driving me crazy... Computer programs should make sense, right? :)
 
Try declaring your file handle variables earlier... say:

Code:
#!/usr/bin/env perl
use warnings;
use strict;
use Data::Dumper;

my $ifh;
my $ofh;

...

open $ifh, '<', 'xref-unip_only.HUMAN.csv' or die "could not open input file $!";
open $ofh, '>', 'ref_table.txt' or die "could not open output file $!";

...


Annihilannic.
 
I know this is going to sound really stupid, but has xref-unip_only.HUMAN.csv got any records in it? If not, it would certainly explain your results...

Also, when you run the first code, does the date and time stamp of the output file change?
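Both are easy to check with a throwaway snippet (file names assumed from the code above):

Code:
my $in  = 'xref-unip_only.HUMAN.csv';
my $out = 'ref_table.txt';
printf "%s: %d bytes\n", $in, -s $in || 0;   # 0 bytes means no records
printf "%s last modified: %s\n", $out, scalar localtime((stat $out)[9]) if -e $out;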

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
Thanks, guys, for your interest and help; I solved the problem over the weekend. I believe the biggest issue was the latency of the remote server, and for some reason the script prints to the terminal faster than it writes to a file.

Sorry for making such a fuss about it. In retrospect, I should perhaps have been more patient while debugging.

Best regards,
 
Is it a long-running script? If so, the output could be buffered, and may not actually get written to the file until the buffer is full (or the script finishes).
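In that case you could turn on autoflush for the handle, so every print is written out immediately, e.g.:

Code:
use IO::Handle;        # lets lexical file handles take method calls
$ofh->autoflush(1);    # flush the buffer after every print to $ofh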

Annihilannic.
 
Seems strange - the close() will flush the buffers...

Steve

 
Well, the total execution time is around 24 hours :D although the majority of that should be on the remote server side. Buffering does make sense, though.

The close() comes at the very end of the script, pretty much right before termination.
 
24 hours? How many billion records are you processing? Or is fixing the performance of acquire_kegg() next on your to-do list?

And why does this have to be evaled?

Steve

 
Well, it's more like tens of thousands of queries that go via SOAP. The process is slow on the remote end, which I guess is why it takes such a ridiculous amount of time.

The acquire sub is separated from the main code and wrapped in an eval mostly because every once in a while the remote server sends a connection reset error (I'm not exactly sure why; they apparently didn't feel the need to mention that in their documentation).
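A retry wrapper along these lines would be one way to cope with those resets (just a sketch; the three attempts and the five-second sleep are arbitrary):

Code:
my $kegg = '';
for my $attempt (1 .. 3) {                 # retry a few times on failure
    eval { $kegg = acquire_kegg($serv, $up_short); };
    last unless $@;                        # success, stop retrying
    warn "attempt $attempt failed: $@";    # e.g. a connection reset
    sleep 5;                               # back off before trying again
}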

I am definitely open to suggestions, as I am quite new to Perl, but since the code does what it's supposed to as it is now, I don't think I'll be putting a lot of time into improving it.
 
OK, so we've established that your process spends the majority of its time waiting for network I/O. If you need to speed it up (for example if it gets to the point where a daily run takes more than 24 hours) then you would need to split the input file and run multiple processes, or come up with some kind of threaded solution. All of these add more complexity, and if you are happy with the performance, then you clearly don't need to do it.
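A rough sketch of the multi-process approach, assuming Parallel::ForkManager is available (split_input() and process_chunk() are placeholders for your own code):

Code:
use Parallel::ForkManager;

my $pm     = Parallel::ForkManager->new(4);              # at most 4 workers at once
my @chunks = split_input('xref-unip_only.HUMAN.csv', 4); # hypothetical splitter

for my $chunk (@chunks) {
    $pm->start and next;     # parent gets the child PID and moves on
    process_chunk($chunk);   # child runs the SOAP queries for its chunk
    $pm->finish;             # child exits here
}
$pm->wait_all_children;      # parent waits for all workers to finish
# then concatenate the per-chunk output files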

Steve

 
I'll see if it needs improvements in the near future; for now I think it's fine as it is, since the script is only meant to be run once or twice a month.

What would be interesting is to read a newer input file and update the output file by only checking the new entries against the net service, without going through the whole 24-hour process again.
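One way to do that would be to load the existing table first and skip anything already converted (a sketch; it assumes ref_table.txt keeps the uniprot id in its second column, the way the script above writes it):

Code:
# remember which ids the previous run already converted
my %seen;
open my $old, '<', 'ref_table.txt' or die "could not open old table $!";
while (<$old>) {
    my (undef, $up) = split /\t/;
    $seen{$up}++;
}
close $old;

# then, inside the main loop, before the SOAP call:
next if $seen{"up:$up_id"};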
 