
Trying to open a file that is 5 gig large


philyq1 (Programmer) - Sep 8, 2004
I am trying to open a file that is over 5 gig and I get the following error. Does anyone know the limitation on opening a file, or how to get around it without splitting the file?

ERROR Code 2: Opening Input File RE: Value too large for defined data type
 
Try opening just part of the file by heading or tailing it:

from the top of the file, use:
head -200000 input_file

from the bottom of the file, use:
tail -200000 input_file
 
Did you try using 'head' or 'tail' and redirecting the output to another file? You may need to take a portion of the file with these commands to work around the limitation, for example: head -200000 input_file > part_of_file.txt

tikual
 
Thanks tikual and hello99, but is there a way to do it without splitting the file?
 
philyq1 - Apologies if I've misunderstood; are you trying to read, or slurp, the whole file into memory? You shouldn't have any trouble processing the file as long as you don't try to do it all at once.

If it's a text file, you can process it a line at a time - like this:

open(RE, "big_file") or die "Can't open big_file: $!";
open(OUTPUT_FILE, ">copy_of_big_file") or die "Can't open copy_of_big_file: $!";
while (<RE>) {
    $line = $_;    # only the current line is held in memory
    # do something with $line
    print OUTPUT_FILE $line;
}
close(RE);
close(OUTPUT_FILE);
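
For contrast, slurping means pulling the entire file into memory in one go - a minimal sketch of what to avoid with a 5 gig file:

# slurping: list context reads every line of the file at once,
# so all 5 gig would have to fit in memory - avoid for huge files
open(RE, "big_file") or die "Can't open big_file: $!";
my @all_lines = <RE>;
close(RE);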


Mike

"Deliver me from the bane of civilised life; teddy bear envy."

Want to get great answers to your Tek-Tips questions? Have a look at faq219-2884

 
MikeLacey,

That is what I am trying to do but since the file is over 5 gig, when I use the open function it gives me the following error. What do you mean by slurp into memory?

ERROR: Opening Input File RE: Value too large for defined data type

Here is the code I am using

open(DATASOURCES, $file[2]) || die "ERROR: Opening Input File RE: $!\n";
open(NEWDATASOURCES, ">/apps/dw1/prod/pscognosr4/data/02524/datasources/output_file.txt")
    || die "ERROR: Opening Outputfile RE: $!\n";

while (defined($record = <DATASOURCES>))
{
    @input_line = split(/\|/, $record);
    if ($input_line[0] eq "2003-06-01")    # eq for string comparison, not ==
    {
        print NEWDATASOURCES $record;
    }
}
 
Hmmmm

How long is the first line? I mean, how many characters are there before the first newline?

And - does this script give you an error?

open(DATASOURCES, $file[2]) ||
    die "ERROR: Opening Input File RE: $!\n";
while (<DATASOURCES>) {
    # an empty loop, just to read each line in the file
}
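
If there are no newlines in the file, that first read will try to pull in the whole 5 gig as a single record. A quick sketch to check, assuming the same $file[2] input as your script:

# read a single record and report its size, to see whether
# the whole file is effectively one giant "line"
open(DATASOURCES, $file[2]) || die "ERROR: Opening Input File RE: $!\n";
my $first = <DATASOURCES>;
print "first record is ", length($first), " characters\n";
close(DATASOURCES);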

Mike

"Deliver me from the bane of civilised life; teddy bear envy."

Want to get great answers to your Tek-Tips questions? Have a look at faq219-2884

 
Mike,

It fails at the open step; the rest of the program never executes. I tried the same program after heading off just the first 100 lines, and it works. I'm just curious whether anyone can tell me how to open a 5 gig file without splitting it beforehand.
 
Looking around, I found reports of a 2 GB file limitation in PHP generating the same error. One suggestion was to open a pipe to cat to read the file, as cat should be able to handle larger sizes. So instead of

open F, $filename or die $!;

try

open F, "cat $filename |" or die $!;

This assumes a *nix system; I don't have quite as much faith in DOS's "type", but it'd be worth a shot.
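
Applied to your filtering script, it would look something like this - just a sketch, assuming the same $file[2] input and output path as before:

# let cat do the open and feed the data through a pipe, so Perl's
# open() never has to handle the >2 gig file directly
open(DATASOURCES, "cat $file[2] |") || die "ERROR: Opening pipe: $!\n";
open(NEWDATASOURCES, ">/apps/dw1/prod/pscognosr4/data/02524/datasources/output_file.txt")
    || die "ERROR: Opening Outputfile RE: $!\n";
while (defined($record = <DATASOURCES>))
{
    @input_line = split(/\|/, $record);
    print NEWDATASOURCES $record if $input_line[0] eq "2003-06-01";
}
close(DATASOURCES);
close(NEWDATASOURCES);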

________________________________________
Andrew
 
Phily - what version of Perl are you using?

Mike

"Deliver me from the bane of civilised life; teddy bear envy."

Want to get great answers to your Tek-Tips questions? Have a look at faq219-2884

 
Large file support (for files >2 GB) has been built into Perl since 5.6.1. From the documentation:
Large file support
If you have filesystems that support "large files" (files larger than 2 gigabytes), you may now also be able to create and access them from Perl.

NOTE: The default action is to enable large file support, if
available on the platform.

If the large file support is on, and you have a Fcntl constant O_LARGEFILE, the O_LARGEFILE is automatically added to the flags of sysopen().

Beware that unless your filesystem also supports "sparse files" seeking to umpteen petabytes may be inadvisable.

Note that in addition to requiring a proper file system to do large files you may also need to adjust your per-process (or your per-system, or per-process-group, or per-user-group) maximum filesize limits before running Perl scripts that try to handle large files, especially if you intend to write such files.

Finally, in addition to your process/process group maximum filesize limits, you may have quota limits on your filesystems that stop you (your user id or your user group id) from using large files.

Adjusting your process/user/group/file system/operating system limits is outside the scope of Perl core language. For process limits, you may try increasing the limits using your shell's limits/limit/ulimit command before running Perl. The BSD::Resource extension (not included with the standard Perl distribution) may also be of use, it offers the getrlimit/setrlimit interface that can be used to adjust process resource usage limits, including the maximum filesize limit.
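
If you want to check whether your perl was built with large file support, the Config module will tell you - a quick sketch (the same information is available from the command line with perl -V:uselargefiles):

use Config;
# 'define' means this perl was compiled with large file support
print "uselargefiles: ", ($Config{uselargefiles} || 'not set'), "\n";
# 8 means 64-bit file offsets, so files past 2 gig are addressable
print "lseeksize: $Config{lseeksize} bytes\n";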

Mike

"Deliver me from the bane of civilised life; teddy bear envy."

Want to get great answers to your Tek-Tips questions? Have a look at faq219-2884

 