Maximum file size when using FGETS

Status
Not open for further replies.

Gary Sutherland

Programmer
Apr 18, 2012
32
GB
Large CSV files are a topic that has cropped up before, I know, but I can't find a reference to this particular problem.

I'm writing a tool to import address data from the UK PAF.CSV postcodes/addresses file. This is a CSV file containing (at present) 31,827,747 individual UK postal addresses. I'm opening the file with FOPEN and reading each line with FGETS.

To get around the maximum size limit for a DBF, I'm writing these to a set of ten identical tables in blocks of 3,500,000 records, which is working well. The problem I'm encountering is that when it reaches line 25,214,532, FGETS generates an error saying the file is too big. At that point it has already written 714,531 records to table 8, which should still have room for another 2,785,468 records. Remember there's still capacity for 7,000,000 records in tables 9 and 10.

So, this appears to be a limitation of the FOPEN/FGETS functionality in VFP.

Short of using something to split the PAF.CSV file into two files and importing them sequentially, I'm open to suggestions.

Regards
Gary
 
Gary Sutherland said:
The software is finished and on-site.

Good. I assume it was a one-time job to read in that CSV, and since you found the Huge CSV splitter tool before looking into alternatives for reading large files, you solved it that way. No need to solve something twice, of course. I take note of the legal aspect discussion; thanks for that advice, Colin. Gary has already addressed it, so there's no need to follow up.

Technically, I can now recommend using ReadFile. To use it, you first need a file handle (hFile), which you get from another API function: CreateFile.

The name sounds implausible for this purpose; it would be better named CreateFileHandle. Among its parameters you can specify that you only need read access and that only existing files should be opened. See the details in the documentation:

CreateFile documentation said:
[in] dwDesiredAccess

The requested access to the file or device, which can be summarized as read, write, both or 0 to indicate neither.

The most commonly used values are GENERIC_READ, GENERIC_WRITE, or both (GENERIC_READ | GENERIC_WRITE).

CreateFile documentation said:
[in] dwCreationDisposition

An action to take on a file or device that exists or does not exist.
...

OPEN_EXISTING (3) - Opens a file or device, only if it exists. If the specified file or device does not exist, the function fails and the last-error code is set to ERROR_FILE_NOT_FOUND (2).

Finally, you also need CloseHandle.

Together these form the equivalent of the FOPEN, FREAD and FCLOSE trio, just with API functions instead of native VFP functions. They are available in Windows 2000 or later, so practically everywhere, even in XP scenarios.
To get something similar to FGETS, that is, reading a file line by line, use ALINES on the blocks you read. Since you need some constants defined, I'd implement this as a class (based on Custom) for large files, round it out with WriteFile and further API functions from the handleapi family, and then define methods as the interface for VFP usage.
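
As a rough, untested sketch of that idea (plain procedural code rather than a class), something along these lines should work. The DECLARE statements follow the Win32 signatures as they are usually declared from VFP; the 1 MB block size is an arbitrary choice, and ProcessLine is a hypothetical placeholder for whatever the import does with each line. It also assumes no quoted CSV field contains an embedded line break.

```foxpro
* Sketch: read a large text file in blocks through the Windows API
* and pass each complete line to a (hypothetical) ProcessLine routine.
#DEFINE GENERIC_READ          0x80000000
#DEFINE FILE_SHARE_READ       1
#DEFINE OPEN_EXISTING         3
#DEFINE FILE_ATTRIBUTE_NORMAL 128
#DEFINE INVALID_HANDLE_VALUE  -1
#DEFINE BLOCK_SIZE            1048576   && bytes per ReadFile call (arbitrary)

DECLARE INTEGER CreateFile IN kernel32 ;
    STRING lpFileName, INTEGER dwDesiredAccess, INTEGER dwShareMode, ;
    INTEGER lpSecurityAttributes, INTEGER dwCreationDisposition, ;
    INTEGER dwFlagsAndAttributes, INTEGER hTemplateFile
DECLARE INTEGER ReadFile IN kernel32 ;
    INTEGER hFile, STRING @lpBuffer, INTEGER nNumberOfBytesToRead, ;
    INTEGER @lpNumberOfBytesRead, INTEGER lpOverlapped
DECLARE INTEGER CloseHandle IN kernel32 INTEGER hObject

LOCAL lnHandle, lcBuffer, lnRead, lcPending, lnLastLF, lnCount, lnI
LOCAL ARRAY laLines[1]

* Open for reading only, and only if the file already exists
lnHandle = CreateFile("PAF.CSV", GENERIC_READ, FILE_SHARE_READ, 0, ;
    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0)
IF lnHandle = INVALID_HANDLE_VALUE
    ? "Could not open PAF.CSV"
    RETURN
ENDIF

lcPending = ""    && text read so far that does not yet end in a line feed
DO WHILE .T.
    lcBuffer = REPLICATE(CHR(0), BLOCK_SIZE)
    lnRead = 0
    IF ReadFile(lnHandle, @lcBuffer, BLOCK_SIZE, @lnRead, 0) = 0 OR lnRead = 0
        EXIT      && read error or end of file
    ENDIF
    lcPending = lcPending + LEFT(lcBuffer, lnRead)

    * Split at the last line feed; whatever follows it is an incomplete line
    lnLastLF = RAT(CHR(10), lcPending)
    IF lnLastLF = 0
        LOOP      && no complete line in the buffer yet
    ENDIF
    lnCount = ALINES(laLines, LEFT(lcPending, lnLastLF))
    lcPending = SUBSTR(lcPending, lnLastLF + 1)

    FOR lnI = 1 TO lnCount
        IF NOT EMPTY(laLines[lnI])    && skip empty lines
            ProcessLine(laLines[lnI])
        ENDIF
    ENDFOR
ENDDO

* A last line without a trailing line break is still pending here
IF NOT EMPTY(lcPending)
    ProcessLine(lcPending)
ENDIF
CloseHandle(lnHandle)

PROCEDURE ProcessLine(tcLine)
    * Hypothetical placeholder: parse tcLine and insert it into the current table
ENDPROC
```

Because ReadFile simply continues from the current file position, no seek is needed and no file offset has to pass through a VFP numeric variable. Only the text after the last line feed is carried over to the next block, so a CR/LF pair that straddles two blocks is not split into two lines.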


Chriss
 
No problem, Colin. I'm constrained in part by the application I have to integrate this with.

Regards
Gary
 
