Parsing a Tab-Delimited File

Status
Not open for further replies.

shawnwmorris

Programmer
May 4, 2017
36
US
I have been attempting with VFP 9.0 to parse a tab-delimited file. The file contents look like this:
Policy 1 LocNum 1 Address 1 City 1 State 1 Zip 1
Policy 2 LocNum 2 Address 2 City 2 State 2 Zip 2
Policy 3 LocNum 3 Address 3 City 3 State 3 Zip 3
Policy 4 LocNum 4 Address 4 City 4 State 4 Zip 4
Policy 5 LocNum 5 Address 5 City 5 State 5 Zip 5

I am trying to learn how to access individual "fields" in the data.

My code doesn't work correctly (not surprised, I am a newbie). My attempts give me all of the data for one row repeated for the number of rows in the file. Here is my code:
Code:
IF ALINES( laFile, FILETOSTR("c:\temp\abc.txt")) > 0  && Load to array, by line. If textfile is not empty   
	FOR lnLine = 1 TO ALEN( laFile,1)  && Do for each line of the file
		 ?laFile(lnLine,2) 
	ENDFOR
ENDIF

I would appreciate any help to point me in the right direction.
 
See Import and Append in Help.

For kicks, also try the Import Wizard. Tools->Wizards->Import.
 
The Wizards won't work for this scenario; I am trying to use the data to create some custom SQL statements.
 
Shawn,
If the data truly lines up that way, skip the tab-delimited handling and treat it as fixed length.
Then you can just create the structure you want in a table like:

Type1 - C7
Value1 - I
Type2 - C7
Value2 - I
Type3 - C8
Value3 - I
Type4 - C5
Value4 - I
Type5 - C6
Value5 - I
Type6 - C4
Value6 - I

Or, if all the numeric values will be treated as characters, you could make them C1.

Then all you have to do is the following (assuming the table is called "PARSE"):

SELECT PARSE
APPEND FROM <textfile.ext> SDF

And you will have all the data in the file. <textfile.ext> is the name of the text file that contains the data to be imported.

Note that the "text" description fields consume the " " character between them and their data. You could get fancier and cut those values out, but this is a very fast and clean way to do what you're doing. You could also create another table afterwards with the same field names but the text fields one character shorter, and append to it from your imported table; the trailing blank space will be truncated off automatically.
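
If it helps, here is a rough sketch of that, using the all-character variant; the field names and widths just follow the list above, so adjust them to your real data:

Code:
* Sketch only: fixed-width import table, widths taken from the list above
CREATE TABLE PARSE FREE ( ;
   Type1 C(7), Value1 C(1), ;
   Type2 C(7), Value2 C(1), ;
   Type3 C(8), Value3 C(1), ;
   Type4 C(5), Value4 C(1), ;
   Type5 C(6), Value5 C(1), ;
   Type6 C(4), Value6 C(1))

SELECT PARSE
APPEND FROM textfile.txt SDF  && textfile.txt stands for your real file name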


Best Regards,
Scott
MIET, MASHRAE, CDCP, CDCS, CDCE, CTDC, CTIA, ATS

"Everything should be made as simple as possible, and no simpler."[hammer]
 
Have you used the Debugger to see what's in the array laFile after you execute ALINES()?

Tamar
 
What's your background? Do you come from MySQL and like the way a MySQL dump creates an SQL script to recreate a database? I think transferring data by creating repeated INSERT statements is the worst approach.

One of the core text-only data formats is CSV (comma-separated values), and using tabs instead of commas as the delimiter is not far off.

VFP has APPEND for appending such a file to a DBF, so this is the way to go instead of trying to generate hundreds, thousands or even more SQL INSERTs. I always thought of that as an outright stupid solution on the part of a MySQL developer and can't understand its success up to today.

Code:
TEXT TO lcImportfile NOSHOW
Policy 1	LocNum 1	Address 1	City___1	State__1	Zip 1
Policy 2	LocNum 2	Address 2	City___2	State__2	Zip 2
Policy 3	LocNum 3	Address 3	City___3	State__3	Zip 3
Policy 4	LocNum 4	Address 4	City___4	State__4	Zip 4
Policy 5	LocNum 5	Address 5	City___5	State__5	Zip 5
ENDTEXT
STRTOFILE(lcImportfile,"yourfile.txt")

CREATE CURSOR crsImport (vPolicy V(254), vLocNum V(254), vAddress V(254), vCity V(254), vState V(254), vZip V(254))
[b]APPEND FROM yourfile.txt TYPE  DELIMITED WITH TAB[/b]
BROWSE
It's just that one APPEND line you need, plus a DBF having the necessary columns; the rest of the code is just there to demo, test, and verify that the file is imported correctly if it looks the way you showed. Notice that the VFP editor options can turn tabs into spaces, and this forum can too, so to test this make sure there are real tabs between the values, or just use your own file. I intentionally added some underscores in the City and State columns to make the tabs stand out as a lengthy gap.

But what's most important, you just need this one command to import such files.

If the data in the tab-delimited file comes in specific types, you don't have to go through the V (varchar) or char type for the import, but that's surely the failsafe first conversion from a text file to a DBF. You can also directly get integers and other types from CSV or tab-delimited files; there are just several rules to learn, e.g. dates are best given in the format yyyymmdd, not in any human-readable formatted way. That's why it is sometimes best to just import the data (text) as is into textual columns, as I did here, and then convert from this staging table/cursor to the final fields you want to fill, applying a transformation. This can also be a single [tt]INSERT INTO finaltable SELECT transformation1(field1), transformation2(field2)... FROM crsImport[/tt]. The transformation expressions (unfortunately) are something you can't already do within the APPEND; that command has a few options, including a field list, but not a list of expressions as you can specify in SQL statements. How could it, without the text file defining field or column names?

Last but not least, don't take transformation1 literally and define functions with that name; it can be any expression, like VAL(fieldx) to turn text into a number, or CAST(fieldy AS M) to turn the string fieldy into a memo, or whatever suits the conversion. In the end it is a two-step import then, which often can be just a one-step import, too. This doesn't make the file read faster, but it means much less code to execute and much smaller files to exchange than what a MySQL dump generates.
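
To make that concrete, a sketch (finaltable and its field names are invented here just for illustration, and the expressions are only examples of possible transformations):

Code:
* Sketch only: move the staged text into typed fields of a (hypothetical) finaltable.
* VAL(SUBSTR(vLocNum, 8)) e.g. turns the trailing digit of "LocNum 1" into the number 1.
INSERT INTO finaltable (cPolicy, cAddress, cCity, cState, cZip, nLocNum) ;
   SELECT ALLTRIM(vPolicy), ALLTRIM(vAddress), ALLTRIM(vCity), ;
          ALLTRIM(vState), ALLTRIM(vZip), VAL(SUBSTR(vLocNum, 8)) ;
   FROM crsImport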

Bye, Olaf.
 
By the way: I chose V() = varchar fields with the maximum length, so you can easily find out the (most probably lower) max lengths actually needed after the APPEND this way:

Code:
SELECT MAX(LEN(vPolicy) ) as nLenPolicy  ;
   ,   MAX(LEN(vLocNum) ) as nLenLocNum  ;
   ,   MAX(LEN(vAddress)) as nLenAddress ;
   ,   MAX(LEN(vCity)   ) as nLenCity    ;
   ,   MAX(LEN(vState)  ) as nLenState   ;
   ,   MAX(LEN(vZip)    ) as nLenZip     ;
   FROM crsImport

If you chose char fields instead, you'd need to RTRIM or ALLTRIM them. The lengths determined here are the minimum lengths needed to store all values of the text file. Some may be fixed by definition, e.g. the ZIP length; some may still need to be set higher, as there may be longer city names not yet in the data. But you get an impression of the typical data size, especially if you don't yet have a final destination table for the data, or you can check whether an already existing target DBF would cut off some values because its fields are not dimensioned wide enough.

Bye, Olaf.
 
My goal isn't to append to a table, it is to update tables. Will this approach be the same?
 
My goal isn't to append to a table, it is to update tables. Will this approach be the same?

Definitely. Once you have the data in a table (a DBF), it will be easy to use that data to update other tables. That's true regardless of whether the target tables are DBFs or are in a back-end database like MySQL. You would simply use the normal INSERT, UPDATE, REPLACE commands or whatever else takes your fancy to do the update.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
There is no command that updates from a text file directly, so you go through such a staging cursor or table anyway. It would be cumbersome to instead read the text line by line, parse it and generate update statements; you do the update from the data once it is in a cursor, as in the one I created.

Then you for example do:
Code:
UPDATE targettable FROM crsImport SET field1 = crsImport.field1, field2= crsImport.field2, ... etc... WHERE targettable.ID = crsImport.ID

Notice: this will update all records having an ID match in crsImport, not just one record, so it should perform nicely compared with reading the txt line by line, parsing it and generating single-record updates.
IDs you have in crsImport but not yet in your targettable can then be inserted into targettable, after you delete from crsImport all records with already existing IDs, which you already used for the updates; that means:
Code:
DELETE FROM crsImport WHERE ID IN (SELECT ID from targettable)
and finally, as crsImport now only has rows with a new ID:
Code:
INSERT INTO targettable SELECT * FROM crsImport

Warning: this is an oversimplification, as you need to take care of matching names, field order and types. You will typically process the crsImport generated for the APPEND FROM the tab-delimited file into a final staging table or cursor, but as I already said, in the simplest case crsImport can already be correct in the first step, if the text values can be converted to the right data types you need in your target table. Or you do the transformation in this step, i.e. SET field1 = transformation1(crsImport.field1, crsImport.field2) or such things, if two fields are merged into one (see the example below).
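
For example (field names like firstname, lastname and fullname are just invented here to show the idea):

Code:
* Hypothetical example: two staging fields merged into one target field during the update
UPDATE targettable ;
   SET fullname = ALLTRIM(crsImport.firstname) + " " + ALLTRIM(crsImport.lastname) ;
   FROM crsImport ;
   WHERE targettable.ID = crsImport.ID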

In your sample case I don't know what field to use as a unique ID, but you should have something, perhaps two or three fields combined, that matches a row against the already imported data, so you can update it with the new address or price or whatever you intend to import from a new version of the text file. Ideally the text file will have IDs that make clear which record each row is.

And in the special case that the TXT file is always the up-to-date data of what you want in some DBF, then not trying to merge data, but simply importing the full text file as the newest state, is the solution. So if you get a full list of persons or stores and their addresses from some company as a text file download, and this text file is not just the changes since the last export but always the full list, then there is no need to merge; you empty your DBF and fill it completely anew. So what you do also depends very strongly on what the text file contains. It does not make sense to import a full set of data via UPDATEs, as that just overwrites all records you have with their new state; you can simply recreate the DBF from the new text file.
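
In that full-list case the import can be as simple as this sketch (assuming the DBF columns match the text columns, as in the crsImport demo above):

Code:
* Sketch of a full refresh: throw away the old rows, reload the complete file
USE targettable EXCLUSIVE   && ZAP needs exclusive access
SET SAFETY OFF              && skip the "Zap ...?" confirmation
ZAP
APPEND FROM yourfile.txt TYPE DELIMITED WITH TAB
USE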

The topic is not straightforward; one thing most often unknown is which records were deleted in comparison with the last export, because, well, deleted data does not get exported from the database the text file comes from.

Bye, Olaf.
 
Shawn, are you making progress? Did our suggestions solve your problem?

If so, to show appreciation for the help received, and to help others who may be looking at this post for help with their own issue, click on the Great post! link in the most helpful post to award a star to whoever helped you. A popup will ask you to confirm; if you have a popup blocker, make an exception for Tek-Tips (you should need that to log in anyway).

And of course, ask about what you didn't understand.

Bye, Olaf.

 
Hello Everyone,
I am working on digesting all of this information right now. I am thankful for everyone's input; it is certainly appreciated. I will update soon as to which approach I take. I will probably need some more guidance on this project. Thanks again.

Shawn Morris
 
So I think after re-reading all of the great insight here that I am going to do the following:
1. Create a table from the text file
2. Create a massive update statement, i.e. update table1 set table1.field1=Newtable.field1 WHERE Table1.key1=NewTable.Key1 AND Table1.secondKey = NewTable.secondKey
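
In VFP syntax that would be roughly (just a sketch using the placeholder names from above):

Code:
* Rough sketch of step 2 as one set-based update
UPDATE table1 ;
   SET field1 = NewTable.field1 ;
   FROM NewTable ;
   WHERE table1.key1 = NewTable.key1 ;
     AND table1.secondKey = NewTable.secondKey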

I am open to any suggestions as to other approaches. Thanks again for all of the help.
 
That's fine, almost what I suggested, just using a table instead of a cursor. OK, just notice that you need to ZAP it before the next import, or you keep all previously imported data and always append to it, until it maybe fails on the 2 GB limit. If your text file is large, that can easily happen. A cursor gets disposed of automatically, and the CREATE CURSOR statement regenerates it empty next time.

I wonder how you will build an import table from the text file, but you'll have an idea about that. The demo code I gave needs to know what it imports, to have a cursor with the appropriate number, sizes and types of fields. These things show in neither CSV nor tab-delimited text files as header data or accompanying definitions; such a file just contains the pure data and expects you to know its structure.

As you have a goal of where this data should end up, you should have an idea of what the data is. There is no general text import or appending, even less a generation of import or staging cursors or tables. You can use a cursor with at most 254 char(254) fields as a very general import structure; unused fields will simply remain empty. Memo fields could take longer strings, but won't be target fields of APPEND, so you're bound to classic char fields for import in the very general case. And 254 is not only the length limit for char fields but also roughly the maximum number of fields (maybe it's 255, but that's the magnitude, and both limits are below typical text file needs anyway). But even such a universally usable cursor, with say c001 to c254 as C(254) fields, still does not make the data end up where you need it to be. You need to know what data to expect in which field number to get it into the final target table; there's no way around that, and that is the core developer task in such an import job. If that could be decided by code, a general import routine, command or function would exist.
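
Just to illustrate what such a "universal" staging cursor could look like (a sketch only; I wouldn't recommend it over knowing your structure):

Code:
* Sketch: a staging cursor with 254 generic C(254) fields, built from a structure array
DIMENSION laStru[254, 4]
FOR lnI = 1 TO 254
   laStru[lnI, 1] = "c" + PADL(TRANSFORM(lnI), 3, "0")  && field names c001..c254
   laStru[lnI, 2] = "C"
   laStru[lnI, 3] = 254
   laStru[lnI, 4] = 0
ENDFOR
CREATE CURSOR crsUniversal FROM ARRAY laStru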

So overall, the process of writing code for a text file import always includes analysis of the structure, and if you feel like inventing a limited type-inferring artificial intelligence, you might take a try at a general import routine that detects field types by parsing the text lines. It breaks down to analyzing a portion of characters between tabs (or, of course, line begin/end) for what it could be: float, integer, date, etc. But even solving that wouldn't tell you whether a number is a price, a postal code, a count, a measurement in some unit, or whatever else a number may represent.

If your text files vary every time, this is not something to write one import routine for; maybe just some helper routines of general interest, like conversion routines from character to other types (see the sketch below).
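
As an example of what I mean by such a helper (just a sketch, assuming dates arrive as yyyymmdd text):

Code:
* Hypothetical helper: turn a yyyymmdd string into a VFP Date
FUNCTION Ymd2Date(tcYmd)
   IF VARTYPE(tcYmd) <> "C" OR LEN(ALLTRIM(tcYmd)) < 8
      RETURN {}   && empty date for unusable input
   ENDIF
   RETURN DATE(VAL(LEFT(tcYmd, 4)), VAL(SUBSTR(tcYmd, 5, 2)), VAL(SUBSTR(tcYmd, 7, 2)))
ENDFUNC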

If this is a regular import, you should assume changing data, obviously, but only a rarely changing schema of fields. And you should also know that this type of data file does not come with a schema description. But it is magnitudes harder to aim for a general import than to design your import code around the sample file, which is what you typically will have to do.

Bye, Olaf.
 
Olaf,
As always, your information is great and helpful.

I can use the Import Wizard to create the new table. I plan on deleting the tables once I am done.

Thanks again
 
I see. Yes, the Import Wizard is as close as VFP gets to AI, but step 2 will ask you to determine the data format. One of the settings it offers is which row the field names may be in; if there is no such row, you get no meaningful field names, and step 3 asks you about field types. This is about the same as what a CREATE CURSOR asks you to do, just interactively. I'd say more cumbersome than coding it, but that's your choice.

Bye, Olaf.
 
There is a logic flaw in my update statement. Should I post that question here or in a new thread?
 
New one.

Best Regards,
Scott
MIET, MASHRAE, CDCP, CDCS, CDCE, CTDC, CTIA, ATS

"Everything should be made as simple as possible, and no simpler."[hammer]
 
I agree, start a new thread. I just wonder, if this is a one-time job, why you don't import into the existing table, or make the new table that the Import Wizard generates your final target table.

Bye, Olaf.
 
Actually, I have figured out the problems with this project.

I am importing the data into a new table with the wizard and then doing an update join query against the target table. The target table is so large, and I am only updating two fields, that it made more sense to me to do only a small update query. It isn't pretty but it works. I will close this thread. It was very helpful and I have learned quite a bit about FoxPro.

I have a new project coming up (which will probably require some more input from the community here) but must finish this one first. Thanks again everyone for your guidance.
 