Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Importing Data from a Text File 1

Status
Not open for further replies.

David Higgs

Programmer
May 6, 2012
392
GB
In Amateur Radio circles there are numerous Software Based Logbooks. There is a specification called ADIF (Amateur Data Interchange Format) which Software Authors are encouraged to follow.

In example, I have a Text File with the following content.

Text File Contents
<Callsign:5>A1ABC <name:3>Joe <frequency:8>7.163000
<band:3>40m <mode:3>SSB <qso_date:8>20170220
<EOR>
<Callsign:5>A2DEF <name:5>David <frequency:8>7.160000
<band:3>40m <mode:3>SSB <qso_date:8>20170220
<WAB_Area:4> SD55
<EOR>
<Callsign:5>A1GHI <name:4>Fred <frequency:8>7.126000
<band:3>40m <mode:3>SSB <qso_date:8>20170220
<EOR>

The format of the above is as follows:-
<Callsign:5>A1ABC
<Field Name:Length of Data>Data which gives a Field Name of Callsign and Data A1ABC (In this instance the Data = 5 Characters, it can vary in length)
Each Record in the Text File can have some or all of the Fields and the Fields could possibly be in random order.

The full specification of the ADIF file is available here.

How would I go about Importing and Converting the original Text file to achieve a Table with the following format?
Table Contents after 'Import"

Callsign Name Frequency Band Mode Date
A1ABC Joe 7.163000 40m SSB 20170220
A2DEF David 7.160000 40m SSB 20170220
A1GHI Fred 7.126000 40m SSB 20170220



Regards,
David
 
Ok, David, then probably is a difference between the machines we both are using for this.

Changed the ADI source to

Code:
LOCAL ADISource AS String
LOCAL ADISampleEntry AS String

TEXT TO m.ADISampleEntry TEXTMERGE NOSHOW
<Callsign:5>A1ABC <name:3>Joe <frequency:8>7.163000
<band:3>40m <mode:3>SSB <qso_date:8>20170220
<<REPLICATE("<dummy:3>some data",60)>>
<EOR>
ENDTEXT

m.ADISource = REPLICATE(m.ADISampleEntry, 4000)

which gives 4000 entries of 66 fields, of which 6 are read into the cursor, and this is done in 2-3 seconds.
 
For comparison, using my Laptop (which I've used throughout these tests) it took 6 Secs to complete the test with the above code. I will do some tests on my Desk Top PC which is a better spec system.


Regards,
David
 
I don't know, all versions of atlopes code read in all adifields, no matter how many of them you put into your cursor or not. The only difference could really be about compobj, the hdd speed may also play a role, especially if you compare running times on different computers. Caching falsifies the time needed by further runs. Change order of what you run and the first code you run will really read from hdd, further runs read from cache, that is OS file cache, not vfp dbf/data related caches or buffers. I think you have to live with the minute run, as you say yourself it's acceptable anyway. Don't pay too much attention on why things run with what speed here.

Bye, Olaf.
 
I think I have found the reason why my system was taking up to 50 Secs to complete the Data Import.

If I change:-

Code:
m.ADISource = ALLTRIM(CHRTRAN(FILETOSTR("C:\temp\TEST.ADI"), CHR(13) + CHR(10) + CHR(9), ""))

to:

Code:
m.ADISource = FILETOSTR("C:\temp\TEST.ADI")

The process time for 3124 Records is now down to 4 Secs ( between 1 & 2 Secs on my Desktop PC ). I don't think the change will have any implications on the functioning of the program?

Regards,
David
 
That depends if you're trimming leading or trailing whitespace off the data afterward in the process, or not. Except for the first version, that trimming is being done, so keeping the trimming beforehand is unnecessary and you can avoid it.

Evaluating [tt]TYPE("m.ADIBuffer." + m.ADIField)[/tt] just once and testing for "U" values before going into the [tt]DO CASE[/tt] structure cuts, even more, the required time to import the logs (on my sample imaginary data, that goes from 2-3 to 1-2 seconds).
 
atlopes said:
Evaluating TYPE("m.ADIBuffer." + m.ADIField) just once and testing for "U" values

Sorry, I don't understand, what is "U" values?

This is the current code:-
Code:
CREATE CURSOR ADILog ;
	(QSO_Date D, Time_On Varchar(8), Time_Off Varchar(8), Call Varchar(20), RST_Sent Varchar (8), RST_RCVD Varchar (8), Name Varchar(50), Freq B(6), Band Varchar(8),;
	 Mode Varchar(6), Address Varchar(8), CNTY Varchar(8), Comment Varchar(8), Cont Varchar(8), Country Varchar(8))
*
*	Import Data
*
LOCAL ADISource AS String
	m.ADISource = FILETOSTR("C:\temp\TEST.ADI")	&&	Sends lines of text to memory variable (m.ADISource)
*
* Skip header, if it exists
*
IF LEFT(m.ADISource, 1) != "<" && not equal to "<"
	m.ADISource = SUBSTR(m.ADISource, AT_C("<EOH>", m.ADISource) + LEN("<EOH>"))
ENDIF
*
*	Declare Local
*
LOCAL ARRAY ADIRecords[1]
LOCAL ADIRecord AS String
LOCAL ADIField AS String
LOCAL ADIValue AS String
LOCAL RecordsIndex AS Integer
LOCAL ADIBuffer AS Object
LOCAL ADIIsEmpty AS Boolean

* Go through each log entry
FOR m.RecordsIndex = 1 TO ALINES(m.ADIRecords, m.ADISource, 1+4+8, "<EOR>")	

		&&	ALINES() copies each line in a character expression 
		&&	or 
		&&	memo field to a corresponding row in an array and returns the number of lines copied.
		
		&&	1 = Removes leading and trailing spaces from lines, or for Varbinary and Blob values, removes trailing zeroes (0) instead of spaces.
		&&	4 = Do not include empty elements in the array.
		&&	8 = Specifies case-insensitive parsing.

	m.ADIRecord = m.ADIRecords[m.RecordsIndex]
	*
	* prepare an empty buffer to hold the data
	*
	SCATTER BLANK NAME m.ADIBuffer
	
	m.ADIIsEmpty = .T.		&& Mark String as Empty

	DO WHILE !EMPTY(m.ADIRecord)
		* fetch each field of the log entry
		m.ADIField = STREXTRACT(m.ADIRecord, "<", ":")			&&	Retrieves a string (Field Name) between two delimiters.
																	&&  
		m.ADIValue = STREXTRACT(m.ADIRecord, ">", "<", 1, 2)	&&	Retrieves a string (Field Value) between two delimiters.
		*
		* adjust the value to the data type of the cursor
		*
		DO CASE
		CASE TYPE("m.ADIBuffer." + m.ADIField) == "C"	&&	Character Field
			m.ADIBuffer.&ADIField. = ALLTRIM(m.ADIValue, 0, " ", CHR(13), CHR(9), CHR(10))	&& CHR(9) Horizontal TAB, (13) Carriage Return (10) Linefeed
			m.ADIIsEmpty = .F.
		CASE TYPE("m.ADIBuffer." + m.ADIField) == "N"	&&	Numeric Field
			m.ADIBuffer.&ADIField. = VAL(CHRTRAN(m.ADIValue, ".", SET("Point")))
			m.ADIIsEmpty = .F.
		CASE TYPE("m.ADIBuffer." + m.ADIField) == "D"	&&	Date Field
			m.ADIBuffer.&ADIField. = EVALUATE("{^" + TRANSFORM(m.ADIValue, "@R 9999-99-99") + "}")
			m.ADIIsEmpty = .F.
		ENDCASE
		
		m.ADIRecord = IIF(AT("<", m.ADIRecord, 2) = 0, "", SUBSTR(m.ADIRecord, AT("<", m.ADIRecord, 2)))
		
	ENDDO
	*
	* whatever it was found in the entry goes to the cursor
	*
	IF !m.ADIIsEmpty										&&  If String not empty
		INSERT INTO ADILog FROM NAME m.ADIBuffer		&&	Copy to adif_log whose property names match the field names in the table
	ENDIF													&&

ENDFOR

BROWSE



Regards,
David
 
Undefined values, in this context meaning that there is no field in the log cursor to store the field of the ADIF file.

Code:
CREATE CURSOR ADILog ;
	(QSO_Date D, Time_On Varchar(8), Time_Off Varchar(8), Call Varchar(20), RST_Sent Varchar (8), RST_RCVD Varchar (8), Name Varchar(50), Freq B(6), Band Varchar(8),;
	 Mode Varchar(6), Address Varchar(8), CNTY Varchar(8), Comment Varchar(8), Cont Varchar(8), Country Varchar(8))
*
*	Import Data
*
LOCAL ADISource AS String
	m.ADISource = FILETOSTR("C:\temp\TEST.ADI")	&&	Sends lines of text to memory variable (m.ADISource)
*
* Skip header, if it exists
*
IF LEFT(m.ADISource, 1) != "<" && not equal to "<"
	m.ADISource = SUBSTR(m.ADISource, AT_C("<EOH>", m.ADISource) + LEN("<EOH>"))
ENDIF
*
*	Declare Local
*
LOCAL ARRAY ADIRecords[1]
LOCAL ADIRecord AS String
LOCAL ADIField AS String
LOCAL ADIValue AS String
[highlight #FCE94F]LOCAL ADIType AS Character[/highlight]
LOCAL RecordsIndex AS Integer
LOCAL ADIBuffer AS Object
LOCAL ADIIsEmpty AS Boolean

* Go through each log entry
FOR m.RecordsIndex = 1 TO ALINES(m.ADIRecords, m.ADISource, 1+4+8, "<EOR>")	

		&&	ALINES() copies each line in a character expression 
		&&	or 
		&&	memo field to a corresponding row in an array and returns the number of lines copied.
		
		&&	1 = Removes leading and trailing spaces from lines, or for Varbinary and Blob values, removes trailing zeroes (0) instead of spaces.
		&&	4 = Do not include empty elements in the array.
		&&	8 = Specifies case-insensitive parsing.

	m.ADIRecord = m.ADIRecords[m.RecordsIndex]
	*
	* prepare an empty buffer to hold the data
	*
	SCATTER BLANK NAME m.ADIBuffer
	
	m.ADIIsEmpty = .T.		&& Mark String as Empty

	DO WHILE !EMPTY(m.ADIRecord)
		* fetch each field of the log entry
		m.ADIField = STREXTRACT(m.ADIRecord, "<", ":")			&&	Retrieves a string (Field Name) between two delimiters.
																	&&  
		m.ADIValue = STREXTRACT(m.ADIRecord, ">", "<", 1, 2)	&&	Retrieves a string (Field Value) between two delimiters.
		*
		* adjust the value to the data type of the cursor
		*
[highlight #FCE94F]		m.ADIType = TYPE("m.ADIBuffer." + m.ADIField)

		IF m.ADIType != "U"

			m.ADIIsEmpty = .F.

			DO CASE
			CASE TYPE("m.ADIBuffer." + m.ADIField) == "C"	&&	Character Field
				m.ADIBuffer.&ADIField. = ALLTRIM(m.ADIValue, 0, " ", CHR(13), CHR(9), CHR(10))	&& CHR(9) Horizontal TAB, (13) Carriage Return (10) Linefeed
			CASE TYPE("m.ADIBuffer." + m.ADIField) == "N"	&&	Numeric Field
				m.ADIBuffer.&ADIField. = VAL(CHRTRAN(m.ADIValue, ".", SET("Point")))
			CASE TYPE("m.ADIBuffer." + m.ADIField) == "D"	&&	Date Field
				m.ADIBuffer.&ADIField. = EVALUATE("{^" + TRANSFORM(m.ADIValue, "@R 9999-99-99") + "}")
			ENDCASE
		ENDIF
[/highlight]
		m.ADIRecord = IIF(AT("<", m.ADIRecord, 2) = 0, "", SUBSTR(m.ADIRecord, AT("<", m.ADIRecord, 2)))
		
	ENDDO
	*
	* whatever it was found in the entry goes to the cursor
	*
	IF !m.ADIIsEmpty										&&  If String not empty
		INSERT INTO ADILog FROM NAME m.ADIBuffer		&&	Copy to adif_log whose property names match the field names in the table
	ENDIF													&&

ENDFOR

BROWSE
 
Hi António,

Thank you for your explanation of the "Undefined" Fields and also for your revised code which took approximately 3 Secs on my Laptop and 1 Sec on my Desktop PC to process the imported data, that is pretty fast!!

Well, I don't there's much more to improve on so I can now start adding the additional fields and bring this exercise to a conclusion. I very much appreciate all of the help you have provided; as I said at the outset, I didn't expect to get a working solution on my first ask.

Regards,
David
 
That's great to read, David, and you're quite welcome.

The only prospective problems that you may face ahead are related to field names (for instance, having spaces in it, which won't be allowed in VFP) or length. Besides that, I think that you have a basis for a working solution (you may think of adding error handling, and so on).

António
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top