Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Load Excel file to Grid or cursor to analyze (without knowing anything about the fields)

Status
Not open for further replies.

dantheinfoman

Programmer
May 5, 2015
131
US
Hi All,

I'm wondering if it's possible to load an excel file into a cursor or directly to a grid (or just some way to analyze the column headers or record number 1 or 2 in foxpro). So consider this crummy example excel file. Let's also pretend that the commas below represent delimiting between different cells in Excel.

Emp, LName, Fname
100, Jones, Billy
101, Smith, John
102, Doe, Jane

I'm wanting a way to look at items without having to know what fields are (character, logical, etc) nor how big they are. I just want to look at the data as it is and then determine how I want to classify it later.

Anyone have any experience in analyzing excel files and importing or manipulating the data. I just want to look at it using Foxpro without knowing anything about the order of Excel fields, because files may be different all the time.

Thanks

Dan
 
I am currently working on a VFP class to be able to read an XLSX file and load it into cursors. Once loaded you will be then able to query each cell (column:row pair) and determine what data type, data value, and if it is a formula. The class can presently read and load the XLSX file -- I have to work on the methods to then retrieve the cell information.

This class first started with the ability to write an XLSX file and this part is fully functional. It will be posted to the Foxite forum when it is complete (the last version posted supports writing to XLSX formats).
 
One option would be to create a temporary table with x number of character fields, x representing the maximum number of columns your input file can have. Define each field with a length of y characters, y representing the maximum column width your input data can have. Import data from your ',' delimited excel file to this temporary table and browse the data imported.

 
Anyone have any experience in analyzing excel files and importing or manipulating the data. I just want to look at it using Foxpro without knowing anything about the order of Excel fields, because files may be different all the time.

There's good news and bad news here.

The good news is that the VFP import wizard is GREAT at this kind of thing! The bad news is that it only works on XLS files (and not even the most recent of them). The other bad news is that XSLX files are too recent to be supported.

(Of course XSLX files are really just ZIP files with some XML data inside them.)

Since YOU (yes, you) have the full source for the VFP wizards in your VFP installation, you can feel free to st^H^H borrow any techniques in there that you find useful.

There is code floating around for reading/writing XSLX files if your Google fu is up to it.
 
Hi all

Thanks for your posts.

So assuming I was cool with receiving XLS format for the time being, I think everyone's saying I cannot simply toss it into a cursor and immediately analyze it, but rather I must create a table from the XLS file instead of a temporary cursor.

If that's true, what kind of commands can make that happen please?

Thanks for your thoughtful consideration!

Dan
 
So assuming I was cool with receiving XLS format for the time being, I think everyone's saying I cannot simply toss it into a cursor and immediately analyze it, but rather I must create a table from the XLS file instead of a temporary cursor.

You do realize that "table" and "cursor" are the same thing, right?

 
Check this thread : thread184-1633357
I follow these steps
Use oledb to import to a cursor of memo fields.
calculate the max used length of each field(n). if it is less than 255, then alter the field type to a C(n)
for each field, find the first non-empty data, most often than not, that is a column heading.
compare each heading to a list of headings you deal with most often so that you can standardize, and suggest field names. offer a default column name that matches the excel column, such as col_a
some information in the column names can also help you determine the final data type
'pct', 'amt', 'date', etc.
 
Anyone have any experience in analyzing excel files and importing or manipulating the data.

Yes. I recently developed an application that does just that.

I did it by creating a cursor that contains x fields, each field being a character field of y characters, where x is the largest number of columns the worksheet is likely to contain, and y is each column's largest likely width.

I then used the APPEND FROM command to read the worksheet into the cursor.

And then I looped through the cursor, one field at a time, to determine the data type, size and precision of each field. This was a bit hit and miss, but it worked reasonably well. For example, if a given field only contained digits, spaces and certain symbols (minus sign, comma, full stop), then I determined it was numeric. And so on.

Finally, I created a new cursor. The fields in this new cursor matched the data typse, widths and precision that I determined in the previous step. I then generated a SQL SELECT statement to copy the data from the first to the second cursor, changing the data types, etc. along the way.

The whole thing was about 200 lines of code. It has been in use for several months now, and it seems to work OK almost all the time.

Another option would be to use the IMPORT command, which does most of the above for you. In my case, I found my own solution more reliable, but that might not be the case for you.

Mike





__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Thanks Mike!

I'm working on this, but it's slow and crappy. Thanks for the push in the right direction. I'm trying to figure out how to do the resize thing as well as how to analyze the data types. Are you using SCAN and going through the top ten or so results? How do you determine the data types if it's all considered Character?

Thanks for any insight!

Dan
 
Dan,

I'll need to remind myself about this. I'll look up my code, and report back.

In the meantime, don't neglect the IMPORT command. I decided not to use it because my worksheets were too complex, but it might well work for you. If you haven't already done so, give it a try before going any further.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Dan,

I've just been looking through my code. I can't post the actual code here, but I can give you an idea of how it works. I hope this will be enough to get you started.

To recap, you should now have a cursor (called, say, csrTest) with one field per Excel column. Each field is a character field (named, say, F1, F2, ....) and is the width of the largest likely cell value (say 32 characters).

You need to do the following for each field in turn.

To determine if the field is a numeric:

Code:
SELECT F1 FROM csrTest WHERE TRANSFORM(VAL(F1)) == F1 INTO CURSOR csrJunk
IF _TALLY = RECCOUNT("csrTest")
  * Field F1 is numeric
ENDIF

Determining if a field is a date is a little harder. First, apply the above test to see if it is numeric (because Excel dates are stored as numbers, these being offsets from a base date). If it is numeric, then you need to check that it represents a "reasonable" date. To do that, you need to define "reasonable". In my case, I was looking for birthdays of living people, so I made a guess at the earliest and latest likely birth dates, and checked that the target date was within that range:

Code:
ldLow = .... && lowest likely date
ldHigh = ..... && highest likely date
ldBase = {^1899-12-30}  && base date in Excel
SELECT F1 FROM csrTest WHERE BETWEEN(ldBase + VAL(F1), ldLow, ldHigh) INTO CURSOR csrJunk
IF _TALLY = RECCOUNT("csrTest")
  * Field F1 is a likely date
ENDIF

Assuming that if a field is not numeric or a date then it is a character, you now need to know the maximum width of each character field:

Code:
SELECT MAX(LEN(ALLTRIM(F1))) FROM csrTest INTO ARRAY laJunk
* laJunk(1) now gives the required width

Once you've got all that information in place, you build a SQL SELECT that selects all the data from the original cursor, converting it to the target date type and width, and storing the result in the final cursor. Then, having built the SELECT, you macro-execute it.

I'll leave that final step as an "exercise for the reader".

If all this sounds like too much trouble, do keep in mind my alternative suggestion: use the IMPORT command. However, that will give you the same issue with dates. These will appear in your results as numbers, and you will need a way to check that they represent reasonable dates.

Mike




__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Hi Mike,

Thanks! I'm working on this today and I didn't see the bottom part of your post. Would've probably saved me a lot of time, because I went the long route and it's still giving me errors, saying something about the CDX file is missing for my cursor. Ugh. I will try your max length thing. Here's what my silly self did:

Code:
WITH thisform.pf.p1.grdHeaders

	SELECT (.aliasname)
	nFields = FCOUNT(.aliasname)
	nRecs = RECCOUNT(.aliasname)
	thisform.pf.p1.lblFlds.Caption= TRANSFORM(nFields)
	thisform.pf.p1.lblRecs.Caption= TRANSFORM(nRecs)
	
	*loop through the columns
	mylen = 0
	FOR y = 1 TO nFields
		*loop through each record
		GO top
		SCAN
			mynewlen= LEN(ALLTRIM(EVALUATE(FIELD(y))))
			IF mynewlen > mylen then
				mylen = mynewlen
			ENDIF			
		ENDSCAN
		*what was the largest length for this field? Make the field that length!
		*WAIT WINDOW (FIELD(y)) && gives field name, such as F1, or F2 etc. . .
		ALTER table (.aliasname) alter COLUMN (FIELD(y)) c(mylen)
		mylen = 0  &&reset
		
	ENDFOR
	THISFORM.Refresh
	THISFORM.PF.P1.GrdHeaders.Refresh

ENDWITH

Thanks All, I will report back with a better solution, since my best efforts currently suck pretty bad. Mike, I'll keep trying to utilize your thoughtful codes, thanks!!!

Dan
 
Hi vgulielmus,

Thanks for the link,

I tried the demo and it works great on the included sample XLSX sample files! However, in my line of work, I'll probably have to be ready for xls and (maybe someday) csv files as well.

I'm not sure how to make this accept XLS or XLSX files, it's designed for only XLS. If I change it to look at the renamed file with a .XLS extension, it complains about not having XML information and bombs out.

I don't understand the code enough to gleam the good parts for data and character types, so I'm going to re-assess MikeLewis's code.


Thanks All,

Dan
 
Assuming your xls file follows some rules (the first row contains column headers and all the following rows contains data), here is a version of Mike Lewis' solution :

Code:
LOCAL lnfields,laFields[1],lcAlias,lni,lcFName,curPrec,lcSetDele
CLOSE DATABASES ALL 
* import the xls
IMPORT from test.xls XL5 
* save into an array, the structure of the new table
lnfields = AFIELDS(laFields)
* the alias of the newly created table
lcAlias = ALIAS()
* a temporary cursor to calculate the size and precision of the numeric fields
curPrec = SYS(2015)
* skip the first row (containing the column header) and get the data type from the second
GO 2
FOR lni = 1 TO lnFields
	lcFName = lafields[m.lni,1]
	DO CASE
	* numerci fields
	CASE TRANSFORM(VAL(EVALUATE(m.lcAlias + "." + m.lcFName))) == ALLTRIM(EVALUATE(m.lcAlias + "." + m.lcFName))and the precision
		* calculate the size 
		SELECT CAST(MAX(LEN(ALLTRIM(&lcFName))) as I) as leng,;
			CAST(MAX(IIF(AT(".",&lcFName) > 0,LEN(ALLTRIM(&lcFName)) - AT(".",ALLTRIM(&lcFName)), 0)) as I) as dec;
			FROM (m.lcAlias) INTO CURSOR (m.curprec)
		* update the array with new data type, size and precision
		laFields[m.lni,2] = 'N'
		laFields[m.lni,3] = &curprec..leng
		laFields[m.lni,4] = &curprec..dec
	* date data type
	CASE !EMPTY(CTOD(EVALUATE(m.lcAlias + "." + m.lcFName)))
		laFields[m.lni,2] = 'D'
		laFields[m.lni,3] = 8
		laFields[m.lni,4] = 0
	* datetime data type
	CASE !EMPTY(CTOT(EVALUATE(m.lcAlias + "." + m.lcFName)))
		laFields[m.lni,2] = 'T'
		laFields[m.lni,3] = 8
		laFields[m.lni,4] = 0
	ENDCASE
NEXT

* delete the first row, containing colun header and SET DELETED ON to skip this row
GO 1 IN (m.lcAlias)
DELETE IN (m.lcAlias)

lcSetDele = SET("Deleted")
SET DEleted ON

* create the result, with the new data type
CREATE CURSOR cResult FROM ARRAY laFields
* append all the data, except the first (deleted) row
APPEND FROM (m.lcAlias)

* restore SET DELETED
SET DEleted &lcSetDele

* show the data
SELECT cResult
BROWSE

This code assumes that if in the second row a value is a number/date/datetime, then the entire column contains numbers/dates/datetimes (except the column header)

Respectfully,
Vilhelm-Ion Praisach
Resita, Romania
 
Nice code, Vilhelm-Ion. It deomonstrates a useful technique that I missed in my own version:

AFIELDS() is a (sort of) mirror image of CREATE CURSOR ... FROM ARRAY. In other words, the array that is created by AFIELDS() can be used to create a new cursor with CREATE CURSOR; and you can adjust the field names, field widths, data types and precisions between the two operations. This makes it perfect for this particular problem.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top