Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Excel Reader Component

Status
Not open for further replies.

Smeat

Programmer
Mar 27, 2004
193
GB
Does anybody know of a good thrid party component I can use to read data from an Excel (.xls) file? Nothing fancy required, I just need to read data and insert it into a database.

I've been writing a component to do this using both the Office interop components and the OleDb provider and have found that both methods have so many quirks they are unusable.

TIA
Smeat
 
if all you need to do is read the data from excel than why doesn't OleDb or Excel.InterOp work? once you select the data you can do whatever you want, including putting it into a RDBMS.

Jason Meckley
Programmer
Specialty Bakers, Inc.

faq855-7190
faq732-7259
 
You would think so but unfortunately it doesn't work consistently.

For example, one file I was testing this morning had 2 date fields side by side, both in the same format but the oledb provider returned one date but only an empty string for the other date.

Ther are numerous quirks like this, such as numbers being turned into scientific, etc depending on the cell formatting used in the .xls file.

I'm now at the point where I beleieve the OleDb provider cannot reliably get the data so am looking for a proven third party component.

Thanks
Smeat
 
are you sure the data is reliable? just because it's formatted in excel doesn't make it that data type. if the excel worksheet looks like a flat file data dump you can query it. if it's designed to look pretty (or usable) to an end user there may be issues because the data is for presentations, not queries.

using the Excel.Interop, try saving the worksheet as a CSV file and then query the CSV. open the CSV in notepad to ensure the data is what you expect. for example the date is a date, and not an empty string.

Jason Meckley
Programmer
Specialty Bakers, Inc.

faq855-7190
faq732-7259
 
Yes I'm sure the data is reliable.

In my test file I have 2 columns which have their cell format set to dd/mm/yyyy.

When I attempt to read the data from the first column it always returns the correct data but the second column is sometimes empty.

Here is how my code looks:

Code:
 string connString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + FullyQualifiedFilePath + ";Extended Properties=\"Excel 8.0;HDR=No\"";


System.Data.OleDb.OleDbConnection oledbConn = new System.Data.OleDb.OleDbConnection(connString);

try
{

  oledbConn.Open();

  OleDbCommand cmd = new OleDbCommand("SELECT * FROM [" + workSheetName + "]", oledbConn);

  OleDbDataAdapter oleda = new OleDbDataAdapter();
  oleda.SelectCommand = cmd;

  DataSet ds = new DataSet();

  DataTable dt = new DataTable("ServiceData");
  ds.Tables.Add(dt);

  oleda.Fill(dt);

  foreach (DataRow r in dt.Rows)
  {
    string rowData = string.Empty; ;
    foreach (object o in r.ItemArray)
    {
      rowData += o.ToString();
    }
  }

If I put a breakpoint within the foreach loop and then query the first date element I get the correct results as below:

Code:
? r[16]
{26/02/2010 13:52:47}
    Date: {26/02/2010 00:00:00}
    Day: 26
    DayOfWeek: Friday
    DayOfYear: 57
    Hour: 13
    Kind: Unspecified
    Millisecond: 750
    Minute: 52
    Month: 2
    Second: 47
    Ticks: 634027891677500000
    TimeOfDay: {13:52:47.7500000}
    Year: 2010

but the second date column should read 07/01/2005 but is empty as below:

Code:
? r[17]
{}

Just to make this behaviour really strange, if I do the same query on the third row I get the correct results from both fields although the second date is returned as a string data type rather than a date.

Code:
? r[16]
{26/02/2010 13:19:05}
    Date: {26/02/2010 00:00:00}
    Day: 26
    DayOfWeek: Friday
    DayOfYear: 57
    Hour: 13
    Kind: Unspecified
    Millisecond: 193
    Minute: 19
    Month: 2
    Second: 5
    Ticks: 634027871451930000
    TimeOfDay: {13:19:05.1930000}
    Year: 2010


? r[17]
"10/03/2007"

I have triple checked the format of each of the date columns is the same for all cells, the only difference is that the first date column incldues the time element in the data.

Here's the first 3 rows from my data file:
Code:
1 M C C 13 K C F W GD UK W 0 X F E915 26/02/2010 01:00:00 07/01/2005
2 M D M	24 T R R W GD UK S 1 Y E E915 26/02/2010 02:30:00 01/03/2004

I'm completely stumped by this behaviour, any thoughts ?

TIA
Smeat
 
save the excel file as a csv.
open the file in notepad to ensure the data is correct.
if so continue... otherwise there is a problem with the data

use ado.net to query the csv file.
confirm the results are correct.
if so the problem is either the connection string or the excel file.

you can also simplify your code. to this
Code:
var table = new DataTable();
table.Load(command.ExecuteReader());
no need for the data adapter or dataset.

Jason Meckley
Programmer
Specialty Bakers, Inc.

faq855-7190
faq732-7259
 
Hi Jason

I've already tried this and everything seems fine.

Saving as a .csv and viewing in notepad shows the expected dates:

Code:
26/02/2010,07/01/2005
26/02/2010,01/03/2004
26/02/2010,10/03/2007

I'll try querying the .csv file just out of curiosity but in practice I will be reciving .xls files.

Thanks
Smeat
 
understood, querying the CSV helps rule out problems with your code.

Jason Meckley
Programmer
Specialty Bakers, Inc.

faq855-7190
faq732-7259
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top