Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

using LOAD DATA INFILE; field contains embedded quotes/commas

Status
Not open for further replies.

GregTheGeek

Programmer
Aug 11, 2004
46
US
Hello all,

I am trying to load a CSV into a mySQL table using the LOAD DATA INFILE command. There is (at least) one record that is causing an issue with the load. A text field in the record *contains* quotes and commas.

Example:
Code:
{...},ABCD,"This is some text in a field. It contains commas "","",and quotes""",Y,{...}

The pattern of commas and quotes are exactly as they appear in the file, but the actual data has been altered to protect my job. ;-)

My statement begins as follows:
Code:
LOAD DATA INFILE 'D:/path/to/file.csv'  
	INTO TABLE db_Name.Table_Name 
	FIELDS 
		TERMINATED BY ',' 
		OPTIONALLY ENCLOSED BY '"' 
		ESCAPED BY '' 
	LINES 
		TERMINATED BY '\r\n' 
	IGNORE 6 LINES

Any suggestions? Like I said, its seems to be only one record causing this issue, but there may be more in the future. I would really like to avoid manipulating this data before loading it. Is there some way to ignore the record if it causes an error? (Incorrect datetime value, Truncated incorrect INTEGER value, etc)

Thanks in advance!

gtg.jpg

GTG
 
Do you have any control over the format of the CSV file? Whenever possible I always use a tab character to separate fields in a CSV file - I guess it should then be called a TSV file.

Tabs are frequently preferable to commas to separate fields in such files because of the problem you are experiencing. So try something like
Code:
LOAD DATA INFILE 'D:/path/to/file.csv'  
    INTO TABLE db_Name.Table_Name 
    FIELDS 
        TERMINATED BY '\t'  
    LINES 
        TERMINATED BY '\r\n' 
    IGNORE 6 LINES

Andrew
Hampshire, UK
 
Andrew, thanks for the input. Unfortunately, I do not have control over the format of the file. I wouldn't be here if I did. :-D

gtg.jpg

GTG
 
well, if you actually have a csv, then you should be able to load it into excel successfully, yes?

then just export it again differently, with tabs

r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
 
r937,

There are a few issues with your suggestion:

1. Excel does not parse the fields correctly either. I would end up with extra fields in the problem record(s).

2. Excel should never be used as a solution in this way. Never. It's not cross-platform (meaning it doesn't run on Linux) and, leading to my next issue:

3. Excel has a way of changing your data. If you have a field value of "2E11", Excel automatically assumes you mean "2+E11", which is exponential notation for a 2 with 11 zeros after it. If you have a field that starts with a minus sign (i.e. "-this is my text"), it assumes you meant to negate a value, so it prepends your data with an equal sign as to make it a formula (does the same if your field value starts with a plus sign, too).

Excel is not a DBMS. This is why we have MySQL, PostgreSQL, MS SQL Server, and yes, even Access.

Is there anyone out there that knows of a MySQL-based solution?

Thanks again!

gtg.jpg

GTG
 
Have you tried to write some code to clean it up before you give it to MySQL.
Rudy, if you need a few mins on what a DBMS is just ask !!:-D
(Greg, check some of Rudy's posts or even his book !)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top