
BULK INSERT FORMATFILE Help

Status
Not open for further replies.

ehenry (Programmer) · Sep 18, 2009 · US
I am trying to use BULK INSERT to dump a bunch of CSV files into different tables. I am creating a dynamic stored procedure that takes in the CSV filename and the table to insert into. My issue is that some of the columns in the CSV files have quotes around the values, which I need to remove. Also, some of the columns contain multiple values separated by commas. For this reason I thought it best to use a format file; however, I am having trouble. Below is a small sample of one of the CSV files, followed by my format file:

Id,Name
11,"AL-FL-MS"
75,"Alaska"
69,"Alberta"
12,"Anadarko"

Format File:

9.0
2
1 SQLINT 0 2 "\,"" 1 ID ""
2 SQLCHAR 0 100 "\"\n" 2 Names ""

I am getting an error:
Cannot bulk load. Invalid data type for column number 2 in the format file "C:\basins.fmt".

Any help would be appreciated.
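For reference, the quoting described above is standard CSV quoting: double quotes delimit a field and protect embedded commas. A quote-aware parser, such as Python's csv module, shows what the parsed values should look like (the last row below is a hypothetical example of a quoted field containing a comma; it is not from the poster's file):

```python
import csv
import io

# The sample rows from the post, plus one hypothetical row whose
# quoted field contains an embedded comma.
sample = (
    'Id,Name\n'
    '11,"AL-FL-MS"\n'
    '75,"Alaska"\n'
    '13,"Anadarko, East"\n'
)

# csv.reader strips the surrounding quotes and keeps the
# embedded comma as part of the field value.
rows = list(csv.reader(io.StringIO(sample)))
```
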

 
I see two problems.

First, change your SQLINT to SQLCHAR. If you had a binary file that stored this number, then you would use SQLINT. If you can open your data file in Notepad and clearly see all the data, then EVERY column should be listed as SQLCHAR because that is how it is stored in the data file.

Second.... your column delimiter for the first one is:

[tt][blue]"\,""[/blue][/tt]

This is not quite right. The delimiter is everything inside the quotes. The \ is the escape character. Anyway... you should try:

Code:
9.0
2
1 SQLCHAR 0 2 ",\""    1 ID ""
2 SQLCHAR 0 100 "\"\n" 2 Names ""



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thank you I am no longer getting an error, however the file is not being imported to the table. Here is my T-SQL:

DECLARE @bulk_cmd varchar(1000)

SET @bulk_cmd = 'BULK INSERT tblbasins FROM ''C:\Program Files\Bentek\Files\basins.csv''
WITH (FIRSTROW = 2, FORMATFILE = ''C:\basins.fmt'')'
EXEC (@bulk_cmd)
 
You don't need to use dynamic SQL here.

Try using \r in your format file instead of \n

Also, I notice that your first row is not formatted the same as your data. Specifically, since you are using a format file, your header row must match your format file. Since you are using FIRSTROW = 2, the first row will be skipped, BUT how does SQL Server know where the first row ends? It needs to follow the same format as the rest of your file. So...

Code:
Id,[!]"[/!]Name[!]"[/!]
11,"AL-FL-MS"
75,"Alaska"
69,"Alberta"
12,"Anadarko"

Based on your format file, SQL thinks your first row ends with a quote-Carriage Return. Either remove the first row from your data file (and set FIRSTROW = 1) or add the quotes.
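The "add the quotes" option can be scripted rather than done by hand. A minimal sketch (Python here purely for illustration; which columns are quoted is an assumption you would supply per file) that re-quotes a header line to match the data rows:

```python
def requote_header(header_line, quoted_cols):
    """Wrap the given (0-based) header columns in double quotes so
    the header row uses the same delimiters as the data rows, and
    therefore matches the format file's field terminators."""
    fields = header_line.rstrip("\r\n").split(",")
    return ",".join(
        '"%s"' % field if i in quoted_cols else field
        for i, field in enumerate(fields)
    )
```

For the sample file, requote_header('Id,Name', {1}) yields Id,"Name", which matches the terminators the format file expects on every row.
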

Here's what I think you should do...

FormatFile - notice no row terminator
Code:
9.0
2
1 SQLCHAR 0 0 ",\"" 1 ID ""
2 SQLCHAR 0 0 "\"" 2 Names ""

DataFile - notice the quotes on the first line.
Code:
id,"Name"
11,"AL-FL-MS"
75,"Alaska"
69,"Alberta"
12,"Anadarko"

bulk insert code - notice the ROWTERMINATOR
Code:
DECLARE @bulk_cmd varchar(1000)

SET @bulk_cmd = 'BULK INSERT tblbasins FROM ''C:\Program Files\Bentek\Files\basins.csv''
WITH (FIRSTROW = 2, FORMATFILE = ''C:\basins.fmt'', ROWTERMINATOR=''\r'')'
EXEC (@bulk_cmd)

Hope this helps.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thank you, this is helpful. A few things: I am using dynamic SQL because the file name and table are actually variables in my real code, since I am importing about 50 files per day from an FTP server. I don't know of any way to programmatically remove or modify the header rows, so I may need to find another way to do this, possibly from the .NET application I plan to have download the files and run the import proc. I could also drop everything into a #temp table and modify it there.
 
Clearly, there are a couple of ways to do this.

Do not be tempted to open the data file in .net and loop over the rows inserting each line separately. This is painfully slow.

You could dump into a #Temp table, but be careful here too. I mean... you could eliminate the format file altogether, bulk insert to a temp table that has one really big column, and then parse with T-SQL. You should avoid this method too, because it will be slow.

Your best bet would be to open the data file in a front end language and modify it. Either remove the first line, or add the quotes, then do the Bulk Insert.

There are other options available to you too.

6 ways to import data into SQL Server

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
I think I will take your advice and modify each file, removing the first line. Thanks again for your help.
 
I am getting the following error when attempting the Bulk import. I have my T-SQL, the data file and the format file exactly like you posted above. I keep getting this error:

Msg 4832, Level 16, State 1, Line 1
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

I thought it might be because there was no return at the end of the file, but I put in a blank line at the end. I can't seem to get this thing to work right!?!?!
 
I can send you a copy of the file if you want to test it yourself.
 
I was testing this yesterday. Check your data file. Is there a blank row at the end?

For example:

Code:
id,"Name"
11,"AL-FL-MS"
75,"Alaska"
69,"Alberta"
12,"Anadarko"
 <-- Blank row here

If there is, you can accommodate this by modifying your BULK INSERT command.

Code:
DECLARE @bulk_cmd varchar(1000)

SET @bulk_cmd = 'BULK INSERT tblbasins FROM ''C:\Program Files\Bentek\Files\basins.csv''
WITH (FIRSTROW = 2, FORMATFILE = ''C:\basins.fmt'', ROWTERMINATOR=''\r''[!], MAXERRORS = 1[/!])'
EXEC (@bulk_cmd)

By setting MAXERRORS = 1, the import is allowed exactly one error and will still succeed.

Give this a try and let me know how you make out.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
MAXERRORS does not work; however, when I open the file and delete the last empty row, it works. So I guess the final solution is to modify each file from my .NET app, removing the first and last rows.
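That final approach (drop the header row and the trailing blank row before running BULK INSERT) is only a few lines in any front-end language. The poster planned to do it from .NET; a minimal sketch of the same logic in Python, with placeholder paths:

```python
def trim_rows(lines):
    """Drop the header row and any trailing blank lines so every
    remaining line matches the format file's pattern."""
    body = lines[1:]                       # drop the header row
    while body and body[-1].strip() == "":
        body.pop()                         # drop trailing blank lines
    return body

def prepare_csv(src_path, dst_path):
    """Rewrite src_path to dst_path, trimmed and ready for
    BULK INSERT."""
    with open(src_path) as f:
        lines = f.readlines()
    with open(dst_path, "w") as f:
        f.writelines(trim_rows(lines))
```

With the header removed, the BULK INSERT should then use FIRSTROW = 1.
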
 
There is one more "solution" that you may want to try. A decidedly low-tech one.

You could contact the supplier of this data and request that they "fix" the data. There's about a million reasons why this will NOT work, but if it's possible, it'll be the easiest one for you to implement. [wink]

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 