Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Removing " from csv files?

Status
Not open for further replies.

stuartgmilton

Programmer
Nov 12, 2001
66
GB
Hey folks, I have a major issue at the moment.

I am geting sent a csv file to import using a dts package. It comes in a standard format, but....

One field keeps having a double quote every so often.

Does anyone have any code that will allow me to convert this to 2 single quotes or 2 double quotes?

The file just doesn't import with these in there.

Cheers,

Stuart
 
There is no easy code solution, but it can be parsed using VBScript if you absolutely must. FileSystemObject is a good starting point. If anyone has written something to accomplish this task I would like to see it as well. We run into this problem from time to time.

Best idea is to talk to the file's supplier and adk them to clean up the " during their extract.

Hope this helps,

Alex

Ignorance of certain subjects is a great part of wisdom
 
I've gotten as far as this:


'**********************************************************************
' Visual Basic ActiveX Script
'************************************************************************

Function Main()
OpenFile()
Main = DTSTaskExecResult_Success
End Function


Function OpenFile()
Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objFileIn = objFSO.OpenTextFile("\\ukgreefs01\Applic$\database\S7-RMS\Imports\CATS\CATS.csv", ForReading)
Set objFileOut = objFSO.OpenTextFile("\\ukgreefs01\Applic$\database\S7-RMS\Imports\CATS\CATSQuotesRemoved.csv", ForWriting)

strContents = objFileIn.ReadLine
Do While objFileIn.AtEndOfStream <> True
objFileOut.WriteLine ReplaceTest(strContents)
strContents = objFileIn.ReadLine
Loop

objFileIn.Close
objFileOut.Close

End Function


Public Function ReplaceTest(str)
Set r = New MRSRegEx.RemoveQuotes()
ReplaceTest = r.ReplaceQuotedComma(str)
End Function

 
That's great that you are using RegEx for this. I think by manipulating your pattern(s) you could probably find a way to not replace the quotes that are preceded by or following a comma.

Let me know how you get on with this, looks promising.

HOpe it helps,

Alex

Ignorance of certain subjects is a great part of wisdom
 
can you not just modify your dts package to transform the base data before import, and just put a replace on the column?

--------------------
Procrastinate Now!
 
Crowley16 -

Do you know of a way to accomplish that without using VBScript?

The only thing I can think of that you might be getting at is removing the " as a text delimiter, but that opens you up to all kinds of problems (what if there is a comma in your data, for example).


Alex

Ignorance of certain subjects is a great part of wisdom
 
well, it's been a while when I've worked with DTS but I definitely remember you can apply sql statements to the imported data before you actually import it...

so in the sql, just put a replace command around the column in question to get rid of ' or change it with ''

--------------------
Procrastinate Now!
 
Crowley16 -

You can only run sql commands on a connection provider that offers SQL support. The text file provider does not have this capability. The only thing I have done this with is connections to Access, and even there you need to be very careful to make sure that there are no compatibility issues in your syntax.

Hope this helps,

Alex

Ignorance of certain subjects is a great part of wisdom
 
I see, so you'd have to first get it into a temp table and then transform it and insert it into the live table then...

--------------------
Procrastinate Now!
 
But you can't import the file, because of the " in one of the rows. And you can't get rid of the text delimiter, because if you have a comma in the data, your file won't parse correctly (it'll have too many columns).

Hence the original question...

File manipulation is a job for vbScript, not T-SQL

Ignorance of certain subjects is a great part of wisdom
 
So, anyone had any bright ideas?

I've just about had enough of battering my head against a brick wall!
 
Your solution with the regular expressions is not working? I will see if I can throw something together later today.

Ignorance of certain subjects is a great part of wisdom
 
Ok Stuart, I ran into whats' probably the same problem you had with the regular expressions. So I was able to cook up a solution using vbScript's string manipulation functions. It involves looping through the lines of the files, and completing these seven steps:

1. Replace "","" with "QuoteCommaQuote"
2. Replace first """ with "StartOfLine"
3. Replace last """ with "EndOfLIne"

4. Replace all remaining quotes with "~"

5. Replace "QuoteCommaQuote" with "",""
6. Replace "StartOfLIne" with """
7. Replace "EndOfLine" with """

Here is the file I tested it on (do NOT test this on a file that you don't have a backup of. I cannot emphasize this strongly enough).

Code:
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"
"File","CSV","Test","Quote"Here"

I saved it as c:\CSVParseTest.txt (as noted in the following script)

Here is the code to replace the quotes (only the quotes in the middle of "Quote"Here" will be replaced.

Code:
'**********************************************************************
'  Visual Basic ActiveX Script
'************************************************************************

Function Main()

Dim strOut
Dim strSearchOn
Dim strFileOut
Dim objFSO
Dim objTS
Dim objWriteTS

strFileOUt = ""
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.GetFile ("C:\CSVParseTest.txt")

'get rid of FSO object
Set objFSO = Nothing

'Open text stream (1 = for reading)
set objTS = objFile.OpenAsTextStream(1)


Do While not objTS.AtEndOfStream

	strSearchOn = objTS.ReadLine
	
	'debugging
	'msgBox strSearchOn
	
	'replace all ",", start and end quotes with placeholders
	'repeat these steps for each line in file
	strSearchOn = Replace(strSearchOn, """,""", "QuoteCommaQuote")
	strSearchOn = "StartOfLine" & Right(strSearchOn, len(strSearchOn)-1)
	strSearchOn = Left(strSearchOn, len(strSearchOn)-1) &  "EndOfLine"

	'debugging
	'msgBox strSearchOn
	
	'replace all remaining quotes with tilde
	strSearchOn = Replace(strSearchOn, """", "~")

	'restore needed quotes
	'again, repeat for each line in file
	strSearchOn = Replace(strSearchOn, "QuoteCommaQuote", """,""")
	strSearchOn = Replace(strSearchOn, "StartOfLine", """")
	strSearchOn = Replace(strSearchOn, "EndOfLine", """")
	
	'debugging
	'msgBox strSearchOn
	
	'append the fixed line to output string
	strFileOut = strFileOut & strSearchOn & vbNewLine

Loop

'dispose of 'reading' text stream object
set objTS = Nothing

set objWriteTS = objFile.OpenAsTextStream(2)
objWriteTS.Write(strFileOut)

'clean up remaining objects
set objWriteTS = Nothing
set objFile = Nothing
	
'function success
Main = DTSTaskExecResult_Success

End Function

Output will look like this:

Code:
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"
"File","CSV","Test","Quote~Here"

This should not be too slow, depending upon your file size. It beats tracking down your stray quote though, right?

Hope this helps,

Alex

Ignorance of certain subjects is a great part of wisdom
 
Stuart - were you able to get this working? I am curious to know how it would perform with a larger file.

Ignorance of certain subjects is a great part of wisdom
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top