
Import CSV Text Parsing Quote Problem

Status
Not open for further replies.

sxschech (Technical User)
Hello,
I am using the default import utility (File, Get External Data, Import) and have encountered a few errors due to double quotes. The data are comma separated and each field is surrounded by quotes. The problem happens when the description field itself contains a double quote.

[tt]
"38402","Main Entry Room","10/29/2004 07:12:22 AM"
"38403","Main Entry Room 1/2"","10/29/2004 07:12:22 AM"
[/tt]

The second line ends up merging the date with the description field.

The pattern seems to be that "", is a problem while ,"", is fine. Is there a tool, prior to import, that can look for "", but not ,"", (to put it another way, only those places where two double quotes appear without a comma before them) and replace them with single quotes? Or is this something that would need to be parsed in VBA rather than using the import wizard?

Thanks for your assistance
 
I'm dealing with a similar situation, though fortunately the solution to yours should be much easier, since you said your pattern seems to be two double quotes in a row (my data has one double quote out of place within one of the data fields, to signify inches, and there's no pattern).

If you are going to need to import this dataset often, you might want to investigate something more automated. Short-term, however, you can simply use Notepad or Wordpad and do a search/replace (search for "" and replace with ").

I hope this helps.

Good luck!!!
K
 
Kallak,

Thank you for your response. I tried it and unfortunately found that it won't work: it turns out there are valid fields made of two double quotes, because they are empty fields.

[tt]"1/19/2005 12:00:00 AM",[red]""[/red],"1/7/2005 12:00:00 AM"[/tt]

Also, after reading your response, it's possible that in the future the quote may appear in a random location within the text field; in my quick look at the data it just happened to be consistent. If you come up with a solution to your problem, I would appreciate learning of it. Thanks again for your assistance.
 
Do you have any control over the export format? Maybe you can specify a different delimiter for the text fields.

I try not to let my ignorance prevent me from offering a strong opinion.
 
You could try doing a search and replace, replacing "," with a tab character or |, and then import specifying either the tab or the pipe as the delimiter.

NB: the search string must include both double quotes with the comma in between, and the replace string must not include any double quotes. During the replace process, also remove the leading and trailing double quote on each line.

Hope this helps.
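[A sketch of that search-and-replace as a script, for illustration only (Python, not from the thread); it assumes every field is quoted and the separators carry no spaces:]

```python
def to_pipe_delimited(line):
    # Replace each "," field separator with a pipe, then drop the
    # leading and trailing quote; any double quote left is field data.
    line = line.replace('","', '|')
    if line.startswith('"'):
        line = line[1:]
    if line.endswith('"'):
        line = line[:-1]
    return line

print(to_pipe_delimited('"38403","Main Entry Room 1/2"","10/29/2004 07:12:22 AM"'))
# 38403|Main Entry Room 1/2"|10/29/2004 07:12:22 AM
```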
 
I have to deal with problems like this all the time. The Import Text Wizard is a nice tool if the structure of the imported file is reliable, but not otherwise.
There are two different approaches to this. For some reason the Excel Text Import Wizard has always been more flexible than the Access one. You can use that with comma as the delimiter and a Text Qualifier of {none}. That would import the data; you can then use Search and Replace as suggested by kallak and finally link/import the spreadsheet into Access. (earthandfire's approach would also work.)
This isn't much use if you need to do this regularly or there are too many lines to fit into Excel. In that case I would use code to open the file as a text file and read it one line at a time. Then use Split with comma as the separator to put each field into a different element of an array, loop through that array and use Replace() to strip out the quotes, and finally write the result to a table.
Simon Rouse
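[DrSimon's read-split-replace recipe, sketched for illustration in Python rather than the VBA he describes. Note it splits on every comma, so a comma inside a field would break it, a limitation raised in the replies:]

```python
def parse_line(line):
    # Split the record on commas, then strip the quotes from each field:
    # the equivalent of Split() followed by Replace(f, """", "") in VBA.
    return [field.replace('"', '') for field in line.split(',')]

print(parse_line('"38402","Main Entry Room","10/29/2004 07:12:22 AM"'))
# ['38402', 'Main Entry Room', '10/29/2004 07:12:22 AM']
```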
 
DrSimon, on the whole your approach would work - but what if a comma is part of the field data? Whilst I agree my solution is cumbersome, it should circumvent all of these problems.
 
earthandfire, you're quite right, but my impression here is that the number of fields is reliable; it's just the content that could be a problem. We don't know if this is a job that needs to be repeated, nor whether sxschech has any control over the export file itself. If I'm ever in a position to control the export file I NEVER use comma delimiters or quotes, because someone will always enter a comma in a field, just as someone can enter a quote. Tab is usually fine, as you suggest.
 
A starting point.
Say you have s, a String variable containing the following value:
"38403","Main Entry Room, 1/2"","10/29/2004 07:12:22 AM"
To replace the embedded doublequotes by single quotes, taking care of embedded commas:
a = Split(Mid(s, 2, Len(s) - 2), """,""")   ' strip outer quotes, split on ","
For i = 0 To UBound(a)
    a(i) = Replace(a(i), """", "'")          ' embedded " becomes '
Next i
MsgBox """" & Join(a, """,""") & """"        ' reassemble with quotes


Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
I've recently had a similar problem, compounded by the .CSV file having THREE header lines; only the third contained the field names, and some of the field names contained full stops! You can imagine what Access thought of full stops in field names!

So I had to write the following. It creates a temp .CSV file, correcting the data errors one line at a time, then imports from the temp file and deletes it.

Code:
Public Sub ImportFiles(SourceFile As String, TargetFile As String _
                         , Line1Check As String, TargetTable As String _
                         , ImportOkay As Boolean)
' # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
Dim fso As Object
Dim fRead As Object
Dim fWrite As Object
Dim strDataRecord As String
Const ForReading = 1
Const ForAppending = 8

On Error GoTo Err_ImportFiles
Set fso = CreateObject("Scripting.FileSystemObject")
Set fRead = fso.OpenTextFile(gstrRoot & SourceFile, ForReading)

' Make sure that the copy-to file does not exist already
If fso.FileExists(gstrRoot & TargetFile) Then
    Kill gstrRoot & TargetFile
End If
Set fWrite = fso.OpenTextFile(gstrRoot & TargetFile, ForAppending, True)

strDataRecord = fRead.ReadLine   ' Line 1 - Not required
strDataRecord = fRead.ReadLine   ' Line 2 - Not required

' Field names line - remove the full stops Access objects to
fWrite.WriteLine Replace(fRead.ReadLine, ".", "")

' Now transfer the rest of the file, removing "-" from the data
Do Until fRead.AtEndOfStream
    strDataRecord = fRead.ReadLine
    strDataRecord = Replace(strDataRecord, "-", "")
    strDataRecord = Replace(strDataRecord, "None", "0")
    fWrite.WriteLine strDataRecord
Loop

FinishedRead:
fRead.Close
fWrite.Close

' Empty out the old data from the quarantine table
DoCmd.RunSQL "DELETE * FROM " & TargetTable

DoCmd.TransferText acImportDelim, , TargetTable _
                        , gstrRoot & TargetFile, True

Exit_ImportFiles:
Exit Sub

Err_ImportFiles:
If Err.Number = 62 Then     ' Input past end of file
    Resume FinishedRead
Else
    MsgBox Err.Description & vbLf & Err.Source & vbLf _
    & "In frmImport.ImportFiles", , Err.Number
    Resume Exit_ImportFiles
End If
End Sub


You'll need to change the
strDataRecord = Replace(strDataRecord , "-", "")
type lines to whatever you need to modify in your data.


'ope-that-'elps.



G LS
spsinkNOJUNK@yahoo.co.uk
Remove the NOJUNK to use.
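[The same skip-the-headers-then-clean pattern, sketched in Python for comparison. Illustrative only; the per-line fixes mirror the Replace calls in the sub above and would be changed to suit the data:]

```python
def clean_csv(src_lines):
    # Skip the two junk header lines, strip the full stops Access
    # rejects from the field-name line, then clean each data line.
    lines = iter(src_lines)
    next(lines)                                # line 1 - not required
    next(lines)                                # line 2 - not required
    cleaned = [next(lines).replace('.', '')]   # field names
    for line in lines:
        cleaned.append(line.replace('-', '').replace('None', '0'))
    return cleaned

print(clean_csv(['header one', 'header two', 'Qty.,Desc.', '3,None']))
# ['Qty,Desc', '3,0']
```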
 