Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Reading a .CSV file

Status
Not open for further replies.
Apr 27, 1999
705
US
Hello,

I am trying to read a .CSV file using the split function. However, some of the data also has a comma in the field. For example,

"john Doe", "Seattle, Washington", "Analyst", "Doe, John"

Any ideas how to read this in successfully without using the Excel app?

Thanks in advance.
Fengshui1998

 
You could string the first and last quotes and make you split delimiter ",". Or use RegEx - which I too little of to share.

-Geates

"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live."
- Martin Golding

"There are seldom good technological solutions to behavioral problems."
- Ed Crowley, Exchange guru and technology curmudgeon
 
strip, not string

Code:
strLine = objCSVFile.ReaLine
strLine = mid(strLine, 2, len(strLine) - 3)
arrValues = split(strLine, """,""")

-Geates

"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live."
- Martin Golding

"There are seldom good technological solutions to behavioral problems."
- Ed Crowley, Exchange guru and technology curmudgeon
 
Justin,

Went to that web site. Thanks for the help but it errored when I tried to execute the "SELECT * FROM CSVFile".
IT did not like the format of the file because of the double quotes. To test, I created another CSV file without dbl quotes and it worked. So instead, I wrote this one. Basically, it loops through the file with .ReadLine, then parses each line by locating the first double quote and second double quotes and then extracting that field. It also detects the HEADER line if you have one.

THanks!
fengshui1998

=================================================
Dim ColName, DataArray(100)
dqt = chr(34) ' Double quote

Set CSVtmp = fso_OpenTextFile( CSVFile)
Do While CSVtmp.AtEndOfStream = FALSE
Txt = CSVtmp.ReadLine

' This is the HEADER LINE
If instr(1, txt, "Display Name", 1) > 0 then
ColName = split(Txt, ",")
For i = 0 to Ubound(ColName) - 1
ColName(i) = trim (replace( ColName(i), dqt, "") )
Next

Elseif trim(txt) = "" then
' Do Nothing

Else

StrTmp = Txt
Start = 1
ArrayCount = 0
'
' Parse the line
' Locate the double quotes
Do While strTmp <> ""
Nchar = len(StrTmp)
pos1 = instr(start, strTmp, dqt, 1)
'
' This is the string after the first double quote
NxtStr = right(strTmp, nchar-pos1)
'
' Next position of double quote
pos2 = instr(1, NxtStr, dqt, 1)
strData = left(NxtStr, pos2 - 1 )
DataArray(ArrayCount) = strData
'
'
' Last double quote means it is at the end of the line
If pos2 = len(NxtStr) then
exit do
End If

strTMP = right(NxtStr, lchar - pos2-1)
ArrayCount = ArrayCount + 1

Loop

End If

Loop

 
Nice - like Pacman chompping off segments of the line! Considering the 2 line alternative. It's the same as reading the header line but with a slight modification. Take your intial example:

"john Doe", "Seattle, Washington", "Analyst", "Doe, John"

Code:
'Remove the first and last quote
strLine = mid(strLine, 2, len(strLine) - 3)

john Doe", "Seattle, Washington", "Analyst", "Doe, John

Code:
'Split into fields 
arrFields = split(strLine, "[red]"", ""[/red]") '"" translates into "

john Doe[red]", "[/red]Seattle, Washington[red]", "[/red]Analyst[red]", "[/red]Doe, John

0. john Doe
1. Seattle, Washington
2. Analyst
3. Doe, John

- Geates





"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live."
- Martin Golding

"There are seldom good technological solutions to behavioral problems."
- Ed Crowley, Exchange guru and technology curmudgeon
 
employees.csv
Code:
id,name,grade
1,"Ezequiel, Justin",1
2,Charlie Sheen,4
3,"Name, Your",8
4,Another Guy,16

script
Code:
On Error Resume Next
Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001

Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

strPathtoTextFile = "C:\Documents and Settings\Justin Ezequiel\Desktop\test\"

objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
          "Data Source=" & strPathtoTextFile & ";" & _
          "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT * FROM employees.csv", _
          objConnection, adOpenStatic, adLockOptimistic, adCmdText

Do Until objRecordset.EOF
    Wscript.Echo "id: " & objRecordset.Fields.Item("id")
    Wscript.Echo "name: " & objRecordset.Fields.Item("name")
    Wscript.Echo "grade: " & objRecordset.Fields.Item("grade")   
    objRecordset.MoveNext
Loop

output
Code:
id: 1
name: Ezequiel, Justin
grade: 1
id: 2
name: Charlie Sheen
grade: 4
id: 3
name: Name, Your
grade: 8
id: 4
name: Another Guy
grade: 16
 


Justin,

Try using the following in a .CSV

"john Doe","Seattle, Washington", "Analyst", "Doe, John"
"mary Doe","Bellevue, Washington", "CEO", "Doe, Mary"

I tried your above solution as I stated before and got an error. It cannot determine the correct delimiter and it returns with an error, "SELECT * FROM CSVfile"

That's why I decided to write my own code. BTW, here is the correct code. I had an error on my previously published one.


================================

Dim ColName, DataArray(100)
dqt = chr(34) ' Double quote

Set CSVtmp = fso_OpenTextFile( CSVFile)

Do While CSVtmp.AtEndOfStream = FALSE
Txt = CSVtmp.ReadLine

' This is the HEADER LINE
If instr(1, txt, "Display Name", 1) > 0 then
ColName = split(Txt, ",")
For i = 0 to Ubound(ColName) - 1
ColName(i) = trim (replace( ColName(i), dqt, "") )
Next

Elseif trim(txt) = "" then
' Do Nothing

Else

StrTmp = Txt
Start = 1
ArrayCount = 0
'
' Parse the line
' Locate the double quotes
Do While strTmp <> ""
Nchar = len(StrTmp)
pos1 = instr(start, strTmp, dqt, 1)
'
' This is the string after the first double quote
NxtStr = right(strTmp, nchar-pos1)
'
' Next position of double quote
pos2 = instr(1, NxtStr, dqt, 1)
strData = left(NxtStr, pos2 - 1 )
DataArray(ArrayCount) = strData
'
lchar = len(NxtStr)
' Last double quote means it is at the end of the line
If pos2 = lchar then
exit do
End If

strTMP = right(NxtStr, lchar - pos2-1)
ArrayCount = ArrayCount + 1

Loop

End If

Loop
CSVtmp.close

 
FengShui1998,

Glad you got it working... Just some notes about JustinEzequiel's code... It actually does work just fine, is much simpler and easy to follow, and has the added bonus of allowing you to run more advance SQL queries on the data ("WHERE" and "ORDER BY", for example). If you continue to work with CSV files, I encourage you to take a second look at ADODB, and post questions here if you have problems.

If you were referencing fields as JustinEzequiel was, with your own data, you would have had to change the field name to your own field names
objRecordset.Fields.Item("MyFieldName")

Or if there were no column headers to define the field names, you can use the following to get the fields in the first three columns:
Wscript.Echo objRecordset.Fields(0)
Wscript.Echo objRecordset.Fields(1)
Wscript.Echo objRecordset.Fields(2)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top