Tek-Tips is the largest IT community on the Internet today!

CSV Woes

Status
Not open for further replies.

skiflyer

Programmer
Sep 24, 2002
2,213
US
I know this isn't hard, and I'm sure I'll figure it out soon... but I'm hoping someone else already has it worked out or knows of a tool for me.

I'm writing a simple script that processes a CSV file, and I'm having a hard time getting all the cells out. Specifically, in my files some cells are quoted and some aren't.

i.e. I need to handle a line like

0005,12,,BAR,"FOO,BOO"
as
0=> 0005
1=> 12
2=>
3=> BAR
4=> FOO,BOO

For some reason this is giving me a headache. Any suggestions?
 
Well, that's annoying: putting csv into the function list returns gzeof, and not fgetcsv...

I'll give it a go, thanks.
 
Yeah, I thought of regexes, but they're tricky for this because there are basically multiple split points... you have to do all the greedy and non-greedy garbage in one regex, if I'm not mistaken.

I just wrote a quick little FSM and parsed the lines by hand. I get away with it because it's a small file and performance doesn't count.
 
i think fgetcsv won't like the inconsistent use of quotes. i'd clean the input data before trying to parse it.
 
Actually, fgetcsv had no trouble with the inconsistent use of quotes (which is good, as it's totally allowed in CSV files)... it's just a matter of using the optional enclosure parameter.
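
For anyone reading along, here is a rough sketch of the same idea. It is in Python rather than PHP, using the standard csv module, whose quotechar argument plays a role similar to the enclosure parameter. The helper name parse_line is made up for this example.

```python
import csv
import io

def parse_line(line, delimiter=",", quotechar='"'):
    """Parse one CSV line into a list of fields (hypothetical helper)."""
    # csv.reader accepts any iterable of lines; StringIO wraps the single line.
    reader = csv.reader(io.StringIO(line), delimiter=delimiter, quotechar=quotechar)
    return next(reader)

# The line from the original question: only the last cell is quoted.
fields = parse_line('0005,12,,BAR,"FOO,BOO"')
for index, value in enumerate(fields):
    print(f"{index}=> {value}")
```

Quoted and unquoted cells can be mixed freely on one line; the quote character only matters for cells that contain the delimiter.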
 
If it's a small file, couldn't you just parse it character by character, storing each element in an array? After each comma, you could test whether the first character of the next element is '"', and if so, set a flag so that the element's ending character becomes '"' as well, instead of a comma. Then move on to the next record when you hit the carriage return.
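
A minimal sketch of that character-by-character approach, written in Python rather than PHP and handling one line at a time. The function name split_csv_line is hypothetical, and treating a doubled quote inside a quoted field as a literal quote is an extra assumption not described in the post above.

```python
def split_csv_line(line):
    """Split one CSV line by walking it character by character (hypothetical helper)."""
    fields = []
    current = []          # characters of the field being built
    in_quotes = False     # True while inside a quoted field
    line = line.rstrip("\r\n")
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                # Assumption: "" inside a quoted field means one literal quote.
                if i + 1 < len(line) and line[i + 1] == '"':
                    current.append('"')
                    i += 1
                else:
                    in_quotes = False   # closing quote
            else:
                current.append(ch)      # commas are ordinary text here
        elif ch == '"' and not current:
            in_quotes = True            # opening quote at the start of a field
        elif ch == ",":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))     # last field has no trailing comma
    return fields

print(split_csv_line('0005,12,,BAR,"FOO,BOO"'))
```

This is only suitable for small files where performance doesn't matter, and it does not handle quoted fields that span several lines.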



 
skiflyer

i think we are talking at cross purposes

if you have a double quote as the string encapsulator then the following will confuse fgetcsv

Code:
name,address
"Justin Adie","Impasse des Laques, Lieu-Dit "Bel-Air", France"

i.e. if the field itself has the encapsulator within its body.
 
xtreme69,

that's basically the FSM I set up

jpadie,
that's not valid CSV though, you'd have to escape the inner quotes
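
To illustrate what escaping the inner quotes usually looks like, here is a small sketch in Python (not PHP), assuming the common convention of doubling each quote inside a quoted field. The variable names are made up for the example.

```python
import csv
import io

# The address field contains both commas and double quotes.
row = ["Justin Adie", 'Impasse des Laques, Lieu-Dit "Bel-Air", France']

# Writing with the standard csv module quotes the field and doubles
# the inner quotes, which is the escaped form a parser can read back.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
encoded = buffer.getvalue()
print(encoded)

# Reading the escaped line back recovers the original two fields.
decoded = next(csv.reader(io.StringIO(encoded)))
print(decoded)
```

With the inner quotes doubled, the address comes back as a single field with its commas and quotes intact.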
 
precisely!

that's why i said
"i think fgetcsv won't like the inconsistent use of quotes. i'd clean the input data before trying to parse it."
in response to your original data
0005,12,,BAR,"FOO,BOO"

i.e. the quoting of strings was inconsistent.
 