Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Remove commas from text fields in .CSV

Status
Not open for further replies.

Celsoft

Programmer
Aug 27, 2008
17
IE
This must have been done a million times but I can't seem to find the solution

I have a "normal" comma-separated file (.csv) with some text fields in double-quotes. These fields MAY contain a comma, which should be changed to a space.

I want to use sed and/or awk (or gawk) to zap the unwanted commas, leaving the legit delimiting commas intact.

Any help appreciated ... TA

PS If it makes things easier - there are 20 fields, and the only ones where commas in quotes can occur are fields 4 thru 8 incl.
 
cat inputefile | sed 's/,/ /g' > out.txt

This might help

Dereje
 
Configure gawk to use the " as the field separator, and for the field with embedded commas run a gsub function. If your first CSV field is the one you want to filter, it would be $2 when " is the field separator. Then
Code:
gsub(/\,/, " ", $2)
will take care of the comma.
 
Expanding on MadMichael's idea, if you set " to be the field delimiter, then every even-numbered field is going to be the contents of a " delimited field, so you can just loop through them running gsub()s on each:

Code:
awk -F'"' -v OFS='"' '{ for (i=2;i<NF;i+=2) { gsub(","," ",$i) }; print }' inputfile

Annihilannic.
 
Thanks a lot - INGENIOUS - using " as delimiter and substituting in every 2nd string !

Just to add - I got nasty message
"The system cannot find the file specified"
but this was corrected by putting the program in its own file "-f awkscript.txt"

Another tweak was then needed
-F\" -v OFS=\"
and bingo !
Thanks again.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top