Using NAWK on a comma delimited file with commas within the field

Status
Not open for further replies.

djbjr

Programmer
Dec 7, 2001
106
US
I have a NAWK statement that I am using to get data out of a comma delimited text file.

My problem is that one of the double-quoted fields sometimes contains a comma.
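
A minimal sketch of the symptom, using a made-up record (the sample data and the shell variable names are assumptions, not taken from the actual file). With a plain comma field separator, awk splits inside the quotes:

```shell
# Hypothetical sample record: the second field is quoted and contains a comma.
line='1001,"Smith, John",Dallas'

# Plain comma splitting counts four fields where the data has three.
count=$(printf '%s\n' "$line" | awk -F, '{ print NF }')
echo "$count"

# The quoted field is cut at its embedded comma, leaving a dangling quote.
second=$(printf '%s\n' "$line" | awk -F, '{ print $2 }')
echo "$second"
```

Here `awk` can be swapped for `nawk`; the behaviour of `-F,` should be the same.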

Anyone know the syntax to handle this?

For example, with SQL*Loader you would specify OPTIONALLY ENCLOSED BY '"'.

Any help will be greatly appreciated.

Thanks
 
Try this approach. There is no 'generic' solution to this problem; multiple threads already discuss this particular issue, and I can post one if there's interest.

Code:
function splitcsv(text, field,    i, n) {
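    # Parse comma-separated text into field[1]..field[n]; return n.
    # i and n are local variables (declared as extra parameters).

    # Append SUBSEP to every field. The alternatives match a quoted field
    # (allowing "" or backslash escapes inside), a non-empty unquoted field,
    # or a lone comma standing for an empty field.
    gsub(/"[^"\\]*((""|\\.)[^"\\]*)*",?|[^,]+,?|,/, "&" SUBSEP, text)

    # Only field-ending commas are followed by SUBSEP, so commas inside
    # quotes survive this split.
    n = split(text, field, "," SUBSEP)

    # Remove the superfluous trailing separator from the last field.
    sub(SUBSEP "$", "", field[n])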

    # remove quotes from quoted strings
    for (i in field) {
        # only fields that begin with a quote will end with a quote
        # and possibly have embedded quotes
        if (sub(/^"/, "", field[i])) {
            sub(/"$/, "", field[i])
            gsub(/["\\]"/, "\"", field[i])
        }
    }

    return n
}

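BEGIN {
    # Output separator; switch to the commented line for tab-separated output.
    #OFS = "\t"
    OFS = "|"
}

{
    # Split the current record, then print the fields joined by OFS.
    nf = splitcsv($0, csv)
    for (i = 1; i < nf; i++)
        printf("%s%s", csv[i], OFS)
    printf("%s\n", csv[nf])
}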
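
A quick way to try this out, as a sketch only: the snippet below saves a copy of the script to a temporary file and runs it on two made-up records. The names `splitcsv.awk` and `sample.csv` and the sample data are hypothetical, and `awk` can be replaced with `nawk`.

```shell
# Work in a throwaway directory; all file names here are placeholders.
tmp=$(mktemp -d)

# Same splitcsv logic as posted above, written out so the demo is self-contained.
cat > "$tmp/splitcsv.awk" <<'EOF'
function splitcsv(text, field,    i, n) {
    gsub(/"[^"\\]*((""|\\.)[^"\\]*)*",?|[^,]+,?|,/, "&" SUBSEP, text)
    n = split(text, field, "," SUBSEP)
    sub(SUBSEP "$", "", field[n])
    for (i in field)
        if (sub(/^"/, "", field[i])) {
            sub(/"$/, "", field[i])
            gsub(/["\\]"/, "\"", field[i])
        }
    return n
}
BEGIN { OFS = "|" }
{
    nf = splitcsv($0, csv)
    for (i = 1; i < nf; i++)
        printf("%s%s", csv[i], OFS)
    printf("%s\n", csv[nf])
}
EOF

# Record 1 has a comma inside quotes; record 2 has doubled quotes and an empty field.
printf '%s\n' '1001,"Smith, John",Dallas' '1002,"Lee ""JJ"" Ann",,Austin' > "$tmp/sample.csv"

result=$(awk -f "$tmp/splitcsv.awk" "$tmp/sample.csv")
printf '%s\n' "$result"

rm -rf "$tmp"
```

With these two records the output should be `1001|Smith, John|Dallas` followed by `1002|Lee "JJ" Ann||Austin`: the quoted comma stays inside its field, the doubled quotes collapse to single ones, and the empty field is preserved.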

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 