Formatting a large dataset

hill007 · Jun 14, 2004

I have a series of datasets that all follow the same pattern. Here is an example dataset:
"A","B","C","D","E"
"1","2","3","4",""
"6","7","","9","10"

My final output should be something like:
A B C D E
1 2 3 4
6 7 9 10

Any help appreciated.
Thanks.

guggach · Jun 14, 2004

sed 's/""/ /g;s/"//g;s/,/ /g'

not tested.

guggach

PHV · Jun 14, 2004

Try something like this:
sed 's!""!" "!g;s!^"!!;s!","! !g;s!"$!!' input > output

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
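
For anyone wanting to try PHV's one-liner before pointing it at real files, here is a minimal sketch that runs it on the sample rows from the question. The wrapper name strip_csv_quotes is only an illustrative helper and not part of the original post; the sed program itself is copied unchanged.

```shell
# Sample rows from the original question.
sample='"A","B","C","D","E"
"1","2","3","4",""
"6","7","","9","10"'

# strip_csv_quotes is a hypothetical wrapper name; the sed program is PHV's.
# Steps: 1) turn each empty field "" into " " (a quoted space placeholder),
#        2) drop the opening quote, 3) turn each "," into one space,
#        4) drop the closing quote.
strip_csv_quotes() {
  sed 's!""!" "!g;s!^"!!;s!","! !g;s!"$!!'
}

printf '%s\n' "$sample" | strip_csv_quotes
```

The empty field survives as a single space, so the row with the missing third value should come out with three spaces between 7 and 9 rather than one.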

hill007 · Jun 15, 2004

An awk script instead of sed would be more helpful. I have access to awk95.
Thanks.

johngiggs · Jun 15, 2004

hill007,

This should work OK:

awk '{gsub(/,/,"");gsub(/"/," ");print}' filename

Hope this helps.

John

guggach · Jun 15, 2004

For this kind of string manipulation (NO formatting,
NO computing, JUST replacements) the appropriate tools are
sed, vi and ex. Sure, awk, perl, C ... can also do it, but they
take a lot more effort for the same result.
Why do you insist on using awk?

guggach

PHV · Jun 15, 2004

The conversion of:
sed 's!""!" "!g;s!^"!!;s!","! !g;s!"$!!' input > output
To awk:
awk '{
gsub(/""/,"\" \"") # s!""!" "!g
sub(/^"/,"") # s!^"!!
gsub(/","/," ") # s!","! !g
sub(/"$/,"") # s!"$!!
print # sed prints each line by default; awk needs an explicit print
}' input > output

Hope This Helps, PH.
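
A quick way to check that the sed and awk versions agree is to feed both the same row and compare the results. This is only a sketch: via_sed and via_awk are hypothetical helper names, and the awk action ends with an explicit print, because awk (unlike sed) emits nothing unless told to.

```shell
# via_sed and via_awk are hypothetical helper names; both read rows on stdin.
via_sed() {
  sed 's!""!" "!g;s!^"!!;s!","! !g;s!"$!!'
}

via_awk() {
  awk '{
    gsub(/""/, "\" \"")   # empty field becomes a quoted space
    sub(/^"/, "")         # drop the opening quote
    gsub(/","/, " ")      # quote-comma-quote becomes one space
    sub(/"$/, "")         # drop the closing quote
    print                 # awk only emits the line if told to
  }'
}

row='"6","7","","9","10"'
a=$(printf '%s\n' "$row" | via_sed)
b=$(printf '%s\n' "$row" | via_awk)
if [ "$a" = "$b" ]; then echo "same output: $a"; else echo "outputs differ"; fi
```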

Ygor · Jun 15, 2004

This works by setting the field separator...

Code:

BEGIN {
  FS = "(\042,\042|^\042|\042$)";
}
{
   for (x=2; x<NF; x++) {
      printf "%-10s", $x;
   }
   printf "\n";
}

PS: guggach, not all awk users are on UNIX.
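
To see why Ygor's loop runs from 2 to NF-1, it may help to print every field that this separator produces. The sketch below reuses the same FS string; show_fields is a hypothetical helper name, and the bracketed output format is just for illustration.

```shell
# show_fields is a hypothetical helper; it prints NF and every field in brackets.
show_fields() {
  awk 'BEGIN { FS = "(\042,\042|^\042|\042$)" }   # \042 is the octal escape for a double quote
       {
         printf "NF=%d:", NF
         for (x = 1; x <= NF; x++) printf " [%s]", $x
         printf "\n"
       }'
}

printf '%s\n' '"6","7","","9","10"' | show_fields
```

Because the leading and trailing quotes are themselves separators, the first and last fields should come out empty, which is why the loop skips them. The behavior of ^ and $ inside FS is worth checking on your own awk, since I have only reasoned about it for gawk-style field splitting.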

Krunek · Jun 19, 2004

This is a solution with a comma as the field separator:

Code:

BEGIN { FS = "," }

{ 
    for (i = 1; i <= NF; i++) {
        l = length($i)
        oStr = substr($i, 2, l - 2)
        if ($i ~ /""/)
            oStr = " "
        printf "%s ", oStr
    }
    printf "\n"
}

It's a variation of Ygor's brilliant solution.

KP.
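
One caveat with a plain comma as separator: it assumes no value contains a comma itself. The sample data in this thread satisfies that, but the sketch below shows what happens otherwise. The row with "Smith, J" is made-up illustration data, and count_fields is a hypothetical helper name.

```shell
# count_fields is a hypothetical helper: prints NF for each row, using $1 as FS.
count_fields() {
  awk -F"$1" '{ print NF }'
}

# Made-up row with a comma inside the first quoted value.
row='"Smith, J","42",""'

# Comma as separator: the embedded comma splits the first value in two.
printf '%s\n' "$row" | count_fields ','      # prints 4

# Quote-comma-quote as separator: the row splits into three fields.
printf '%s\n' "$row" | count_fields '","'    # prints 3
```

With the longer separator, the first and last fields keep one stray quote each, so they still need the sub(/^"/,"") and sub(/"$/,"") cleanup shown earlier in the thread.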
