A tricky "sed" problem

Status
Not open for further replies.

parbhani

Technical User
Jul 3, 2002
125
GB
Hi guys,

I have comma-separated text, say:

aaaaa,a123,345,x3e345r,,,wer,234
bbbbbb,f345,89000,,,,,g45,34

in a file.
** Using only sed **, I want to convert it to:

"aaaaa","a123",345,"x3e345r",,,"wer",234
"bbbbbb","f345",89000,,,,,"g45",34

where:
1) empty fields (consecutive commas) should remain as they are
2) only the alphanumeric values should be enclosed in "
3) purely numeric values should remain as they are

Can you please help me?

Regards
 
Try this command:
sed -e 's/,\([[:alnum:]]*[[:alpha:]][[:alnum:]]*\),/,"\1",/g' -e 's/^\([[:alnum:]]*[[:alpha:]][[:alnum:]]*\),/"\1",/g' -e 's/\([[:alnum:]]*[[:alpha:]][[:alnum:]]*\)$/"\1"/g'


In my first attempt to resolve this problem, I used the '|' special character without success.
I was very surprised to see that my sed supports only Basic Regular Expressions, not Extended Regular Expressions.
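
For comparison, here is a minimal sketch of what the '|' approach could look like on a sed that accepts the -E option (GNU and BSD sed do; the sed used above evidently did not, so treat -E as an assumption). The variable names are only for illustration, and the pattern assumes fields are purely alphanumeric, as in the sample data.

```shell
# Sample rows from the original question.
input='aaaaa,a123,345,x3e345r,,,wer,234
bbbbbb,f345,89000,,,,,g45,34'

# -E switches sed to Extended Regular Expressions (assumed to be available).
# (^|,) matches a field boundary: start of line or a comma.
# The second group is a run of alphanumerics holding at least one letter.
ere_result=$(printf '%s\n' "$input" |
  sed -E 's/(^|,)([[:alnum:]]*[[:alpha:]][[:alnum:]]*)/\1"\2"/g')

printf '%s\n' "$ere_result"
```

Because the boundary is matched in front of the field and nothing is consumed after it, one expression covers the first, middle and last fields.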

Jean Pierre.
 
** Using only sed **
What is the business case for this constraint?
Using awk would be much simpler.
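
As a rough illustration of the awk route (a sketch only, not necessarily what the poster had in mind), the following quotes every field that contains at least one non-digit character. That rule is an assumption: it agrees with the sample data, but it would also quote values such as 4.47.

```shell
# Sample rows from the original question.
input='aaaaa,a123,345,x3e345r,,,wer,234
bbbbbb,f345,89000,,,,,g45,34'

# Split and rejoin on commas; empty and all-digit fields are left untouched.
awk_result=$(printf '%s\n' "$input" |
  awk -F, -v OFS=, '{
    for (i = 1; i <= NF; i++) {
      # Quote the field if it contains any character that is not a digit.
      if ($i ~ /[^0-9]/) {
        $i = "\"" $i "\""
      }
    }
    print
  }')

printf '%s\n' "$awk_result"
```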

Hope This Help, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884
 
The objective is to test one's regex knowledge - which is most often done with 'sed' [wink]

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Hi there,

Yes, what vgersh99 said is partially right.
Actually, we have a few ".sed" scripts which we run using "sed -f startup.sed somefile.txt", so I am trying to be consistent.
It will also open a few passages in my brain.

aigles,

Have you actually tried your version of the logic with a sample file/data? It is not working for me.

Please suggest further.

Thanks and Regards
 
sedfile:
Code:
s/^\([^,]*[a-zA-Z][^,]*\),/"\1",/g
s/,\([^,]*[a-zA-Z][^,]*\),/,"\1",/g
s/,\([^,]*[a-zA-Z][^,]*\)$/,"\1"/g
(tested)
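
A minimal way to try the sedfile above with "sed -f" is sketched below. The file names quote.sed and sample.txt are placeholders, not names from the thread. The second run illustrates a limitation worth knowing: the middle rule consumes the trailing comma of each match, so a letter-bearing field that directly follows another quoted middle field is skipped. The sample data happens not to trigger this.

```shell
workdir=$(mktemp -d)

# The three substitutions from the post, saved as a sed script.
cat > "$workdir/quote.sed" <<'EOF'
s/^\([^,]*[a-zA-Z][^,]*\),/"\1",/g
s/,\([^,]*[a-zA-Z][^,]*\),/,"\1",/g
s/,\([^,]*[a-zA-Z][^,]*\)$/,"\1"/g
EOF

# Sample rows from the original question.
cat > "$workdir/sample.txt" <<'EOF'
aaaaa,a123,345,x3e345r,,,wer,234
bbbbbb,f345,89000,,,,,g45,34
EOF

sample_result=$(sed -f "$workdir/quote.sed" "$workdir/sample.txt")
printf '%s\n' "$sample_result"

# Adjacent letter fields: "c" stays unquoted because the comma in front of it
# was already consumed by the match for ",b,".
adjacent_result=$(printf 'a,b,c,d\n' | sed -f "$workdir/quote.sed")
printf '%s\n' "$adjacent_result"

rm -r "$workdir"
```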
 
... but aigles' script works with your test data too.

My solution will match "a34;79","4.47" too.
 
Try something like this:
sed -e 's!\([^,]*[^0-9,][^,]*\)!"\1"!g' </path/to/file
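
To see what this one-liner does without a file at hand, the sketch below pipes the sample rows in instead of using </path/to/file. The extra row is made up for illustration: it shows that any field containing a non-digit character is quoted, including 4.47.

```shell
# Sample rows from the original question.
input='aaaaa,a123,345,x3e345r,,,wer,234
bbbbbb,f345,89000,,,,,g45,34'

# A field is a maximal run of non-comma characters; it is quoted only if it
# contains at least one character that is neither a digit nor a comma.
phv_result=$(printf '%s\n' "$input" |
  sed -e 's!\([^,]*[^0-9,][^,]*\)!"\1"!g')
printf '%s\n' "$phv_result"

# Hypothetical extra row with punctuation inside the fields.
extra_result=$(printf 'a34;79,4.47,12\n' |
  sed -e 's!\([^,]*[^0-9,][^,]*\)!"\1"!g')
printf '%s\n' "$extra_result"
```

Since the commas themselves are never part of a match, adjacent fields are all handled in a single pass.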

Hope This Help, PH.
 
Is there a program that optimizes sed scripts for size?
Not to my knowledge. Why?
 
Why?
Because it would make your job here easier :)
Perhaps they would perform better?
My script with 3 statements seems a bit more readable to me, but your single-line solution might be faster, and it could be interesting to compare raw scripts with generated ones.
 
PHV's is faster. Try out time or timex against an artificially large test input a few times for averages. You'll get a feel for speeding up sed scripts.

e.g.

timex <sed command> <input> > /dev/null

PHV uses one command versus three of roughly the same complexity.

Removing the back reference will make this even faster:

sed 's![^,]*[^0-9,][^,]*!"&"!g'
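
A rough sketch of such a comparison follows. It assumes bash, where "time" is a shell keyword (timex is not available everywhere), and the row count is arbitrary. It also checks that the back-reference and "&" variants produce the same output; whether the speed difference is measurable depends on the sed implementation.

```shell
big=$(mktemp)

# Build an artificially large input by repeating the two sample rows.
awk 'BEGIN {
  for (i = 0; i < 50000; i++) {
    print "aaaaa,a123,345,x3e345r,,,wer,234"
    print "bbbbbb,f345,89000,,,,,g45,34"
  }
}' > "$big"

# Time each variant; repeat a few times by hand to get a feel for the average.
time sed 's!\([^,]*[^0-9,][^,]*\)!"\1"!g' "$big" > /dev/null
time sed 's![^,]*[^0-9,][^,]*!"&"!g' "$big" > /dev/null

# Confirm that both variants rewrite the input identically.
sed 's!\([^,]*[^0-9,][^,]*\)!"\1"!g' "$big" > "$big.backref"
sed 's![^,]*[^0-9,][^,]*!"&"!g' "$big" > "$big.amp"
if cmp -s "$big.backref" "$big.amp"; then
  same_output=yes
else
  same_output=no
fi
echo "identical output: $same_output"

rm -f "$big" "$big.backref" "$big.amp"
```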

Cheers,
ND [smile]

bigoldbulldog@hotmail.com
 