sed output substring like

moueza · Oct 8, 2008

hi,
In sed, we can refer to a captured substring in the output with \1. How can we do that in awk?

feherke · Oct 9, 2008

Hi

Do you mean referring to captured substrings of a regular expression? Like this?

Code:

master # echo 'Hello World !' | sed 's/\(.\)/(\1) /g'
(H) (e) (l) (l) (o) ( ) (W) (o) (r) (l) (d) ( ) (!)

master # echo 'Hello World !' | awk '{print gensub(/(.)/,"(\\1) ","g")}'
(H) (e) (l) (l) (o) ( ) (W) (o) (r) (l) (d) ( ) (!)

As far as I know, that is supported only by the gensub() function, which is a gawk extension.
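
gawk's match() also accepts an optional third array argument that receives the parenthesized groups, which may help when the pieces are needed outside a substitution. A minimal sketch, assuming gawk is installed (the function name show_groups is made up for illustration):

```shell
# show_groups is a hypothetical helper; it requires gawk, not POSIX awk.
show_groups() {
    gawk '{
        # With a third argument, gawk fills m[0] with the whole match
        # and m[1], m[2], ... with the parenthesized groups.
        if (match($0, /([aeiou])(.)/, m))
            print "whole=" m[0] " 1=" m[1] " 2=" m[2]
    }'
}
# Usage: echo 'Hello World !' | show_groups
```

For the input 'Hello World !' this reports only the first match, "el", with "e" and "l" as the two groups.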

Feherke.

http://rootshell.be/~feherke/

Annihilannic · Oct 9, 2008

The *ahem* portable awk version... not very short, is it!

Code:

echo 'Hello World!' | awk '
        {
                while (match($0,".")) {
                        # "%s" keeps a literal % in the input from being read as a format
                        printf "(%s)", substr($0,RSTART,RLENGTH)
                        $0=substr($0,RSTART+RLENGTH)
                }
                print
        }
'

Annihilannic.

feherke · Oct 10, 2008

Hi

Nice one, Annihilannic. One minor miss: you completely omitted the part before the match. That becomes visible if you do not capture everything (here, marking all vowels):

Code:

master # echo 'Hello World !' | sed 's/\([aeiou]\)/(\1)/g'
H(e)ll(o) W(o)rld !

master # echo 'Hello World !' | awk '{print gensub(/([aeiou])/,"(\\1)","g")}'
H(e)ll(o) W(o)rld !

But the correction is simple:

Code:

master # echo 'Hello World!' | awk '
        {
                while (match($0,"[aeiou]")) {
                        printf "%s(%s)",substr($0,1,RSTART-1),substr($0,RSTART,RLENGTH)
                        $0=substr($0,RSTART+RLENGTH)
                }
                print
        }
'
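
When the captured group is the whole match, as in the vowel example, portable awk may not need the loop at all: in the replacement string of sub() and gsub(), & stands for the matched text. A small sketch (the function name wrap_vowels is only illustrative):

```shell
# wrap_vowels is a hypothetical helper using only POSIX awk features.
wrap_vowels() {
    # "&" in the replacement is replaced by the text that matched.
    awk '{ gsub(/[aeiou]/, "(&)"); print }'
}

echo 'Hello World !' | wrap_vowels
# prints: H(e)ll(o) W(o)rld !
```

This covers only the case where \1 would be the entire match; as soon as the pattern has text outside the group, or more than one group, & is no longer enough.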

The real problem comes when you have to capture more than one substring (here, swapping each vowel with the next character):

Code:

master # echo 'Hello World !' | sed 's/\([aeiou]\)\(.\)/(\2\1)/g'
H(le)l( o)W(ro)ld !

master # echo 'Hello World !' | awk '{print gensub(/([aeiou])(.)/,"(\\2\\1)","g")}'
H(le)l( o)W(ro)ld !

I know, you can do that too with a bit of coding.
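
One way that bit of coding might look in portable awk is sketched below. It relies on the assumption that both groups are exactly one character wide, so the match can be split by position; groups of variable width would need more work. The function name swap_vowel_next is made up for illustration.

```shell
# swap_vowel_next is a hypothetical helper using only POSIX awk
# features (match, substr, RSTART, RLENGTH).
swap_vowel_next() {
    awk '
    {
        out = ""
        rest = $0
        while (match(rest, /[aeiou]./)) {
            # Copy the text before the match unchanged.
            out = out substr(rest, 1, RSTART - 1)
            # Both groups are one character wide, so split by position.
            first  = substr(rest, RSTART, 1)
            second = substr(rest, RSTART + 1, 1)
            out = out "(" second first ")"
            # Continue with whatever follows the match.
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}

echo 'Hello World !' | swap_vowel_next
# prints: H(le)l( o)W(ro)ld !
```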

The next question would be how far we should go with this, since even gawk's capability is limited: gensub() accepts \1 only in the replacement, not inside the pattern (here, replacing double letters with a power):

Code:

master # echo 'Hello Millennium !' | sed 's/\(.\)\1/\1²/g'
Hel²o Mil²en²ium !

I would say this is simply not a job for awk.
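
For completeness, this particular case can be approximated in portable awk with a character-by-character loop instead of a regular expression, which arguably supports the point that sed is the better fit. A rough sketch, where the function name square_doubles is invented and ^2 is used as an ASCII stand-in for the superscript marker:

```shell
# square_doubles is a hypothetical helper using only POSIX awk features.
square_doubles() {
    awk '
    {
        out = ""
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (i < n && substr($0, i + 1, 1) == c) {
                # Same character twice in a row: emit it once with a marker
                # and skip the second copy.
                out = out c "^2"
                i++
            } else {
                out = out c
            }
        }
        print out
    }'
}

echo 'Hello Millennium !' | square_doubles
# prints: Hel^2o Mil^2en^2ium !
```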

Feherke.

http://rootshell.be/~feherke/
