sed output substring like

Status
Not open for further replies.

moueza

Technical User
Oct 8, 2008
2
FR
Hi,
In sed we can refer to a captured substring in the output with \1. How can we do the same in awk?
 
Hi

Do you mean referring to captured substrings of a regular expression? Like this?
Code:
master # echo 'Hello World !' | sed 's/\(.\)/(\1) /g'
(H) (e) (l) (l) (o) ( ) (W) (o) (r) (l) (d) ( ) (!)

master # echo 'Hello World !' | awk '{print gensub(/(.)/,"(\\1) ","g")}'
(H) (e) (l) (l) (o) ( ) (W) (o) (r) (l) (d) ( ) (!)
As far as I know, that is supported only in the gensub() function, which is a gawk extension.


Feherke.
 
The *ahem* portable awk version... not very short, is it!

Code:
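echo 'Hello World!' | awk '
        {
                # match() sets RSTART and RLENGTH for the leftmost match
                while (match($0,".")) {
                        # pass the text as an argument so a "%" in the input is not read as a format
                        printf "(%s)", substr($0,RSTART,RLENGTH)
                        # drop everything up to and including the match
                        $0=substr($0,RSTART+RLENGTH)
                }
                print
        }
'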


Annihilannic.
 
Hi

Nice one, Annihilannic. One minor miss: you completely omitted the part before the match. That shows up if you do not capture everything (for example, marking all vowels):
Code:
master # echo 'Hello World !' | sed 's/\([aeiou]\)/(\1)/g'
H(e)ll(o) W(o)rld !

master # echo 'Hello World !' | awk '{print gensub(/([aeiou])/,"(\\1)","g")}'
H(e)ll(o) W(o)rld !
But the correction is simple:
Code:
master # echo 'Hello World!' | awk '
        {
                while (match($0,"[aeiou]")) {
                        printf "%s(%s)",substr($0,1,RSTART-1),substr($0,RSTART,RLENGTH)
                        $0=substr($0,RSTART+RLENGTH)
                }
                print
        }
'
The big problem comes when you have to capture more than one substring (for example, swapping each vowel with the character that follows it):
Code:
master # echo 'Hello World !' | sed 's/\([aeiou]\)\(.\)/(\2\1)/g'
H(le)l( o)W(ro)ld !

master # echo 'Hello World !' | awk '{print gensub(/([aeiou])(.)/,"(\\2\\1)","g")}'
H(le)l( o)W(ro)ld !
I know, you can do that too with a bit of coding.
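
For what it's worth, here is a minimal sketch of what that bit of coding could look like in portable awk. It assumes, as in the example above, that each captured piece is exactly one character wide, so substr() can split the match by position. The function name swap_vowel_next and the variable names are made up for illustration.

```shell
# Hypothetical helper: swap each vowel with the character after it,
# wrapping the swapped pair in parentheses (no gensub() needed).
swap_vowel_next() {
    awk '
    {
        out = ""
        rest = $0
        # match() sets RSTART/RLENGTH for the leftmost vowel + next character
        while (match(rest, /[aeiou]./)) {
            vowel  = substr(rest, RSTART, 1)       # what sed calls \1
            nextch = substr(rest, RSTART + 1, 1)   # what sed calls \2
            # keep the text before the match, then emit the swapped pair
            out = out substr(rest, 1, RSTART - 1) "(" nextch vowel ")"
            rest = substr(rest, RSTART + RLENGTH)
        }
        # whatever is left contains no further match
        print out rest
    }'
}

echo 'Hello World !' | swap_vowel_next
# prints: H(le)l( o)W(ro)ld !
```

With captures of variable width this positional trick no longer works, and each group would need its own match() call on the matched text.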

The next question is how far we should go with this, since even gawk's capability is limited. For example, replacing doubled letters with a power needs a backreference inside the pattern itself:
Code:
master # echo 'Hello Millennium !' | sed 's/\(.\)\1/\1[sup]2[/sup]/g'
Hel[sup]2[/sup]o Mil[sup]2[/sup]en[sup]2[/sup]ium !
I would say, this is simply not a job for awk.
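
That said, this particular case can be done by hand in portable awk if you give up the regular expression and compare neighbouring characters yourself. A rough sketch, assuming the same left-to-right, non-overlapping pairing that sed's g flag gives. The name mark_doubles is hypothetical, and the literal [sup]2[/sup] marker is simply copied from the sed example.

```shell
# Hypothetical helper: replace each doubled character with the
# character followed by the marker used in the sed example.
mark_doubles() {
    awk '
    {
        out = ""
        n = length($0)
        i = 1
        while (i <= n) {
            c = substr($0, i, 1)
            if (i < n && substr($0, i + 1, 1) == c) {
                # same character twice in a row: emit it once plus the marker
                out = out c "[sup]2[/sup]"
                i += 2
            } else {
                out = out c
                i++
            }
        }
        print out
    }'
}

echo 'Hello Millennium !' | mark_doubles
# prints: Hel[sup]2[/sup]o Mil[sup]2[/sup]en[sup]2[/sup]ium !
```

It works, but it is several times longer than the sed one-liner, which rather supports the point.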

Feherke.
 