Replace Text Based On Pattern Matching

Status
Not open for further replies.

Grizzly521

Technical User
Oct 12, 2009
7
US
Hi, I'm trying to replace text in a file based on a pattern.

The pattern I'm looking for is:

<styleURL>#style0002</styleURL>
<name>#######6105#######</name>

The # are seven alphanumeric characters before and after 6105.

I need it to replace that with:

<styleURL>#style0003</styleURL>
<name>#######6105#######</name>

Any ideas would be appreciated.

Thanks
 
Hi

You did not give enough details, so I will leave it to you to finish:
Code:
awk 'NR>1{if(p~/style0002/&&$0~/6105/)sub(/style0002/,"style0003",p);print p}{p=$0}END{print p}' /input/file
Tested with gawk and mawk.
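
To make the one-liner above easier to follow, here is a small sketch that runs it on made-up input. The sample lines are assumptions for illustration only, not data from the original file. The variable p holds the previous line, so each line is printed one step late, after the following line has had the chance to trigger the substitution.

```shell
# Feed four made-up lines to the one-liner from the post above.
# p is the previous line; it is printed only after the current line
# has been checked for 6105.
result=$(printf '%s\n' \
  '<styleURL>#style0002</styleURL>' \
  '<name>abcdefg6105hijklmn</name>' \
  '<styleURL>#style0002</styleURL>' \
  '<name>abcdefg9999hijklmn</name>' |
  awk 'NR>1{if(p~/style0002/&&$0~/6105/)sub(/style0002/,"style0003",p);print p}{p=$0}END{print p}')
printf '%s\n' "$result"
```

With this input only the first styleURL line should change to style0003, because only that one is followed by a line containing 6105.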

Feherke.
 
Hi

Well, a few questions are raised by this sentence:
Grizzly521 said:
The # are seven alphanumeric characters before and after 6105.
I am wondering why that was important.
* Is the # in front of style0002 also a digit?
* Are there other style0002 occurrences not enclosed in styleURL tags?
* Are there other style0002 occurrences not followed by 6105 enclosed in name tags?
* Are the style0002 and the 6105 always on separate lines?
* Are the style0002 and the 6105 always on consecutive lines?
* Is there a formula to find out the new text, like newtext=oldtext+1 in your example?


Feherke.
 
Thanks for the pointers. I didn't realize how vague I had been.

* Is the # in front of style0002 also a digit?
No. I overlooked that when I picked # to stand for the variable characters.
* Are there other style0002 occurrences not enclosed in styleURL tags?
Yes.
* Are there other style0002 occurrences not followed by 6105 enclosed in name tags?
No.
* Are the style0002 and the 6105 always on separate lines?
Yes.
* Are the style0002 and the 6105 always on consecutive lines?
In the format I showed, yes.
* Is there a formula to find out the new text, like newtext=oldtext+1 in your example?
No.

What I'm looking to replace always looks like

<styleURL>#style0002</styleURL>
<name>*******6105*******</name>

and needs to be turned into

<styleURL>#style0003</styleURL>
<name>*******6105*******</name>

The * are alphanumeric characters, seven before and after 6105.
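
Before changing anything, it may help to list the line pairs that fit this description. The following is only a sketch under the assumptions stated in this post: the two tags sit on consecutive lines and the name holds seven alphanumeric characters on each side of 6105. The helper name list_pairs and the sample lines are made up for illustration.

```shell
# list_pairs is a hypothetical helper: it prints, with line numbers,
# every styleURL/name pair that matches the description above.
list_pairs() {
  awk '
    BEGIN {
      c  = "[0-9A-Za-z]"      # one alphanumeric character
      c7 = c c c c c c c      # seven of them, without needing {7}
      name_re = "<name>" c7 "6105" c7 "</name>"
    }
    # p is the previous line; report the pair when both lines match.
    NR > 1 && p ~ /<styleURL>#style0002<\/styleURL>/ && $0 ~ name_re {
      printf "%d: %s\n", NR - 1, p
      printf "%d: %s\n", NR, $0
    }
    { p = $0 }
  ' "$@"
}

# Made-up input: the first line mentions style0002 outside the tags.
pairs=$(printf '%s\n' \
  'see style0002 in the notes' \
  '<styleURL>#style0002</styleURL>' \
  '<name>abcdefg6105hijklmn</name>' | list_pairs)
printf '%s\n' "$pairs"
```

Called with a file name (list_pairs /input/file) it reads that file instead of standard input. The stray style0002 on the first sample line is not reported, since it is not inside styleURL tags.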
 
Hi

Well, then you only have to extend my previous code to restrict the circumstances in which the replacement is performed:
Code:
awk --re-interval 'NR>1{if(p~/<styleURL>#style0002<\/styleURL>/&&$0~/<name>[[:alnum:]]{7}6105[[:alnum:]]{7}<\/name>/)sub(/<styleURL>#style0002<\/styleURL>/,"<styleURL>#style0003</styleURL>",p);print p}{p=$0}END{print p}' /input/file
Some notes:
* The --re-interval option is gawk only.
* Some awk implementations do not handle POSIX character classes such as [:digit:].
* Some awk implementations do not handle {n} quantifiers.
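
A quick way to see how a given awk behaves on the last two points is to probe it directly. This is a rough sketch: the variable names are made up, and the answers depend entirely on which awk is first in your PATH. An awk without support treats the braces or the class name as literal characters, so the test string simply does not match.

```shell
# Does this awk understand {n} interval expressions?
intervals=$(echo aaa | awk '{ if ($0 ~ /^a{3}$/) print "yes"; else print "no" }')

# Does this awk understand POSIX character classes?
classes=$(echo 7 | awk '{ if ($0 ~ /^[[:digit:]]$/) print "yes"; else print "no" }')

echo "interval expressions: $intervals"
echo "POSIX character classes: $classes"
```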
So another question would be: what kind of awk implementation do you have? Anyway, this is a more portable alternative:
Code:
awk 'NR>1{if(p~/<styleURL>#style0002<\/styleURL>/&&$0~/<name>[0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z]6105[0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z]<\/name>/)sub(/<styleURL>#style0002<\/styleURL>/,"<styleURL>#style0003</styleURL>",p);print p}{p=$0}END{print p}' /input/file
Tested with gawk and mawk.

If that does not work either, you could change some syntax elements to get the equivalent perl code:
Code:
perl -ne 'if($.>1){if($p=~m/<styleURL>#style0002<\/styleURL>/&&$_=~m/<name>[[:alnum:]]{7}6105[[:alnum:]]{7}<\/name>/){$p=~s/<styleURL>#style0002<\/styleURL>/<styleURL>#style0003<\/styleURL>/}print$p}$p=$_;END{print$p}' /input/file
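
Note that the commands in this thread print the result to standard output and leave /input/file untouched. If the file itself should be changed, one common approach is to write to a temporary file and move it over the original once the command has succeeded. The sketch below assumes that approach; in_place is a hypothetical helper name, and the demo file and its contents are made up.

```shell
# in_place is a hypothetical helper: in_place FILE COMMAND [ARGS...]
# It runs COMMAND ARGS... FILE and replaces FILE with the output,
# but only if the command succeeded.
in_place() {
  file=$1
  shift
  tmp=$file.new.$$            # temporary file next to the original
  if "$@" "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Demonstration on a throwaway file, using the first one-liner of the thread.
demo=${TMPDIR:-/tmp}/style_demo.$$
printf '%s\n' '<styleURL>#style0002</styleURL>' '<name>abc12346105xyz9876</name>' > "$demo"
in_place "$demo" awk 'NR>1{if(p~/style0002/&&$0~/6105/)sub(/style0002/,"style0003",p);print p}{p=$0}END{print p}'
cat "$demo"
```

The replaced file is a new file, so its owner and permissions may differ from the original. Keeping a backup copy before the first real run is a sensible precaution.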

Feherke.
 
I'll give it a try at work tonight. I really appreciate it.

It'd be running on Solaris 10.
I can't remember which version I'm running off the top of my head.
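
As far as I know, the default /usr/bin/awk on Solaris 10 is the old awk, which lacks sub(), while nawk and /usr/xpg4/bin/awk provide it. The sketch below picks the first candidate that can actually run sub(). The helper name pick_awk and the candidate list are assumptions about a typical installation, so adjust them to what is present on the machine.

```shell
# pick_awk is a hypothetical helper: it prints the first awk from the
# candidate list that is installed and supports sub().
pick_awk() {
  for candidate in /usr/xpg4/bin/awk nawk gawk awk; do
    if echo x | "$candidate" '{ sub(/x/, "y"); print }' 2>/dev/null | grep y >/dev/null; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

AWK=$(pick_awk) && echo "using $AWK"
```

The chosen program can then be used in place of awk in the commands above, for example "$AWK" '...' /input/file.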
 