Question on non-greedy regular expressions

Status
Not open for further replies.

jescat

Technical User
Jul 29, 2004
32
US
Hello,

I have an XML file that is all on one line. I would like to put each tag on a separate line. I am having problems creating a regex that grabs the shortest match. Here is my example file:

<a>who cares</a><b>not me</b>

my script:

gsub(/<\/(.*)?>/, "n")

outfile:

<a>who caresn

Conclusion:

The regex is grabbing the longest possible string rather than the shortest. I thought my syntax for non-greedy pattern matching was correct, but apparently not.

Any assistance would be appreciated.
Thanks

 
Sorry, correction to the script:

gsub( /<\/(.*)?>/, "&n")
 

In awk, regular expressions are always greedy. Try this:
Code:
gsub( /<\/[^>]*>/, "&\n" )
[^>] means any character except >.
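
To see the difference on the sample line from the question, here is a minimal sketch that runs both patterns through awk from a POSIX shell. The variable names (input, greedy, split) are only for illustration, and it assumes an awk that accepts the (.*)? group, as gawk and mawk do.

```shell
# Sample input from the question: two elements on a single line.
input='<a>who cares</a><b>not me</b>'

# Original attempt: .* is greedy, so the match starts at "</a" and runs
# to the LAST ">" on the line. Only one newline is appended, at the end.
greedy=$(printf '%s\n' "$input" | awk '{ gsub(/<\/(.*)?>/, "&\n"); print }')

# Suggested fix: [^>]* cannot cross a ">", so each closing tag is matched
# on its own and gets its own newline. "&" stands for the matched text.
split=$(printf '%s\n' "$input" | awk '{ gsub(/<\/[^>]*>/, "&\n"); print }')

printf 'greedy:\n%s\n\n' "$greedy"
printf 'negated class:\n%s\n' "$split"
```

The command substitution drops trailing newlines, so the greedy result looks identical to the input, while the negated-class result has one element per line.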
 
Thanks a lot, this worked. I thought non-greedy expressions were supported. I've seen documentation about them, but perhaps it was referring to regular expressions in general. Anyway, thanks, I appreciate it.
 
And what about this?
gsub(/>/,">\n")

Hope This Helps, PH.
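
As a rough comparison, the sketch below applies the simpler gsub(/>/, ">\n") to the same sample line. It breaks after every ">", so opening tags also end up on their own lines, which may or may not be what is wanted. The variable names (input, every_tag) are assumptions made for the example.

```shell
# Same sample line as in the question.
input='<a>who cares</a><b>not me</b>'

# Break the line after every ">" character, opening and closing tags alike.
every_tag=$(printf '%s\n' "$input" | awk '{ gsub(/>/, ">\n"); print }')

printf '%s\n' "$every_tag"
```

With this pattern each opening tag sits alone on a line and the text stays attached to its closing tag, whereas the closing-tag-only pattern keeps each whole element together.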
 