Repetitive pattern with different keyword

Status
Not open for further replies.

AA2K5

Programmer
Apr 24, 2002
31
GB
Hi,

I have a repetitive regular expression that I am matching and I am performing the same action over and over.

I don't really want to write out the code every time for each word I want to match. I was wondering if anyone had a tip to reduce code duplication.

Here's a sample of the awk code I am using:

Code:
/[[:graph:]]+[[:space:]]*\<keyword\>/ {
    gsub(/\<keyword\>/, "\n" "KEYWORD")
}

I'm matching the keyword and putting a newline before it and forcing the keyword to uppercase. Basically I have to do this for a long list of keywords.

Any help would be appreciated

Thanks
 
Put all your keywords in an array and then play with the match, gsub and toupper functions inside a for (... in ...) loop.

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 

I thought this would be the logical approach but I just wasn't sure whether awk allowed all the match, gsub etc. functions when using array elements.

Thanks for your help.
 
Just another quick question.

If I'm in a for loop traversing all the elements of an array, how can I reference an array element in a regular expression?

For example, I've tried the following code, but I'm pretty sure it's wrong because I don't think I'm referencing the array properly in the regular expressions:

Code:
BEGIN {
    IGNORECASE=1
}

array[1] = "keyword1"
array[2] = "keyword2"

{
  for (x in array)
  {
    if($0 ~ /[[:graph:]]+[[:space:]]*\<array[x]\>/)
    {
      gsub(/\<array[x]\>/, "\n" toupper(array[x]), $0)
    }
  }

  printf "%s\n",$0

}

END   {}

When I run this code, the script adds the newlines and capitalisation but also outputs each line several times.
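The repeated output most likely comes from the two array assignments sitting outside any braces. At the top level of an awk program, an expression on its own is a pattern, and a pattern with no action prints the record when it is true. The sketch below reproduces just that effect with a two-line sample input (the input and the `result` variable are made up for illustration):

```shell
result=$(printf 'a\nb\n' | awk '
# A bare assignment at the top level is a pattern with no action.
# Its value is a non-empty string, so it is true and awk prints the record.
array[1] = "keyword1"
array[2] = "keyword2"

# The intended output statement prints the record a third time.
{ printf "%s\n", $0 }
')
printf '%s\n' "$result"
```

Each input line should come out three times. Moving the assignments into the BEGIN block leaves only the explicit print.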

 
And this?

Code:
BEGIN {
    IGNORECASE=1
    array[1] = "keyword1"
    array[2] = "keyword2"
}

{
  for (x in array)
    if($0 ~ "[[:graph:]]+[[:space:]]*\<"array[x]"\>")
      gsub("\<"array[x]"\>", "\n" toupper(array[x]), $0)
  print
}

Hope This Helps, PH.
 
I tried the code in the previous post but unfortunately it did not match the keywords in the array. I had a line like this:

Some text and here is the keyword. Some more text

I expected this to come out like this when "keyword" was in the array:

Some text and here is the
keyword. Some more text


However there was no newline and the text stayed the same. Also I got the following warnings when running the script:

awk: C:\awk\test.awk:10: warning: escape sequence `\<' treated as plain `<'
awk: C:\awk\test.awk:10: warning: escape sequence `\>' treated as plain `>'
 
And this?

Code:
BEGIN {
    IGNORECASE=1
    array[1] = "keyword1"
    array[2] = "keyword2"
}

{
  for (x in array)
    if($0 ~ "[[:graph:]]+[[:space:]]*"array[x])
      gsub(array[x], "\n" toupper(array[x]), $0)
  print
}

Hope This Helps, PH.
 

Yep, thanks for that, it works.

Unfortunately, without the \< and \> word-boundary operators the keyword matches as a substring, so "keywords", "keyword1" or "keyword100" would all match.

But I need an exact match on the keyword, so I need to find a way round it (assuming there is one).
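One possible way round it, sketched here rather than taken from the thread, is to emulate the word boundaries with ordinary bracket expressions, so the script does not depend on \< and \> or on IGNORECASE. The helper name `break_before`, the keyword list and the sample input are all made up for illustration. Keywords are assumed to contain only letters, digits and underscores. Unlike the original pattern, the sketch does not check that some text precedes the keyword.

```shell
# Sample input: the line from the thread plus two lines that probe the edge cases.
result=$(printf '%s\n' \
    'Some text and here is the keyword. Some more text' \
    'keywords and keyword100 stay put' \
    'mixed KeyWord2 case' | awk '
BEGIN {
    # Hypothetical keyword list; plain letters, digits and underscores only.
    n = split("keyword keyword2", kw, " ")
}

# Return s with a newline plus the uppercased word in place of every
# whole-word, case-insensitive occurrence of word.
function break_before(s, word,    padded, out, re, wlen) {
    word = tolower(word)
    wlen = length(word)
    # A whole word has a non-word character on both sides.
    re = "[^a-z0-9_]" word "[^a-z0-9_]"
    out = ""
    # Pad a lowercased copy so a keyword at either end of the line
    # still has a neighbour on both sides.
    padded = " " tolower(s) " "
    while (match(padded, re)) {
        # The pad shifts indexes by one, so the keyword starts at RSTART in s.
        out = out substr(s, 1, RSTART - 1) "\n" toupper(word)
        s = substr(s, RSTART + wlen)
        padded = " " tolower(s) " "
    }
    return out s
}

{
    line = $0
    for (i = 1; i <= n; i++)
        line = break_before(line, kw[i])
    print line
}
')
printf '%s\n' "$result"
```

With this input, "keyword" and "KeyWord2" should each move to a new line in upper case, while "keywords" and "keyword100" are left alone. If staying with gawk's own operators is preferred, the warnings quoted above arise because a string constant is escape-processed before it is used as a regex, so doubling the backslashes in the string, as in "\\<" array[x] "\\>", is the usual remedy.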
 