Regular expression using negated class

Status
Not open for further replies.

piquet

Programmer
Nov 25, 2002
19
GB
Hi, I need to find a regular expression I can use in AWK to match a string which does NOT contain a sub-string in a fixed position. The string must begin with a pattern, but then NOT contain the sub-string.

Here's an example - the string must not contain "XP" after "420" :

"420" - is ok
"420 " - is ok
"420XP" - is ok
"420AB" - is NOT ok
"420AC" - is ok

I think the answer might make use of "^" to negate a match with "XP" but I can't find the correct regular expression. If anyone can help, I'd be most grateful.

Cheers,

Phil
 
I'm missing something.
Your example is either wrong or unclear.

Take this as an idea:
if (whatever !~ /[0-9]+XP/) {
    # more code
}
 
Hi marsd - thanks! I forgot to mention that I have to check using ~ (contains) and not !~ (does not contain):

if (code ~ /420 ... something that means anything (or nothing) but "XP"

So, the negating has to be done within the regular expression.

Any ideas?
 
It seems like you made a mistake in your examples. Shouldn't 420XP be not ok and 420AB be ok? How about something like

/420[^X][^P]/

CaKiwi
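A quick way to see what this pattern accepts is to run it over a few sample codes. The following is only a sketch: the function name `check_bracket` and the output format are made up for illustration, and the codes are passed one per line on standard input.

```sh
# check_bracket: classify each argument with the pattern /420[^X][^P]/
check_bracket() {
    printf '%s\n' "$@" | awk '{
        # Each bracket expression consumes exactly one character,
        # so two characters must follow 420 for a match.
        verdict = ($0 ~ /420[^X][^P]/) ? "ok" : "NOT ok"
        printf "[%s] %s\n", $0, verdict
    }'
}
```

Called as `check_bracket 420 "420 " 420XP 420AB 420XA 420AP`, it reports only 420AB as ok. The bare "420" and "420 " fail because fewer than two characters follow, and 420XA and 420AP fail because each position is tested on its own rather than as the pair "XP".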
 
Yes - both of you are correct. I'm sorry, the example I gave was incorrect in the way you suspected.

The string must not contain "XP" after "420" :

"420" - is ok
"420 " - is ok
"420XP" - is NOT ok
"420AB" - is ok
"420AC" - is ok

/420[^X][^P]/ is the closest I have found so far, but it does not allow "420" or "420 " through. I need something at the end there which means there doesn't have to be anything in positions 4 and/or 5.

Cheers, Phil
 
Yep, that works, thanks very much!

Anyone want to win a gold star for doing it in a single expression - i.e. no "|"s !?

Phil
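One possible way to do this with no "|" is to lean on optional groups instead of alternation. This is a sketch, not something posted in the thread, and it assumes the code is the whole line, so that the expression can be anchored with ^ and $. The function name `no_xp` is made up for illustration.

```sh
# no_xp: classify each argument with a single alternation-free expression
no_xp() {
    printf '%s\n' "$@" | awk '{
        # After 420 the rest is either empty, starts with a non-X character,
        # or starts with X that is not followed by P. Both groups are optional.
        verdict = ($0 ~ /^420([^X].*)?(X([^P].*)?)?$/) ? "ok" : "NOT ok"
        printf "[%s] %s\n", $0, verdict
    }'
}
```

Called as `no_xp 420 "420 " 420XP 420AB 420XA 420AP`, it reports everything except 420XP as ok. The same text without the slashes can be stored as a string and used on the right-hand side of `~`, which is what a pattern read from a mapping file would need.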
 
Why not find what you don't want and skip it? Then proceed with the 420 search. The following is for $0. You'll have to do a bit more to rewrite it for specific field(s). This also accepts strings like 420XA and 420AP.

/420XP/{next}
/420/{
# do something
print
}

In essence, this says look for the one to exclude first - if found, jump ahead. Otherwise you have to write big, ugly and non-generic expressions for your global or field-specific needs.

If global ($0), then what about repeating occurrences? Are these as simple as your examples?

Cheers,
ND [smile]

bigoldbulldog@hotmail.com
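For a specific field rather than $0, the same exclude-first idea might look like the sketch below. It assumes the code is in the first field and starts the field, hence the ^ anchors, which the version above does not use. The function name `group_codes` is made up for illustration.

```sh
# group_codes: print the codes that begin with 420 but not with 420XP
group_codes() {
    printf '%s\n' "$@" | awk '
        $1 ~ /^420XP/ { next }        # excluded code: skip this record
        $1 ~ /^420/   { print $1 }    # any other code beginning with 420
    '
}
```

Called as `group_codes 420 420XP 420AB 421AB`, it prints 420 and 420AB, one per line.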
 
Hi ND, thanks - yes that is simple. Trouble is I have an application which checks in dozens of places for a single pattern read in from a mapping file to assign a group name based on the pattern of the code :

BEGIN {
    # Load the pattern-to-group mapping from the file named by the first argument
    while ((getline < ARGV[1]) > 0) { patterns[$1] = $2 }
    # patterns["420XP"] = "Group 1"
    # patterns["420.."] = "Group 2"   # for example
    ARGV[1] = ""
}
{
    code = $1
    for (x in patterns) {
        # x is the pattern, patterns[x] is the group name
        if (code ~ x) { group = patterns[x] }
    }
}

The above works ok for 5 character codes but not for the 3 character code, because I want "Group 2" to be anything starting with "420" but it must NOT have "XP" in positions 4 or 5. This check simply allows anything starting with "420".

Phil
 
Assuming you've done your exclusion check on 420XP first, "420.." is better replaced with something like the following - a regular expression match rather than a string match.

OK="420"   # a regex held in a string takes no surrounding slashes
...
if( var ~ OK ){ ... }

You might have to do some string length checking, such as [3,5].

Cheers,
ND [smile]

bigoldbulldog@hotmail.com
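Doing the exclusion check first only helps if the patterns are tried in a known order, and `for (x in patterns)` visits the keys in an unspecified order. A minimal sketch of an ordered, first-match-wins lookup follows. It assumes a tab-separated mapping file (pattern, then group name) listed from most specific to least specific. The function name `assign_groups`, the file layout and the "unassigned" fallback are all assumptions for illustration.

```sh
# assign_groups MAPFILE CODE...: print each code with the first group whose pattern matches
assign_groups() {
    map=$1; shift
    printf '%s\n' "$@" | awk -v map="$map" '
        BEGIN {
            # Keep the patterns in file order so the specific ones are tried first
            while ((getline line < map) > 0) {
                split(line, f, "\t")
                n++; pat[n] = f[1]; grp[n] = f[2]
            }
            close(map)
        }
        {
            group = "unassigned"
            for (i = 1; i <= n; i++)
                if ($1 ~ pat[i]) { group = grp[i]; break }
            print $1 ": " group
        }
    '
}
```

With a mapping file whose first line is `^420XP` and "Group 1" and whose second line is `^420` and "Group 2", the code 420XP lands in Group 1, while 420 and 420AB fall through to Group 2.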
 