Tek-Tips is the largest IT community on the Internet today!

Positive vs. negative look behind pattern search

Status
Not open for further replies.

cptk

Technical User
Mar 18, 2003
305
US
When running the following two zero-width lookbehind assertions:
/(?<!\S)dog/ ## not preceded by a non-whitespace char
/(?<=\s)dog/ ## preceded by a whitespace char

... why would I get different results?

That is, the first lookbehind matches when "dog" is at the beginning of the line, but the second does not match that same line. Why does the first one match at the beginning of the line ("^") while the second one doesn't?
 
Given the string 'dog', let's look at the two regexes:

1) /(?<!\S)dog/ -- let's start at the end: match the string dog. Now the negative lookbehind: before dog, look for a non-whitespace character. There are no characters before dog, so that test is false, and because the test is false the negative lookbehind succeeds. Since both the negative lookbehind and the literal characters match, the regex matches.

2) /(?<=\s)dog/ -- again, this matches dog. Now for the positive lookbehind: there needs to be a whitespace character before dog for the lookbehind to succeed. There are no characters before dog, so it fails. Since the positive lookbehind fails, the regex does not match.
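The two cases above can be checked with a short script. This is only a minimal sketch: the helper name matches and the extra test strings ' dog' and 'hotdog' are made up for illustration and are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the compiled regex matches $text, else 0.
sub matches {
    my ($pattern, $text) = @_;
    return $text =~ $pattern ? 1 : 0;
}

my $neg = qr/(?<!\S)dog/;   # negative lookbehind: no non-whitespace char before dog
my $pos = qr/(?<=\s)dog/;   # positive lookbehind: a whitespace char before dog

# 'dog' is the string from the question; the other two are extra illustrations.
for my $text ('dog', ' dog', 'hotdog') {
    printf "%-10s negative=%d positive=%d\n",
        "'$text'", matches($neg, $text), matches($pos, $text);
}
```

For 'dog' only the negative lookbehind matches, for ' dog' both match, and for 'hotdog' neither matches. The difference shows up only where there is no preceding character at all.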

Does that make sense?

Like most things, it's generally a bad idea for understanding and readability to use double negatives.
 
Thanks for the reply ... yes it makes sense and yes, I'm well aware of the "taboo" of double-negatives ... I'm positive about that (lol)!!

I went with /(?<!\S)dog/ because I could not get the following positive lookbehind alternation to work:
/(?<=^|\s)dog/ ## or any other combination using ^ and \s

What's funny is that /(?<=^)dog/ works, but only for dog at the beginning of the line. My ultimate goal is to do what /(?<!\S)dog/ already does: match dog either at the beginning of the line or preceded by whitespace.
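A likely reason /(?<=^|\s)dog/ fails is that ^ is zero-width while \s is one character wide, and Perl versions before 5.30 reject a lookbehind whose alternatives differ in length (newer versions accept some cases, but only as an experimental feature). One possible workaround, sketched here under that assumption, is to move the ^ outside the lookbehind. The variable names, the helper and the test strings are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the compiled regex matches $text, else 0.
sub matches {
    my ($pattern, $text) = @_;
    return $text =~ $pattern ? 1 : 0;
}

my $neg_lookbehind = qr/(?<!\S)dog/;        # the pattern from the post
my $alt_outside    = qr/(?:^|(?<=\s))dog/;  # ^ kept outside the lookbehind

# Illustrative inputs: start of string, after a space, inside a word.
for my $text ('dog', 'big dog', 'hotdog') {
    printf "%-10s neg_lookbehind=%d alt_outside=%d\n",
        "'$text'", matches($neg_lookbehind, $text), matches($alt_outside, $text);
}
```

Both patterns agree on these inputs: they match 'dog' and 'big dog' and reject 'hotdog'. The negative lookbehind is shorter, while the alternation spells out the two accepted cases.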
 
If you test on the other side, this works:
/dog(?=\s)/

In other words, it matches dog either at the end of the line or followed by whitespace.

So my question boils down to this:
/(?<=\s)dog/ doesn't recognize the ^ but,
/dog(?=\s)/ recognizes the $.
...Why?
 
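The quick answer is that "\n" (the default for $/) is a character in the \s regex character class. A line read from input still ends with "\n" unless you chomp it, so dog at the end of the line is followed by a whitespace character and (?=\s) succeeds. Nothing at all precedes the first character of the line, so (?<=\s) has no character to match.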
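The newline explanation can be checked by comparing a line before and after chomp. This is a small sketch; the sample line "hot dog\n" and the variable names are assumptions for illustration, standing in for a line read with the default record separator.

```perl
use strict;
use warnings;

# Sample line as it would arrive from <FH>: the trailing "\n" is still attached.
my $line = "hot dog\n";
my $with_newline = $line =~ /dog(?=\s)/ ? 1 : 0;     # "\n" satisfies \s

# Work on a copy with the trailing newline removed.
chomp(my $chomped = $line);
my $after_chomp = $chomped =~ /dog(?=\s)/ ? 1 : 0;   # nothing follows dog now
my $neg_ahead   = $chomped =~ /dog(?!\S)/ ? 1 : 0;   # negative form still matches

print "with newline:          $with_newline\n";
print "after chomp:           $after_chomp\n";
print "after chomp, (?!\\S):   $neg_ahead\n";
```

With the newline attached, /dog(?=\s)/ matches; after chomp it no longer does, while the negative lookahead /dog(?!\S)/ matches either way. So the lookahead never really recognized $; it only saw the trailing newline.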
 