Positive vs. negative look behind pattern search

cptk · Mar 8, 2010

When running the following two zero-width lookbehind assertions:
/(?<!\S)dog/ ## Not preceded by a non-whitespace char
/(?<=\s)dog/ ## Preceded by a whitespace char

... why would I get different results?

That is, the first lookbehind will match if "dog" is at the beginning of the line, but the second lookbehind will not match the same line. Why does the first one match at the beginning of the line ("^") when the second one doesn't?

rharsh · Mar 8, 2010

Given the string 'dog', let's look at the two regexes:

1) /(?<!\S)dog/ -- let's start at the end: match the string dog. Now the negative lookbehind: before dog, look for a non-space character (there are no characters before dog, so that's false), and if that is false, the negative lookbehind is true. Since the negative lookbehind and the other characters match, the regex is true.

2) /(?<=\s)dog/ -- again, this matches dog. Now for the positive-look-behind: there needs to be a space character (there are no characters before dog, so this is false) for the look-behind to be true. Since the pos-look-behind is false, the regex is false.
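
To see the difference concretely, here is a minimal sketch using Python's re module rather than Perl (for these two patterns the lookbehind behaviour is the same). The sample strings and helper name are made up for illustration.

```python
import re

# The two patterns from the question, written for Python's re module.
NEG_LOOKBEHIND = re.compile(r"(?<!\S)dog")  # no non-whitespace char just before "dog"
POS_LOOKBEHIND = re.compile(r"(?<=\s)dog")  # a whitespace char just before "dog"


def matches(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None


# Hypothetical inputs: "dog" at the start, after a space, and after a letter.
samples = ["dog", "big dog", "hotdog"]
results = {
    s: (matches(NEG_LOOKBEHIND, s), matches(POS_LOOKBEHIND, s)) for s in samples
}

for text, (neg, pos) in results.items():
    print(f"{text!r}: negative={neg} positive={pos}")
```

Only the first sample separates the two: with nothing before "dog", the negative lookbehind is satisfied (there is no non-whitespace character to find), while the positive lookbehind fails because it requires an actual whitespace character.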

Does that make sense?

As with most things, double negatives are generally a bad idea for understanding and readability.

cptk · Mar 9, 2010

Thanks for the reply ... yes it makes sense and yes, I'm well aware of the "taboo" of double-negatives ... I'm positive about that (lol)!!

I went with /(?<!\S)dog/ because I could not get the following positive lookbehind alternation to work:
/(?<=^|\s)dog/ ## or any other format combination using ^ and \s

What's funny is that /(?<=^)dog/ will work, but only for dog at the beginning of the line. My ultimate goal is to do what /(?<!\S)dog/ successfully does: match dog either at the beginning of the line or with preceding whitespace.
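
A likely reason the combined form fails is that its two branches have different widths (^ is zero characters wide, \s is one), and Perl versions of that era reject variable-length lookbehind. The sketch below uses Python's re module, which has a similar fixed-width restriction, to show the rejection and one possible workaround that moves the alternation outside the lookbehind. The helper name and test strings are assumptions for illustration.

```python
import re


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Alternation inside the lookbehind: branches of width 0 (^) and 1 (\s).
COMBINED = r"(?<=^|\s)dog"

# Workaround sketch: alternate between the anchor and a fixed-width lookbehind.
WORKAROUND = re.compile(r"(?:^|(?<=\s))dog")

print("combined form compiles:", compiles(COMBINED))
for text in ["dog", "big dog", "hotdog"]:
    print(f"{text!r}:", WORKAROUND.search(text) is not None)
```

The workaround accepts the same inputs as /(?<!\S)dog/ here, so the negative lookbehind remains the shorter way to say "start of line or after whitespace".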

cptk · Mar 9, 2010

If you test on the other side, this works:
/dog(?=\s)/

In other words, it will return either dog at the end of the line or dog followed by whitespace.

So my question boils down to this:
/(?<=\s)dog/ doesn't recognize the ^ but,
/dog(?=\s)/ recognizes the $.
...Why?

rharsh · Mar 9, 2010

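The quick answer is that "\n" (the default for $/, the input record separator) is a character that's in the \s regex character class. A line read from input still ends in "\n" unless you chomp it, so /dog(?=\s)/ is matching that newline, not the $ anchor.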
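
A minimal sketch of that point, again in Python's re module rather than Perl: the positive lookahead only succeeds when a real whitespace character (such as the trailing newline) follows "dog", while the negative form /dog(?!\S)/ also accepts the true end of the string. The variable names and sample strings are assumptions for illustration.

```python
import re

LOOKAHEAD_SPACE = re.compile(r"dog(?=\s)")         # needs a whitespace char after "dog"
LOOKAHEAD_NOT_NONSPACE = re.compile(r"dog(?!\S)")  # no non-whitespace char after "dog"

with_newline = "dog\n"  # a line as read from input, newline still attached
chomped = "dog"         # the same line with the newline removed

for text in (with_newline, chomped):
    pos = LOOKAHEAD_SPACE.search(text) is not None
    neg = LOOKAHEAD_NOT_NONSPACE.search(text) is not None
    print(f"{text!r}: positive={pos} negative={neg}")
```

So the lookahead and lookbehind are symmetric: neither positive form treats the string boundary as whitespace. The lookahead only appears to recognize the end of the line because the newline is still there.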
