Regex boundary matchers?

Status
Not open for further replies.

cpjust

Programmer
Sep 23, 2003
US
Hi,
I'm trying to write a regular expression to test whether the word "all" is somewhere in a comma-delimited list. I'm not sure if I should use this:
Code:
if ( sValue.matches( "(/A|,)all(,|/z)" ) == false )
or this:
Code:
if ( sValue.matches( "(^|,)all(,|$)" ) == false )

Assuming you only have 1 line in a string, is there a difference between using '^' and '/A' or '$' and '/z'?

Also, what's the difference between /z and /Z?
 
At least you should use \A instead of /A and \z instead of /z.

Why don't you just test your regex?

Usually, instead of == false, you would use the not-operator:
Code:
if (! sValue.matches ("(^|,)all(,|$)"))
Note that inside a Java string literal you need a second \ to escape the backslash, so \A is written "\\A".

Code:
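public class RegExAll
{
	/** Runs the three candidate patterns against param and prints "m" (match) or "no" (no match) for each. */
	public RegExAll (String param)
	{
		// 0: \A and \z anchor the pattern to the start and end of the input.
		if (! param.matches ("(\\A|,)all(,|\\z)" ))
		{
			System.out.println ("no 0 \\A\\z");
		}
		else System.out.println ("m 0 \\A\\z");
		// 1: /A and /z are not anchors; they match the literal text "/A" and "/z".
		if (! param.matches ("(/A|,)all(,|/z)" ))
		{
			System.out.println ("no 1 /A/z");
		}
		else System.out.println ("m 1 /A/z");
		// 2: ^ and $ anchor the pattern to the start and end of the line.
		if (! param.matches ("(^|,)all(,|$)"))
		{
			System.out.println ("no 2 ^$");
		}
		else System.out.println ("m 2 ^$");
	}

	/** Tests the first command-line argument, or ",all," if none is given. */
	public static void main (String args[])
	{
		String param = ",all,";
		if (args.length == 1)
		{
			param = args[0];
		}
		new RegExAll (param);
	}

	/** Prints a usage hint (not called in this test). */
	public static void usage ()
	{
		System.out.println ("Usage:\tjava RegExAll param");
	}
}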
Testing on the shell (bash):
Code:
for s in "all" ",all" "all," ",all," "calling" ; do echo "$s"; java RegExAll "$s" ; done
all
m 0 \A\z
no 1 /A/z
m 2 ^$
,all
m 0 \A\z
no 1 /A/z
m 2 ^$
all,
m 0 \A\z
no 1 /A/z
m 2 ^$
,all,
m 0 \A\z
m 1 /A/z
m 2 ^$
calling
no 0 \A\z
no 1 /A/z
no 2 ^$
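
One thing the output above hints at: String.matches only succeeds when the pattern covers the whole string, so none of these three tests would accept a longer list such as "foo,all,bar". Here is a minimal sketch of a "somewhere in the list" test, assuming it is fine to use Pattern and Matcher.find () instead. The class name ListContainsAll and the method containsAll are made up for illustration and are not from the thread.

```java
import java.util.regex.Pattern;

public class ListContainsAll
{
	// Hypothetical helper, not from the thread: "all" must be a complete
	// list item, bounded by the start/end of the input or by commas.
	private static final Pattern ALL_ITEM = Pattern.compile ("(\\A|,)all(,|\\z)");

	public static boolean containsAll (String list)
	{
		// find () looks for a match anywhere in the input, whereas
		// String.matches () requires the pattern to cover the whole input.
		return ALL_ITEM.matcher (list).find ();
	}

	public static void main (String[] args)
	{
		String[] samples = { "all", "foo,all,bar", "foo,all", "calling", "foo,ball,bar" };
		for (String s : samples)
		{
			System.out.println (s + " -> find: " + containsAll (s)
				+ ", matches: " + s.matches ("(\\A|,)all(,|\\z)"));
		}
	}
}
```

With find (), "foo,all,bar" and "foo,all" are accepted, while "calling" and "foo,ball,bar" are still rejected because "all" is not bounded by a comma or the start/end of the input there.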

don't visit my homepage:
 
I already got it working, but thanks.

Yeah, I had the wrong slashes. Doh!
I also forgot about using double backslashes, which is probably why it was failing with \A & \z. I got it working with ^ and $ though. BTW, do you know what the difference between \z & \Z is?

As for using ! instead of == false, I prefer to be explicit, since the ! is easy to miss if you're looking over the code quickly. Plus it just looks too much like old-school C, which I hate.
 
If you put the blanks outside the brackets rather than inside, you will see the not-operator immediately, because it is followed by a blank right at the beginning of the bracketed expression:
Code:
if (! something) return foo;
if (something) return bar;
I haven't used \z and \Z so far. From the Javadocs I guessed at something, which turned out to be wrong, so I don't have any idea.
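
For what it's worth, going by the java.util.regex.Pattern documentation: \z matches only at the very end of the input, while \Z also matches just before a final line terminator. Without the MULTILINE flag, ^ behaves like \A and $ behaves like \Z, so for a single-line string with no trailing newline they all agree. A small sketch to check this follows; the class name EndAnchors and the helper found are made up for illustration.

```java
import java.util.regex.Pattern;

public class EndAnchors
{
	// Hypothetical helper: does the regex match somewhere in the input?
	public static boolean found (String regex, String input)
	{
		return Pattern.compile (regex).matcher (input).find ();
	}

	public static void main (String[] args)
	{
		String withNewline = "all\n";
		System.out.println (found ("all\\z", withNewline)); // false: the newline follows "all"
		System.out.println (found ("all\\Z", withNewline)); // true: final terminator is ignored
		System.out.println (found ("all$", withNewline));   // true: same as \Z by default
		System.out.println (found ("all\\z", "all"));       // true: no trailing newline
	}
}
```

So the difference only shows up when the input ends with a line terminator, which a value read from a file or console often does.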


 
Well that's just a style preference, and it's not my preference. :)
 
500 years of book-printing experience??

One of my previous companies actually had it in their code style guidelines to explicitly use comparison rather than the NOT operator (or lack thereof). Typing a few extra characters is a small price to pay for extra clarity.

What if you had a complex statement? Which would be easier to read, this:
Code:
if ( !fin.is_open() || flag && num == 5 )
or this:
Code:
if ( (fin.is_open() == false) ||
     (flag == true) &&
     (num == 5) )
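
Note that neither version above groups the && part explicitly. In Java (as in C++), && binds tighter than ||, so the condition is read as a || (b && c). A small sketch of the difference, where the parameters open, flag and num are stand-ins for fin.is_open(), flag and num, and the class and method names are made up for illustration:

```java
public class Precedence
{
	// The condition as written in the thread, without extra parentheses.
	public static boolean implicit (boolean open, boolean flag, int num)
	{
		return !open || flag && num == 5;
	}

	// The same condition with the grouping spelled out.
	public static boolean explicit (boolean open, boolean flag, int num)
	{
		return (!open) || (flag && (num == 5));
	}

	// A different condition that a reader might wrongly assume.
	public static boolean misread (boolean open, boolean flag, int num)
	{
		return (!open || flag) && (num == 5);
	}

	public static void main (String[] args)
	{
		// open = false, flag = false, num = 0
		System.out.println (implicit (false, false, 0)); // true
		System.out.println (explicit (false, false, 0)); // true
		System.out.println (misread (false, false, 0));  // false
	}
}
```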
 
The second statement is easier to read for newbies. I would not recommend it because it balloons out the code with a nervous sort of stuttering redundancy.
 
You will end up with
Code:
if (((flag == true) == true) == true)

If you don't understand 'flag', you can't understand flag==true.
@cpjust:
Book-printing style guides are meant to make text readable.
And their rules say:
Normally you put a blank after a punctuation mark, as in "foo, bar! Right?", but not before it. For brackets and quotation marks you put the blank outside, not inside, as in "(this is foo) bar [7]".
Code:
if (! fin.is_open () || flag && num == 5)

Exception: when a punctuation mark follows, as in: The brown fox (who is just an example). There is no blank in front of the period.


 
Well, if you're going to use ancient book typesetting rules, then you should have 2 spaces after the end of a sentence (i.e. after a period, exclamation mark or question mark).

But this is source code, not English, German or whatever...

Anyways, this is about as useless as arguing whether or not you should use Hungarian notation or whether you should use // for comments or /* */...
 
It is source code, but we have to read it. It's the same characters and the same eyes.

Code:
if(!fin.is_open()||flag&&num==5)
if (! fin.is_open () || flag && num == 5)
if ( ! fin.is_open ( ) || flag && num == 5 )
Helping the eye to tokenize the expressions is the idea of indentation, layout and blanks - isn't it?

 
But English has completely different syntax than C++ or Java.
Math also has different rules.
Being explicit about what you mean helps everyone. You don't usually work on a team with all experts. Some people might have 10-20 years of experience, and others might be straight out of school.

I've found LOTS of bugs in my time that could easily have been avoided if people had used parentheses to say exactly what they meant. Not everyone has all the operator precedences memorized, which leads to bugs like this (in C++ at least), where == binds tighter than = and the assignment is not evaluated first:
Code:
if ( value == ret = func() )
which could have been avoided by doing this:
Code:
if ( value == (ret = func()) )
 