Regular expressions

Status
Not open for further replies.

jcstein

Programmer
Mar 3, 2005
4
US
I am trying to validate an email address using regcomp and regexec. However, apparently I am doing it incorrectly.

When I use a simple pattern to match things work well. However, as soon as I start getting into the special escaped regex code it starts to fail. I'm assuming that the pattern is being changed by the compiler due to the escapes.

I've tried double escapes and quadruple escapes. Any clue what I am doing wrong?

The pattern I am trying to use is

char * pattern = "^[a-zA-Z][\\\\w\\\\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\\\\w\\\\.-]*[a-zA-Z0-9]\\\\.[a-zA-Z][a-zA-Z\\\\.]*[a-zA-Z]$";
 
For every one \ you want in the string, write \\ in the source; thus two backslashes (\\) become \\\\ in the source, because 2*2=4.

Here's a regex I've seen (albeit for a different lib) that you may find useful for validating e-mail:
Code:
^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.(([0-9]{1,3})|([a-zA-Z]{2,3})|(aero|coop|info|bitnet|museum|name))$
Which escaped is:
Code:
"^[_a-zA-Z0-9-]+(\\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)*\\.(([0-9]{1,3})|([a-zA-Z]{2,3})|(aero|coop|info|bitnet|museum|name))$"
 
Thanks for the reply. But I first tried just doubling the escapes, and it didn't work then either. I know the pattern is good. I've used it in JavaScript with much success.
 
I'm not familiar with regcomp and regexec, but make sure that they interpret regular expression syntax the way you are expecting (the JavaScript way). Some regex libraries handle certain characters or even whole expressions differently.

I can attest that a backslash only needs to be escaped once to appear in a string in C.
 
That's a point I hadn't thought of: that the library might not support all the codes. Hmm.

Does anyone have a known expression that works for checking email addresses in the standard GNU regex.h?
 
I'm more interested in what flags you are passing to regcomp and regexec. Sometimes that can be the real problem.
A basic match function shouldn't be hard to write.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <regex.h>
#include <unistd.h>

#define MATCH_FLAGS (REG_EXTENDED | REG_ICASE | REG_NOSUB | REG_NEWLINE)
#define EBUFSZ 100
#define printRegErr(str) {printf("REGEXP ERROR: %s\n",str);}

int matchStr (char *, char *);



int
main (int argc, char **argv)
{

  if (argc != 3)
    {
      printf ("Bad args: Please specify string target([1]), Pattern ([2])\n");
      exit (1);
    }
  if (matchStr (argv[1], argv[2]) < 0)
    {
      printf ("No match\n");
      return 0;
    }
  printf ("%s matched in %s\n", argv[2], argv[1]);
  return 0;
}



int
matchStr (char *str, char *pat)
{
  int err;
  char errbuf[EBUFSZ];
  regex_t matchreg;

  if ((err = regcomp (&matchreg, pat, MATCH_FLAGS)) != 0)
    {
      /* regerror NUL-terminates errbuf, so it does not need clearing first */
      regerror (err, &matchreg, errbuf, EBUFSZ);
      printRegErr (errbuf);
      return -1;
    }
  err = regexec (&matchreg, str, 0, NULL, 0);
  /* free the compiled pattern on every path, match or not */
  regfree (&matchreg);
  if (err != 0)
    {
      return -1;
    }
  return 1;
}

I can get this to work for most things...
example:
Code:
me@haunted:~/Documents/Cscraps> ./reg.out "monkey098.doff@foobiskit.com" "[[:alnum:]]+\.[[:alnum:]]+@[[:alnum:]]+\.[[:alnum:]]+$"
[[:alnum:]]+\.[[:alnum:]]+@[[:alnum:]]+\.[[:alnum:]]+$ matched in monkey098.doff@foobiskit.com
me@haunted:~/Documents/Cscraps> ./reg.out "monkey098.doff@foobiskit.com" "^[0-9]"
No match
me@haunted:~/Documents/Cscraps> ./reg.out "monkey098.doff@foobiskit.com" "^[a-z]+\."
No match
me@haunted:~/Documents/Cscraps> ./reg.out "monkey098.doff@foobiskit.com" "^[a-z0-9]+\."
^[a-z0-9]+\. matched in monkey098.doff@foobiskit.com
 
It was apparently the flags I was passing. I didn't have REG_ICASE or REG_NEWLINE turned on. Works now :)

My particular pattern doesn't work, but I made one that does. Thanks for the help.
 