Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

FIND FIELDS CONTAINING LETTERS OF THE ALPHABET

Status
Not open for further replies.

coldbohemia

Programmer
Jun 5, 2008
30
US
I HAVE A SCRIPT THAT (TRIES) TO DELETE ALL RECORDS
IN A MYSQL TABLE THAT DO NOT CONTAIN AT LEAST
ONE LETTER OF THE ALPHABET

THE FIELD IN QUESTION (GENERIC_ROW) IS A CHAR() TYPE AND THE DATABASE ENGINE IS INNODB TYPE
THE FILE IS NOT INDEXED.

HERE IS THE OFFENDING COMMAND:

$query="delete from temptables.rfile_temp
where trim(temptables.rfile_temp.generic_row)
not (REGEXP '[:alpha:]+' )";
$result=mysql_query($query) ;


I'VE BROWSED A LOT OF WEB SITES TRYING TO GET SOME CLUE
AS TO WHAT I'M MISSING, BUT NO LUCK.
I HAVE ALSO READ AND RE-READ THE MYSQL MANUAL CONCERNING REGEX, SO IT'S NOT FOR WANT OF TRYING.

THANKS ALL
 
Swap your "alpha" for "_alpha_" and see what happens.

BTW: Please don't use caps for your text, it is considered to be shouting. :)



Trojan.
 
I think that your MySQL statement should be something like
Code:
DELETE FROM temptables.rfile_temp
 WHERE generic_row NOT REGEXP '[red][[/red][:alpha:][red]][/red]+'
Note the extra [red][[/red] and [red]][/red] brackets which are missing in your example.

Andrew
Hampshire, UK
 
With hindsight I think my advice was rubbish.

I must say that from 5 minutes of tinkering I'm stunned at how unpredictable MySQL's regex engine seems to be.

BTW: in principle, the + operator (in this case) would only serve to slow down your match. A regex engine would usually try to satisfy the longest possible match (greedy) and hence would match as many alpha characters each time as it could. If the optimiser is really clever it might notice that it only actually needs to match one character but I think that unlikely.
The reality is then that the code will probably match many times more characters that it needs to with no change at all in the resultant dataset produced. That is, it'll take longer than necessary to execute.




Trojan.
 
thanks lots, guys

trojanwarblade wasn't kidding about the
unpredictability of mysql regex

i finally made it work by comparing the regexp to 0 (zero):
'not' doesn't work

one wierd thing:
if i put only one bracket on each side of the
expression, the query removes all records without letters
including one that it shouldn't have
this record contains letters, numbers, and
a pound sign

the record in question contains this text
store # 1234

this worked:
$query="delete from temptables.rfile_temp
where trim(temptables.rfile_temp.generic_row)
REGEXP '[[:alpha:]]' = 0 ";
$result=mysql_query($query) ;

thanks again
sorry about the upper case trojanwarblade,
didn't mean to shout
 
Hahaha, no problemo.

I initially didn't realise that MySQL regexs are case insensitive so that drove me nuts for a while.
But even knowing that I still couldn't have confidence in them. Plus the docs suggest that they work on a byte by byte basis so if you have multibyte characters the results are "unpredictable".

Very scary.

Yuk!




Trojan.
 
Been a while since I've looked at MySQL but here's a couple of observations.

The reason (I think) you're REGEXP fails when you don't use two sets of brackets is that without the outer brackets it's NOT classed as a character class (so essentially it's looking for an a, l, p, h, :). If you want to only use single brackets you could use [a-z]

I think (from the documentation, I've not got any MySQL stuff set up at work) that the syntax for using NOT is
Code:
where NOT (trim(temptables.rfile_temp.generic_row) REGEXP '[[:alpha:]]')";
As I say, I can't test this but you might even be able to do away with the NOT and =0 by negating the character class ^[^a-z]+$ should match records that don't contain alpha characters (again, as I say, untested from documentation).

You can also get around the case-insensitivity by using BINARY strings.

Hope this helps

HarleyQuinn
---------------------------------
Black coat, white shoes, black hat, cadillac. The boy's a timebomb!

You can hang outside in the sun all day tossing a ball around, or you can sit at your computer and do something that matters. - Eric Cartman

Get the most out of Tek-Tips, read FAQ222-2244: How to get the best answers before post
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top