Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Search with / without didractis signs. 2

Status
Not open for further replies.

Koen Piller

Programmer
Jun 30, 2005
841
NL
Hi,

I would like to have a search procedure which will find names even if they have diactric signs,

So when I search for Aäron, I would like to find Aaron Cohen and also Aäron Baruch,
an when I search for Aaron, I would like to find Aaron Cohen and also Aäron Baruch
and the best thing would be to have a setting which would find only the Aäron Baruch for a search on Aäron

I cant think of anything else than to put an extra field in my dbf where the names are without the diactric signs, also believe there should be a procedure to accomplish this, only how?

Stay healthy,

Koen
 
Hmmmm,

It seems I was trapped with the example/test where locate was used, locate finds, as Mike perfectly pointed out, the first one only. Now have made the following test:

Code:
SELECT FULLNAME FROM NAMES WHERE  upper(ANGLISIZE("Aäron"))$upper(ANGLISIZE(Names.Fullname))

and Yes! it shows 4 reccords:

"Aaron Piller"
"Aäron Poks"
"Aäron Baruch"
"Aaron Cohen"

Perfect.
Griff, I owe you one. Thanks. This "Anglisize" is now part of my default procedure files.
Grr So sorry I firstly overlooked Mike's perfect observation.
And no extra field ness in my dbf

Stay Healthy,

Koen
 
No problem,

Could you add an order to that

Code:
SELECT FULLNAME FROM NAMES WHERE  upper(ANGLISIZE("Aäron"))$upper(ANGLISIZE(Names.Fullname)) order by upper(ANGLISIZE(Names.Fullname))

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
An improvement to Anglisize would be to parameterise the Upper() part it is normally called with:
(I'm pasting it here because TT will mess with the accented characters if I put it in a code block)

FUNCTION ANGLISIZE
PARAMETERS m.STRING,m.CAPITALISE
PRIVATE m.STRING,m.CAPITALISE
IF PCOUNT() < 2
m.CAPITALISE = .F.
ENDIF
IF m.CAPITALISE
m.STRING = UPPER(m.STRING)
ENDIF
m.STRING = CHRTRAN(m.STRING,"ÀÁÂÃÄÅÆ","AAAAAAA")
m.STRING = CHRTRAN(m.STRING,"Ç","C")
m.STRING = CHRTRAN(m.STRING,"ÈÉÊË","EEEE")
m.STRING = CHRTRAN(m.STRING,"ÌÍÎÏÐÑ","IIIIII")
m.STRING = CHRTRAN(m.STRING,"ÒÓÔÕÖ×Ø","OOOOOOO")
m.STRING = CHRTRAN(m.STRING,"ÙÚÛÜÝÞß","UUUUUUU")
IF !m.CAPITALISE
m.STRING = CHRTRAN(m.STRING,"àáâãäåæ","aaaaaaa")
m.STRING = CHRTRAN(m.STRING,"ç","c")
m.STRING = CHRTRAN(m.STRING,"èéêë","eeee")
m.STRING = CHRTRAN(m.STRING,"ìíîïðñ","iiiiii")
m.STRING = CHRTRAN(m.STRING,"òóôõö÷ø","ooooooo")
m.STRING = CHRTRAN(m.STRING,"ùúûüýþÿ","uuuuuuu")
ENDIF
RETURN(m.STRING)

Code:
SELECT FULLNAME FROM NAMES WHERE  ANGLISIZE("Aäron",.t.)$ANGLISIZE(Names.Fullname,.t.) order by ANGLISIZE(Names.Fullname,.t.)


Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Griff, but in case you don't capitalize still already upper case letters need to be processed.

You could just define CHRTRANS which in one case keep the case, in the other case turn all letters uppercase no matter with which case they start.

Besides that, I don't know why you're using a $ operation in your query, Koen, both sides have equally length so just use =, $ operation will not be optimized with any index.

Bye, Olaf.

Olaf Doschke Software Engineering
 
The OP didn't start using the $ operator, I did, just like I started by using locate - for simplicity and clarity.

I appreciate it's not the most efficient way of doing it - but I posted the code for that a bit back in FINDSOMETHING().



Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Still Koen used it, and then also mentioned Mike. I'm not about blaming anyone, just asking for pointing out that there is a better way without just throwing overboard a good reason to use $ he might have. See? Not blaming him or you of making a mistake.

You'd only use $ if there is some "wiggle room", though. And in case of seearchterm vs fullname there surely can be, when you only type in part of the lastname, you could look for it as "beginning with", if names are all stored in the form Lastname, firstname. or you look for it with $ as names could also be Firstname Lastname and you want to find people whos lastname starts with "Last", for example.

So indeed ANGLISIZE("Aäron",.t.)$ANGLISIZE(Names.Fullname,.t.) makes sense, when there are values not starting with Aäron or Aaron in the anglisized version. But an index then feels a bit like a waste.

Bye, Olaf.

Olaf Doschke Software Engineering
 
true

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Fine,

but that has one last rechnical consequene, Griff. If that's the case and Koens search operations aka queries need $ and never = for sake of finding search terms anywhere in the name, then an index on ANGLISIZE(cName) won't be used, is useless and computing ANGLISIZE(cName) in each query is costly. Then there actually should be an extra column populated by ANGLISIZE(cName) in which you do the $ search. This is still cistly in terms of the $ operator, but at least you only remove diacritics once and not with every query.

Bye, Olaf.

Olaf Doschke Software Engineering
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top