SOUNDEX for non-English names?

Status
Not open for further replies.

PaulinaK

IS-IT--Management
Jan 8, 2004
DE
Can anybody please tell me how standard Oracle 9i SOUNDEX will work with non-English names (say, Spanish)?

Also, how does standard Oracle 9i SOUNDEX handle names with hyphens?

Note: Please don't tell me about new/improved/custom-coded matching algorithms. I know one can code anything, given the time and desire ;-) At this point I am trying to understand what I can get from Oracle "straight-up".

Thanks to all in advance
 
Do you have any specific names in mind?
Have you tried using SOUNDEX?
This seems like it would be a relatively simple experiment to set up.
 
Paulina,

The SOUNDEX function is really misnamed. It gives people the impression that names that sound alike will match using SOUNDEX. A more accurate function name would be "CONSONEX", since the function roughly keeps the first letter of a string and then encodes the next three consonants as digits (regardless of language, sound, or pronunciation) to produce a four-character SOUNDEX value.

If sound were the key, then even in English, homonyms (words that sound alike) would produce identical SOUNDEX values. But sound is not the key. Let's look at some quick examples:
Code:
SQL> select soundex('know') know, soundex('no') no from dual;

KNOW NO
---- ----
K500 N000

SQL> select soundex('wright') wright, soundex('rite') rite from dual;

WRIG RITE
---- ----
W623 R300

SQL> select soundex('Juan') Juan, soundex('Hwan') Hwan from dual;

JUAN HWAN
---- ----
J500 H500

SQL> select soundex('Jose') Jose, soundex('Hosay') Hosay from dual;

JOSE HOSA
---- ----
J200 H200
So you see, SOUNDEX values for English homonyms may match no better than those for foreign names spelled phonetically; it all depends upon the string. I believe that Carp's suggestion is the best: play around with the SOUNDEX function and see the results. Oracle does not have different SOUNDEX functions based upon NLS_LANGUAGE or NLS_TERRITORY.
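
For anyone who wants to see why the values above come out the way they do, here is a rough Python sketch of the classic Soundex rules (keep the first letter, map the remaining consonants to digits, collapse neighbours with the same digit, pad to four characters). It is only an illustration, not Oracle's actual code. The function name soundex_sketch is made up, and two behaviours are plain assumptions of the sketch: it drops anything outside A-Z (hyphens, spaces, accented letters), and it returns an empty string when no letters are left. Whether Oracle treats hyphens and accented characters the same way is best checked with a quick query.

```python
# Digit groups used by the classic Soundex scheme.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex_sketch(name):
    """Illustrative Soundex; NOT Oracle's implementation."""
    # Assumption of this sketch: ignore everything that is not A-Z.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""  # assumption: no letters means no code
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same digit
        code = CODES.get(ch, "")  # vowels and Y have no digit
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated digit is kept
    return (result + "000")[:4]  # pad with zeros, keep four characters


for word in ("know", "no", "wright", "rite", "Juan", "Hwan", "Jose", "Hosay"):
    print(word, soundex_sketch(word))
```

The loop at the end prints the same values as the SQL*Plus session above (K500, N000, W623, R300, J500, H500, J200, H200), which shows that the first letter is kept as spelled and only the following consonants are encoded.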

Mufasa
(aka Dave of Sandy, Utah, USA @ 20:49 (16Jan04) GMT, 13:49 (16Jan04) Mountain Time)

 