
Validating names on vendor data

Status
Not open for further replies.

DiamondDave2005

Programmer
Feb 12, 2002
9
US
Hi all,
We receive lists of names from outside vendors, and while we have software to check addresses, we have nothing to check actual names.
These names are not in a database - this is just a list of names, usually around 10,000,000 at a time.
I'm trying to develop an algorithm to do some basic checks, but apart from the obvious ones, such as checking for illegal characters, I'm not coming up with anything yet.
Anyone have any ideas?
Thanks,
Dave
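
For the "obvious" checks mentioned above, a minimal Python sketch might look like the following. It streams the list one line at a time, so a file of around 10,000,000 names never has to fit in memory. The function name `find_suspect_names`, the one-name-per-line layout, and the `max_len` limit of 60 are all assumptions for illustration, not anything the vendor files are known to follow.

```python
def find_suspect_names(lines, max_len=60):
    """Yield (line_no, name, reason) for names that fail basic checks.

    `lines` is any iterable of strings, e.g. an open text file.
    `max_len` is an arbitrary, assumed limit; adjust it to your data.
    """
    for line_no, line in enumerate(lines, start=1):
        name = line.rstrip("\r\n")
        if not name.strip():
            yield line_no, name, "empty"
        elif len(name) > max_len:
            yield line_no, name, "too long"
        elif any(ch.isdigit() for ch in name):
            yield line_no, name, "contains digit"
        elif any(not ch.isprintable() for ch in name):
            # Tabs and other control characters land here.
            yield line_no, name, "non-printable character"


sample = ["Smith, John\n", "\n", "Sm1th, Jane\n", "Jones,\tBob\n"]
suspects = list(find_suspect_names(sample))
```

Checks like these only flag records that are probably damaged; they say nothing about whether a clean-looking name is spelled correctly.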
 
I don't think there is any way to check names. Names can be spelled any way the owner wants them to be. They don't have to be pronounceable. In some jurisdictions, they don't even have to consist of letters. Of course, this doesn't mean that the vendor from whom you got the lists has the name spelled right. I (like almost all of us, I'm sure) have gotten junk mail with my name misspelled in some weird ways.
 
Dave -

Webrabbit is right. Name checking in general is a very difficult thing to do, especially if you have a large population. I work in a hospital setting and we DO force names to meet some specific criteria so that (a) we can find the patient's records relatively easily and (b) Medicare and insurance companies, which enforce certain rules on names, will accept our electronic claims.

To do that, we wrote a subroutine that forces names to meet a certain pattern (i.e. LAST, FIRST MI). We don't allow titles (Rev, Mr, etc.); we don't allow suffixes (II, JR, etc.), since we have date of birth to separate Jr from Sr. We eliminate all non-alphabetic characters except the dash (hyphen), so O'Brien becomes OBRIEN. We do some special things with babies because they often don't have names for several days (e.g. SMITH, BABY BOY(GLORIA)). We don't allow single-character last names. We do have some routines that audit names to see if they match the sex, but this is simply a warning, as many names that you'd think belong to one sex are given to the opposite sex.
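
A rough Python sketch of rules like these is shown below. It is not the hospital's actual subroutine: the function name `normalize_name`, the title and suffix lists, and the assumption that input already arrives as "last, first middle" are all made up for illustration. It is ASCII-only and leaves out the newborn exception and the gender audit.

```python
import re

# Assumed, incomplete lists; a real system would need much longer ones.
TITLES = {"MR", "MRS", "MS", "DR", "REV"}
SUFFIXES = {"JR", "SR", "II", "III", "IV"}


def _clean(text):
    """Uppercase and keep only A-Z, hyphens, and spaces."""
    return re.sub(r"[^A-Z\- ]", "", text.upper())


def normalize_name(raw):
    """Return 'LAST, FIRST MI' or raise ValueError if the name can't be forced."""
    if "," not in raw:
        raise ValueError("expected 'last, first middle'")
    last_part, rest = raw.split(",", 1)
    last = " ".join(t for t in _clean(last_part).split() if t not in SUFFIXES)
    given = [t for t in _clean(rest).split()
             if t not in TITLES and t not in SUFFIXES]
    if len(last) < 2:
        raise ValueError("single-character or empty last name")
    if not given:
        raise ValueError("missing first name")
    middle_initial = given[1][0] if len(given) > 1 else ""
    return f"{last}, {given[0]} {middle_initial}".strip()


print(normalize_name("O'brien, Mary Ann"))       # OBRIEN, MARY A
print(normalize_name("Smith Jr., Mr. John Q."))  # SMITH, JOHN Q
```

Note that a blanket title filter like this would also drop a legitimate given name that happens to match a title, which is one reason the real rules have to come from the business side.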

You'll have to be guided by your own industry standards and business needs on this one.

Regards.

Glenn
 
Yeah, that's pretty much what I thought. I don't think it's worth playing with, as we already have Group 1 software to determine titles, etc., and we do reformat the names as we go through the processing.
The gender thing is always fun. You don't find any guys named Beverly in the US (at least, I haven't), but you do in the UK.
Dave
 
Additional thoughts on last names:

I'm sure you know about Mc, Mac, and O', but what about D' (from parts of Italy)?

Last names do not always have initial caps: terHorst (from Holland, I think)

Nor do they always have vowels: Ng (from Vietnam)

Nor are they always one word, even hyphenated: Ten Boom (from Holland)
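
To make the point concrete, here is a small Python sketch comparing a naive "capitalized word with a vowel" rule against a looser pattern. Both patterns and the helper names `naive_ok` and `loose_ok` are assumptions for illustration, and McDonald, O'Brien, and D'Angelo are stand-in examples for the Mc, O', and D' prefixes mentioned above.

```python
import re

# Naive rule: one capitalized word, lowercase after the first letter,
# containing at least one vowel.
NAIVE = re.compile(r"[A-Z][a-z]+")

# Looser rule: runs of letters in any case, joined by a single
# apostrophe, hyphen, or space.
LOOSE = re.compile(r"[A-Za-z]+(?:[ '\-][A-Za-z]+)*")


def naive_ok(name):
    has_vowel = re.search(r"[aeiouy]", name.lower()) is not None
    return NAIVE.fullmatch(name) is not None and has_vowel


def loose_ok(name):
    return LOOSE.fullmatch(name) is not None


EXAMPLES = ["McDonald", "O'Brien", "D'Angelo", "terHorst", "Ng", "Ten Boom"]

for name in EXAMPLES:
    print(f"{name!r}: naive={naive_ok(name)} loose={loose_ok(name)}")
```

The naive rule rejects every one of these real-world shapes, while the looser one accepts them all and still rejects something like "Sm1th". The looser pattern is still ASCII-only, so accented names would need a broader character class.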
 