How do I find similar entries in a table?

CVENOM · May 27, 2003

What is the easiest way to find rows with a similar value?
Examp: Tablename = customer columns = name, address1,city.

Sample data:
Bob's plumbing | 1234 Oak St | Kalamazoo
Bob's plumbing Inc | 1234 Oak St | Kalamazoo
Bob's plumbing | 1234 Oak St, Ste 123 | Kalamazoo
Chuckie's Toys | 1111 Coral Pl | Miami
Golf world | 9323 Redbud Cir | Limestone
Golfers World | 9323 Redbud Cir | Limestone

The data reflects obvious data entry inconsistencies, and that is where our problem is. Let's say I want to send a letter to Bob's Plumbing. Row 1 and row 2 above would be the same company based on the address, while row three is more than likely also the same but has a suite number where the other two don't. As it stands currently, I would send 3 letters to Bob's based on the three different entries in the DB, where 1 would (for this scenario) be sufficient.

I would like to be able to be able to identify rows 1,2 and 3 and rows 5 and 6 as possible duplicate rows. I'm sure there is either an easy solution that I am overlooking, or a very difficult process that I am avoiding

We've been doing select... order by... to find the similar entries, but this is time consuming, and impractical.

Any help/suggestions would be greatly appreciated.

Rich

tdatgod · May 27, 2003

Hi,
If you are using 4.1 or later you can use a function called

SOUNDEX

this is basically a PHONICS LIKE operator.

It was developed by the U.S. Census bureau to find differnt spelling of similar SURNAMES. Stuff like

Pederson and Pedersen

dnoeth · May 28, 2003

There's no easy solution with SQL.
SOUNDEX was defined for english names, not address lines.

You should look for address scrubing software from companies specialized in data cleansing.
It's expensive, but that will do it right.

Dieter

CVENOM · May 28, 2003

That's what I was afraid of. We are using Group1 for the addresses. I'll see if we can get soundex to work for our purposes.

Thanks for the responses,

Rich

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

How do I find similar entries in a table?

CVENOM

Programmer

tdatgod

Programmer

dnoeth

Instructor

CVENOM

Programmer

Similar threads

Part and Inventory Search

Sponsor