Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

instr() - ST. vs. Street, etc. Need some ideas to compare them. 3

Status
Not open for further replies.

CptCrusty1

Programmer
Feb 17, 2005
216
0
0
US
I have two address fields "BNA" (means Borrower New Address) and MADDR1 (mailing Addr). The BNA is returned by my vendor and I need to compare it to the MADDR1 field. If the address is different, they've given me good info and can bill me. If not, no charge. My problem is I can only get so far with basic IF's, etc. i.e.

IF BNA <> MADDR1, then bill me.
IF BNA = MADDR1, no charge.

However, if I send them:
101 Smith Street, and they return
101 Smith St., those arguments are not going to pick that up. Any ideas on how to eliminate that as a problem? I have to search through about 200000 records... so I'm not doing it by hand.

I've considered using instr, or StrComp, but I'm just not sure how I can implement it.

Thanks
Crusty


I live to work and I work to live.
 
best way is to use replace to remove all those dodgy entries...

e.g. replace(addStr, "Street", "st.") and then just compare on "st."

or better yet, replace both "street" and "st" with "" and compaire directly.

however, there is no real way to cover all the possible garbage that users put in free text fields, e.g. "Street.", "St.", "st", "str" or other variations...

--------------------
Procrastinate Now!
 
I was doing that just now.. just wasn't sure how many iterations I was going ot have to make....

I live to work and I work to live.
 
iterations?

you mean you're looping though these records?

why not put this directly into a query, and do all the comparisons there?

--------------------
Procrastinate Now!
 
i think itterations refers to the number of variations on the number of different 'keywords' to compare / change ... ave, rd, blvdm 21044 (vs 21044-3907), apt, ste .., ad naseum ...

An approach MIGHT be to use ye olde soundex or a variation there of. Originally developed to conpare phonetic variations in proper names, it should work reasonably well applied to the individual words of an address entry. there are some examples of the function in htese (Tek-Tips) fora, use search / advanced search to locate some. Implemetation is, of course, a bit more involved than you will see in any of the threads here,



MichaelRed


 
A idea I have not thought through ...

Perhaps you could do a match rating? First split each address by word, the count the number of matches against the number of words? Hopefully this would give a scale from 0/count, almost certainly a new address, to count/count almost certainly an existing address.
 
Counting matches would identify/eliminate the exact dups, it is the partial ones which are the crux of this issue. soundex is specifically intended to find the partial matches.




MichaelRed


 
All,

Thanks for your input. You have some great ideas. MichaelRed has the right idea, i.e. St, Rd, Ct., Ave, bla bla bla. I've put several "iterations" per Crowley16's suggestion (and my simulatious idea) and it appears to work "OK" for now. Soudex is something worth checking out for increasing the granularity of the searching. I will search that soon.

Thanks again.
Crusty.

I live to work and I work to live.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top