Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

STRING MANIPULATION (PATRIAL FIELD EXTRACT) 2

Status
Not open for further replies.

cristi22

MIS
Aug 17, 2005
155
US
Hello all -

Have to separate the state & the city from the address_3 field:
My code does not work in case if there is no coma in the field:

NEW YORK becomes NEW YO = city & RK = state or

Here is my code:
Code:
SELECT   address_3,                                         
  SUBSTR(ADDRESS_3,1,CASE                          
                    WHEN POSSTR(ADDRESS_3,',') = 0   
                      THEN LENGTH(RTRIM(ADDRESS_3))-2  
                      ELSE POSSTR(ADDRESS_3,',')-1    
                    END ) as city                         
 ,RIGHT(RTRIM(ADDRESS_3),2) as state
from member

thanks for your help!
cristi :)
 
Is New York a city or a state? If there isn't a comma, would there be a space?
 
thank you for your time!

the value of the address_3 is "NEW YORK"
We don't have a clean data, address_3 could be

"NEW YORK"
"NEW YORK, NY"
"NEW YORK NY
 
Cristi,

This may be able to be done via SQL but I'm not sure, as there doesn't appear to be any easy way to identify when no state has been entered. It may well be easier to perform this process in a programming language rather than in SQL.

But I'd be interested if anybody can see an easy way to do it!

Marc
 
cristi

If the data is that un-clean, I don't see how it is possible for the parsing to be fully automated in any normal language.

Your example showed three possible formats, but obviously there are many other possibilities (e.g. NY NY, NY.NY, N Y , N Y are only a few of them.

Perhaps you will have to understand the data, and determine the normal format used. In my experience the vast majority of such addresses are entered in a standard way.

Hopefully, the second of your examples above is the normal situation. If so, you could 'scan backwards' to the comma, and isolate/split off the state. The state and town could then be compared with lists of valid values.

The valid values lists could allow for mis-spellings and auto-correction by including all normal variations as a search value

Perhaps the non-conforming addresses will need to be extracted and sight-checked/fixed.

If the volumes are high this could be a daunting task.

Good luck

brian

 
thanks for your help & time! I'll try to clean up the data...
How can I select all bad states with no " , " ????


cristi
 
Code:
select *
from member
where address_3 not like '%,%'

When a comma is present, is it always between the state and city?
 
You also could create a table of valid state abbreviations.

Code:
select *
from member
where right(address_3,2) not in (
  select state_abbr from valid_states)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top