Bob,
The original code was based on the original data set. In that set most of the data was relatively clean and the information was seperated by commas. So it may look something like
Subdivision information, block information, lot information, etc.
The strategy was to split these blocks and read them. if they contained block information then stick that in the block field, lot info stick in the lot field, etc.
This strategy is still good, but only part of a bigger strategy. Your data is about as raw as it gets. It looks like it was scanned with lots of error. So if this was me I would apply lots of different approaches to narrow the data down and continously clean it. There is no way to do this all at once. But lots of update queries and continue refinement may take what would be 100s of hours into a few hours.
To make this easier to work with I would create a new table that links to your primary table. This can be done with a make table query.
1) I would make a new table with
tblCleanProperties
saleNumber (foreign key)
descriptionOld (property description field)
Then I would add these fields
descriptionClean (clean version of the description)
blnLot (These are all yes no boolean fields)
blnBlock
blnSubdivision
blnCSM
blnTownShip
blnParcel
blnCondo
blnSection
blnAttachment
strLot (these are all the text fields containing that word)
strBlock
strSubdivision
strCondo
strAttachment
Sub-CK
valLot
valBlock
This way I can do lots of work on this table without messing with the main table. I then can then relate back to the main table by the primary key (which appears to be saleNumber)
1) Do lots of update queries to clean up the old description and put in descriptionClean. This is preprocessing. Doing it this way I will eventually work with the clean data, but have the old data available. Here are some examples.
Update the attachement field
Code:
UPDATE qrySubDivision SET qrySubDivision.descriptionNew = "SEE ATTACHED", qrySubDivision.blnAttachment = True
WHERE (((qrySubDivision.descriptionOld) Like "*Atta*"));
This puts "SEE ATTACHED" in the new description field and puts a check mark in the yes no "blnAttachment" field
Now it will be a lot easier later to look for attachments
Code:
descriptionOld descriptionNew
LIGGAL DESCRIPTION ATTACHED SEE ATTACHED
SBB ATTACHBD SEE ATTACHED
SDEE LEGAL ATTALHED SEE ATTACHED
SEB ATTACHBD SEE ATTACHED
So now go back and do some looking at the data and see what you missed. I still see some that have "Addendum", and I assume logically get treated the same way
So
descriptionOld
SEE ADDENDUM A
SEE ADDEND W A
SEE ADDENDUM
ADDENDLLWL
Code:
UPDATE qrySubDivision SET qrySubDivision.descriptionNew = "SEE ATTACHED", qrySubDivision.blnAttachment = True
WHERE (((qrySubDivision.descriptionOld) Like "*ADDEN*"));
So look again trying "*See A"
and still
Code:
descriptionOld
SEE ATACHED
SEE ATTHCHE17
SEE ATTR4CHED
SEE AMACFGD
SEE AMACHM M
SEE ARTACL#D
SEE ATJ77CHED
SEE ATT14CHED
SEE AIJACHED
SEE ATTHCH6D
SEE AITACHED LEGAL DESCRIMICJN
SEE ATIIACHED
SEE A MACIMMVR
run another update
Get the idea? Search, Update.
Eventually I will have much cleaner description field to work with. Plus a boolean field to quickly query. But this may require a couple of hours, but it will be worth it in the long run.
Some may not be able to get clean so then just dump the old description field into the cleanDescription field (where the clean description field is Null)
Then Start identifying if it contains block, lot, subdivision etc.
Now you will have a boolean field that can be used to determine if you need to worry about it.
Code:
UPDATE tblCleanProperties SET tblCleanProperties.blnSubDivision = True
WHERE (((tblCleanProperties.descriptionOld) Like "*SUBDI*"));
Code:
UPDATE tblCleanProperties SET tblCleanProperties.blnBlock = True
WHERE (((tblCleanProperties.descriptionOld) Like "* block *" Or (tblCleanProperties.descriptionOld) Like "* blk *"));
after doing all that you can run the code. Has to be modified a little now, but basic idea. Now however if there is a yes in subdivision and an NA in the text field you can query and see the problem.
Once done you can use the boolean fields to do update queries on the sub division field. (if blnCSM = true then strSubdivision becomes N/A)
Then you can do a series of update queries to get
strBlock: "Block (14)"
to
valBlock: "14"
You will see that will take a lot of work as well.