Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Parsing data from an access database

Status
Not open for further replies.

pellet

Programmer
Oct 23, 2008
55
US
Hello everyone,

First off, I would like to thank all of you for your contributions to this site. Searching thru the posts here has saved my sanity on multiple occasions. :)

I am just getting started in VB.NET and I am using Visual Studio 2008. What I am trying to do is open an Access Database and go to a specific table. In that table, I go to a specific row and search for the word "NAME:". Everything I posted above I can get to work. Here's where I am having issues. The records I am trying to parse out of the database are, of course, names. The problem is, this is what the database looks like:

NAME:SMITH JOHN A DOB:01-01-1980 PRN:32275664892009 ADD:123 ANYWHERE ST

What I am trying to do is parse out the last name, first name, middle initial, date of birth (DOB), property resource number (PRN) and the address (ADD). I can get this do happen but I am doing it by finding the word NAME: and taking everything after the : and then looking for the first space. After I find the first space, I grab the next character and look for the next space and so on. This works fairly well until I get something like this:

NAME:LA LOW JENNIFER P

What happens then is I am getting LA for the last name, LOW for the first name, and JENNIFER for the middle...

Also...

Sometimes (remember this is an old database that I am trying to pull the data out of), sometimes it has something like this:

NAME:SMITH JOHN A DOB:01-01-1980 PRN:1234567890 OR SMITH JANE A DOB:04-05-1980 PRN:0987654321

I search for the word "OR" (which is also sometimes "AND" - it's hit and miss, so I need to figure out how to search for "OR" or "AND") and if there are two people registered to the property, all is good. If not, the program goes to th next line until it finds the word "OR" and grabs that info, so it is giving me the second person who owns the land but from a different land parcel. So I need to make it stop searching for "OR" (or "AND") within the same line of data.

Ok - after I confused you all now (and myself), here's what I have for code so far...

*******************************************************
strNAME1 = (ds.Tables("Property").Rows(inc).Item(0).ToString)

intFindNAME = Microsoft.VisualBasic.InStr(strRO1, "NAME:")
If intFindNAME <> 0 Then
intFindLast1 = Microsoft.VisualBasic.InStr(intFindNAME, strNAME1, " ")
If intFindLast1 <> 0 Then
strLastName1 = Microsoft.VisualBasic.Mid(strNAME1, intFindNAME + 4, intFindLast1 - (intFindNAME + 5))
intFindFirst1 = Microsoft.VisualBasic.InStr(intFindLast1 + 1, strRO1, " ")
If intFindFirst1 <> 0 Then
strFirstName1 = Microsoft.VisualBasic.Mid(strNAME1, intFindLast1 + 1, intFindFirst1 - intFindLast1)
intFindMid1 = Microsoft.VisualBasic.InStr(intFindFirst1 + 1, strNAME1, " ")
If intFindMid1 <> 0 Then
strMiddleInitial1 = Microsoft.VisualBasic.Mid(strNAME1, intFindFirst1 + 1, intFindMid1 - intFindFirst1)
intFindDOB1 = Microsoft.VisualBasic.InStr(intFindMid1 + 1, strNAME1, " ")
If intFindDOB1 <> 0 Then
strDOB1 = Microsoft.VisualBasic.Mid(strNAME1, intFindMid1 + 1, intFindDOB1 - intFindMid1)
strDOB1 = Microsoft.VisualBasic.Left(strDOB1, +14)
strDOB1 = Microsoft.VisualBasic.Right(strDOB1, +10)
********************************************************

That's just to find the last name, first name, and DOB of the first person. I am sure I am not going about the best way to parse out this info. Could someone give me some pointers as to how to do this better and how to solve some of the issues I mentioned above? I am a complete newbie to VB.NET so any help is appreciated.

Thank you to everyone for reading this and for your time.

Pellet
 
I think I would be tempted to massage the data a bit before trying to parse it.

For example, this:
NAME:SMITH JOHN A DOB:01-01-1980 PRN:1234567890 ADD:123 ANYWHERE ST OR SMITH JANE A DOB:04-05-1980 PRN:0987654321

Replace " OR " with ":MORE:NAME:"
Replace " AND " with ":MORE:NAME:"
Replace "NAME:" with ":NAME="
Replace "DOB:" with ":DOB="
Replace "PRN:" with ":pRN="
Replace "ADD:" with ":ADD="

Your string becomes:

[tt]:NAME=SMITH JOHN A :DOB=01-01-1980 :pRN=1234567890 :ADD=123 ANYWHERE ST:MORE::NAME=SMITH JANE A :DOB=04-05-1980 :pRN=0987654321[/tt]

Why would I do this? Because I can then split the string in to an array.

I would first split on ":MORE:"
Then I would split on ":"

In this case, you would end up with 2 arrays, where the first array's first element would be:
[tt]:NAME=SMITH JOHN A :DOB=01-01-1980 :pRN=1234567890 :ADD=123 ANYWHERE ST[/tt]

And the 2nd element would be:
[tt]:NAME=SMITH JANE A :DOB=04-05-1980 :pRN=0987654321[/tt]

Can you see how the data is a little more consistent now?

After splitting each of the array elements in to an array, I would end up with this:

[tt]
0 0
0 1 NAME=SMITH JOHN A
0 2 DOB=01-01-1980
0 3 PRN=1234567890
0 4 ADD=123 ANYWHERE ST
1 0
1 1 NAME=SMITH JANE A
1 2 DOB=04-05-1980
1 3 PRN=0987654321
[/tt]

The first number would represent the index of the outside array. The second number is the array element of the second index. The numbers are not important. I only included them here so you can see the logic of my advice.

Can you see how parsing each item in the array would be a lot easier now?

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Hi George,

Yes, your logic does make a lot of sense and seems to make parsing a lot easier. Would you parse the data in the actual Access Database and then have the VB.NET program pull out the values or is there a way to have the program actually do the parsing without having to program anything into the Access database itself?

If it can be done in the program, I would like to try it that way but to be honest I don't know how I would do it exactly?

Thanks for the advice - I really appreciate it

Pellet
 
well... if it were me, I would try to normalize the data in the access database. I know you said it was an old database, but if you can, I would encourage you to normalize all the data once and then you would never have to worry about it again. (if it were me)

If you cannot change the structure of the database, then I would probably do it all in the front end. Truth is, I'm not so good at vb.net. Been lurking in this forum just to pick up some pointers.

The VB6 code I wrote to test this is:

Code:
Dim Sample As String
Dim arTemp() As String
Dim arPeople() As String
Dim i As Long
Dim j As Long

Sample = "NAME:SMITH JOHN A DOB:01-01-1980 PRN:1234567890 ADD:123 ANYWHERE ST OR SMITH JANE A DOB:04-05-1980 PRN:0987654321"

Sample = Replace(Sample, " OR ", ":MORE:NAME:")
Sample = Replace(Sample, " AND ", ":MORE:NAME:")
Sample = Replace(Sample, "NAME:", ":NAME=")
Sample = Replace(Sample, "DOB:", ":DOB=")
Sample = Replace(Sample, "PRN:", ":PRN=")
Sample = Replace(Sample, "ADD:", ":ADD=")

arPeople = Split(Sample, ":MORE:")
For i = LBound(arPeople) To UBound(arPeople)
    arTemp = Split(arPeople(i), ":")
    For j = LBound(arTemp) To UBound(arTemp)
        If VBA.Left$(arTemp(j), 5) = "NAME=" Then
            Debug.Print i, "Name", Split(arTemp(j), "NAME=")(1)
        ElseIf VBA.Left$(arTemp(j), 4) = "DOB=" Then
            Debug.Print i, "DOB", Split(arTemp(j), "DOB=")(1)
        ElseIf VBA.Left$(arTemp(j), 4) = "PRN=" Then
            Debug.Print i, "PRN", Split(arTemp(j), "PRN=")(1)
        ElseIf VBA.Left$(arTemp(j), 4) = "ADD=" Then
            Debug.Print i, "ADD", Split(arTemp(j), "ADD=")(1)
        End If
    Next
Next

Converting this to vb.net shouldn't be too difficult.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks again George for the info.

Any idea as to how I can keep my search limited to only one line of data instead of my program going out and searching for the next "OR" on any given line?
 
And do you (or anyone else) know of a better way for me to do a better search besides looking for spaces?
 
You're talking about parsing the name, right? The best I can say is, "Good Luck". Names very problematic. Some times you may encounter suffixes like Jr, Sr, III. Some people don't have a middle name at all. Some people have spaces in their first name and some have spaces in the last name. I wouldn't be surprised if there were some people that have spaces all over the place. This is precisely why you should normalize the database so that there is a separate column for first name, middle name, and last name. Write an algorithm (or query) to set the data for the "normal" situations, and then let a human eyeball the exceptions.

Remember, the more code you throw at this particular algorithm, the better your results are likely to be. But... no matter how good it is, there are likely to be some circumstances that fall through.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top