Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Westi on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Problem importing data (hard one)

Status
Not open for further replies.

Grumish

IS-IT--Management
Jun 23, 2006
2
US
OK I am importing data to access (too big for excel).
Here is a few template lines I am importing;

1234567890sample data Inc. 000 NW 100TH ST MIAMI FL331501465C055 550033A
1234567890MORE SAMPLE DATA INC 000 NW 100TH ST APT 111 MIAMI SHORES FL331501205C057 010028A

OK here is the problem.
When I used the text to columns function I run into the problem.
The first 10 characters is a phone number. With fixed width I can get that off of there very easy. BUT I can not get the appartment off with fixed width. Also I need to be able to pull the street number off.

If I used delimited {space} then I can not strip off the phone number or the street number.

To compound the issue the spacing is not uniform. So even if I were able to use delimited AND fixed width (which I dont think I can) i still could not get all this data to pull apart.

So I was thinking about writing something in VB but i would not have a clue as to which function to use for something like this.

I know of someone that wrote something for FoxPro that will do this but I know even less about FP than I do Access.

Any help here would be great.
Thanks,
Grumish.
 
Yeah, it sounds like the only way around this headache would be using VBA.

First, I'd just pick a method of importing the data, and go with it - even if it's using fixed widths and just putting it all in one field (hopefully you won't have to go that far).

Next, you'll want to use a few different string functions, as well as string array variables. And, you'll probably need at least a couple conditional statements, and a loop to go through the records. So, something like this:

Oh, and before running the code, I might would create a copy of the table as backup, so you don't have to re-import it, if something doesn't work correctly - will probably take running through a few times.

Code:
'This would go in a Module
'You may also need to set a reference to the Microsoft DAO 3.6 library,
'or whatever library number exists in your version of Access.

Private Sub FixMyData()
  Dim db as DAO.Database
  Dim rs as DAO.Recordset
  Dim fld as DAO.Field
  Dim intField As Integer 'To store the length of the whole field string
  Dim strNoTelephone As String 'to store what is left in field after
     'removing telephone number.
  Dim strTelephone As String
  Dim strMyArray() As String

  Set db = CurrentDb
  Set db = db.OpenRecordset("MyTable")

  Do While Not rs.EOF
    strTelephone = Left(rs.Fields("Field1"), 10)
    strNoTelephone = Right(Field1, Len(rs.Fields("Field1") - 10)
    rs.Edit
    rs.Fields("Field2") = strNoTelephone
    rs.Update
    rs.MoveNext
  Loop

  Set fld = Nothing
  rs.Close
  Set rs = Nothing
  db.Close
  Set db = Nothing
End Sub

Of course, you'll need some more code in there, and this would probably only clean up some of it. But, hopefully that would get you started. Each of the string functions in VBA do have decent help files in Access VBA, as well, to keep you from being totally lost. If you want, try with some of that, and post back. I'm sure myself or any of the others here could gladly help get the whole project complete.

And be forwarned: as it sounds your table is fairly large, so any recordset manipulations could take a while. It woulnd't be a bad idea to step through the code for a few records, and even put a messagebox at the end to let you know when it's finished, as well.

 
Grumish,
Building on the code/concept supplied by [navy]kjv1611[/navy], here is a routine that will open the data file and parse it into records that could be added to a table. For this to work you will need a table with eight fields, since I used ordinals instead of field names it won't matter what the fields are called.
Code:
[navy]Sub [/navy] TestParse()
[green]'strRecord() will hold the output in the following Format[/green]
[green]'0  Phone[/green]
[green]'1  Company[/green]
[green]'2  Street Address[/green]
[green]'3  City[/green]
[green]'4  State[/green]
[green]'5  ZipCode & Plus 4[/green]
[green]'6  Unknown 1[/green]
[green]'7  Unknown 2[/green]
[green]'it is essentially a record[/green]
[navy]Dim[/navy] strRecord(7) [navy]As String[/navy]

[green]'These are For file handling operations[/green]
[navy]Dim[/navy] intFile [navy]As Integer[/navy]
[navy]Dim[/navy] strFile [navy]As String[/navy]

[green]'These are For handling the actual text data[/green]
[navy]Dim[/navy] intChar [navy]As Integer[/navy], intField [navy]As Integer[/navy]
[navy]Dim[/navy] strCurrentLine [navy]As String[/navy]

[green]'Open the file that hAs the data[/green]
intFile = FreeFile
trFile = "C:\TestFile.txt"  [green]'*Change this To match your file/path[/green]
Open strFile [navy]For[/navy] Input [navy]As[/navy] #intFile

Do
  [green]'Get a line of text from the file[/green]
  Line Input #intFile, strCurrentLine

  [green]'The first 10 will always be the Phone number[/green]
  strRecord(0) = Left(strCurrentLine, 10)
  [green]'Increment the field counter[/green]
  intField = 1
  [green]'The phone number hAs been pulled, Loop through the[/green]
  [green]'rest of the characters[/green]
  [navy]For[/navy] intChar = 11 [navy]To[/navy] Len(strCurrentLine)
    [green]'Test For field delimiter, Single space is normal[/green]
    [green]'Double space could be a typo, so use three spaces[/green]
    [navy]If[/navy] Mid(strCurrentLine, intChar, 3) = "   " [navy]Then[/navy]
      [green]'Found three spaces so skip ahead three and[/green]
      [green]'increment the field counter[/green]
      intChar = intChar + 3
      intField = intField + 1
      [green]'This will zip past additional spaces[/green]
      [navy]Do[/navy]
        intChar = intChar + 1
      [navy]Loop[/navy] Until Mid(strCurrentLine, intChar, 1) <> " "
      [green]'Field four contains the State, Zip and Unkown1 so split[/green]
      [green]'inTo three fields and increment the field counter[/green]
      [navy]If[/navy] intField = 5 [navy]Then[/navy]
        strRecord(6) = Mid(strRecord(4), 12)
        strRecord(5) = Mid(strRecord(4), 3, 9)
        strRecord(4) = Left(strRecord(4), 2)
        intField = 7
      [navy]End If[/navy]
    [navy]End If[/navy]
    [green]'This actually captures the data[/green]
    strRecord(intField) = strRecord(intField) & Mid(strCurrentLine, intChar, 1)
  [navy]Next[/navy] intChar
  
  [green]'Parsing is complete, Do something with the data and clear the[/green]
  [green]'array[/green]
  [green]'**For recordset[/green]
  [green]'rs.AddNew[/green]
  [navy]For[/navy] intField = 0 [navy]To[/navy] 7
    [green]'For testing[/green]
    [green]'Debug.Print "strRecord(" & intField & ")", strRecord(intField)[/green]
    
    [green]'**For recordset[/green]
    [green]'rs.Fields(intField) = strRecord(intField)[/green]
    
    strRecord(intField) = ""
  [navy]Next[/navy] intField
  [green]'**For recordset[/green]
  [green]'rs.Update[/green]
[navy]Loop[/navy] Until EOF(intFile)

Clean_Up:
Close #intFile
[navy]End Sub [/navy]
Hope this helps,
CMP

(GMT-07:00) Mountain Time (US & Canada)
 
How are ya Grumish . . .

[blue]It would be a big help[/blue] if you could save your import data as a [blue]delimited[/blue] file first.

[blue]Is this possible?[/blue]

Calvin.gif
See Ya! . . . . . .
 
WOW Thanks guys. I remember just enough about VB to know what the code does when I see it...but had not idea where to start.

The problem with saving it as a delimited first is the data is not fixed. It should be...but it is not =) Thanks USPS...

If I import 100k records and try to do it delimited there will be entires (about 15%) that will skip around into other colums.

So I am going to build this code over the next week (or so) and see how it goes.

Thanks again guys.
 
Want a crude way?

Search and Replace all " " double spaces witho something distintive - the £ character, or something else not in the data.

Do the same again, to replace all all double ££ with a single one. Do this repeatedly until no more are found.

you should end up with

1234567890sample data Inc.£000 NW 100TH ST£MIAMI£FL331501465C055£550033A
1234567890MORE SAMPLE DATA INC£000 NW 100TH ST APT 111£MIAMI SHORES£FL331501205C057£010028A

Now you can replace the £ with a TAB and import as tab delimited.

After that, it should be a simple process to split the phone number either by number of character or the first letter appearing in the field.


 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top