Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Westi on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Data Mining Text to Access w VBA?

Status
Not open for further replies.

lakehouse

Technical User
May 17, 2004
45
US
I have a patterned text report of numerous voicemail distribution lists that I want to move to Access. My goal is to 1) look up an individual's mailbox and identify all distribution lists it could receive messages from and 2) list all individual mailboxes a distribution list will send to.

The text report is a mix of page headings, list number and name, then sections of a columnar report with "Nested System Lists" and "Mailboxes". A single system list can contain either Nested or Mailboxes, or both. The Nested and Mailbox columns vary row length, and at times rows will extend to another page of the report where the column headings repeat; finally ending with "Total Entries on list 5551212: ..." before another list number begins.

I know I could extract this data with Procomm Aspect, but since I want to further import and manipulate the captured data with Access, will VBA be able to mine the data looking for keywords and selecting the data out of the text file?

Is there a sample of VBA code which would give me a start to how I can convert this text report?
 
lakehouse,
Yes. It sounds like your routine will need to capture and normalize the data to undo the formating (unlike a delimited file). Can you post a section of the file? (Please dummy up the data a little to protect the innocent.)

CMP

[small]For the best results do what I'm thinking, not what I'm saying.[/small]
(GMT-07:00) Mountain Time (US & Canada)
 
CautionMP- Thanks for replying...

Here is a sample:
________________________________________________________

Date: JUN 05 2006 Time: 09:42A

REPORTS
SYSTEM LIST MEMBERS



LIST NUMBER: 5551212
LIST NAME: Breaking News


NESTED SYSTEM LIST MEMBERS

LIST NUMBER LIST NAME DEPT. COMM. #
----------------------------------------------------------------
5550001 New York NEWYORK
5550002 Atlanta Atlanta
5550003 Chicago Chicago
5550004 Dallas/Ft Worth DFW
5550005 New Orleans NewOrl
5550006 Los Angeles LosAngl
5550007 Seattle Seattle



MAILBOXES AND NETWORK NODE MEMBERS

NODE MB NUMBER NAME DEPT. COMM. #
-------------------------------------------------------------------
4 2224163 Johnson, Paul NewYork
4 2224101 Curtis, Helene NewYork
12 6205 Hilton, Paris France
16 4104 Adams, Sam SLC
6488 Disney, Walt Cartoon
6490 Mouse, Micky Cartoon
6501 Jones, Indiana Explore
6520 Perkins, Marlin Animals


TOTAL ENTRIES ON LIST 5551212: 15


< next page repeats with this header >

Date: JUN 05 2006 Time: 09:42A
_______________________________________________________
(Due to Tek-Tips proportional font, the columns are not lining up with the headers first left-hand character.)

As I stated, a list can have Nested Lists, Member Lists, or both. They will always appear in the order shown. If a list is longer than a page, the Nested or Member list header will reprint on the next page, depending on which list was interrupted by the page break.
 
lakehouse,
Keep in mind that I'm probably forgetting more than I remember to tell you.

It appears that you have two different files that are both linked to the LIST NUMBER and LIST NAME so what I did was grab both of these and add them to the actual data for each file format. Since I don't know your table structure or how the data should be normalized, I wrote the data to two seperate output files (CSV) that you can then Link to or Import from in Access to further process the data. Here is the format of the two files:

NESTED SYSTEM LIST MEMBERS (C:\NestedSystemListMemebers.txt)
[tt]"LIST NUMBER","LIST NAME","LIST NUMBER","LIST NAME","DEPT.","COMM. #"[/tt]

MAILBOXES AND NETWORK NODE MEMBERS (C:\MailboxAndNetworkNodeMembers.txt)
[tt]"LIST NUMBER","LIST NAME","NODE","MB NUMBER","NAME","DEPT.","COMM. #"[/tt]

With a little modification you could alter the routine to dump the data directly into your Access database. Here's the routine (be sure to update the input file name to match the location and name of your actual file):
Code:
[b][green]'This global declaration ensures that the LIKE statments work correctly[/green]
Option Compare Text[/b]

[b]Sub TransformPhoneData()[/b]
'Change the following line to match the loacation of your
'actual input file
Const cFileInputName As String = "[i]C:\VoicemailReport.txt[/i]"

'These are the tags that will tell the routine where it is in
'the input file
Const cListNumberHeader As String = "LIST NUMBER:"
Const cListNameHeader As String = "LIST NAME:"
Const cNestedSystemListMemebersHeader As String = "NESTED SYSTEM LIST MEMBERS"
Const cMailboxAndNetworkNodeMembers As String = "MAILBOXES AND NETWORK NODE MEMBERS"

'These will be used to process the actual files
Dim blnWriteToFile As Boolean
Dim intFileInput As Integer
Dim intFileNestedSystemListMemebers As Integer
Dim intFileMailboxAndNetworkNodeMembers As Integer
Dim intFileToWrite As Integer
Dim intCharacterPosition As Integer
Dim intField As Integer
Dim strRecord() As String
Dim strLineInput As String
Dim strLinePrint As String
Dim strListNumber As String
Dim strListName As String

'Open the input file
intFileInput = FreeFile
Open [i]cFileInputName[/i] For Input As #intFileInput

'Open the output file that will hold Nested System List Members information
intFileNestedSystemListMemebers = FreeFile
Open "[i]C:\NestedSystemListMemebers.txt[/i]" For Output As #intFileNestedSystemListMemebers

'Open the output file that will hold the Mailbox and Network Node information
intFileMailboxAndNetworkNodeMembers = FreeFile
Open "[i]C:\MailboxAndNetworkNodeMembers.txt[/i]" For Output As #intFileMailboxAndNetworkNodeMembers

'Loop through each line in the Input File and use the following matrix to determine
'what to do with the line
Do
  Line Input #intFileInput, strLineInput
  
  'Check for header information
  If Trim(strLineInput) = 0 Then
    blnWriteToFile = False
  ElseIf strLineInput Like "*" & cListNumberHeader & "*" Then
    'This line is a header so grab and store the List Number
    intCharacterPosition = InStr(strLineInput, cListNumberHeader) + Len(cListNumberHeader) + 1
    strListNumber = Trim(Mid(strLineInput, intCharacterPosition))
  ElseIf strLineInput Like "*" & cListNameHeader & "*" Then
    'This is a header so grab and store the List Name
    intCharacterPosition = InStr(strLineInput, cListNameHeader) + Len(cListNameHeader) + 1
    strListName = Trim(Mid(strLineInput, intCharacterPosition))
  ElseIf strLineInput Like "*" & cNestedSystemListMemebersHeader & "*" Then
    'Set the output file pointer for use later
    intFileToWrite = intFileNestedSystemListMemebers
    blnWriteToFile = False
  ElseIf strLineInput Like "*" & cMailboxAndNetworkNodeMembers & "*" Then
    'Set the output file pointer for use later
    intFileToWrite = intFileMailboxAndNetworkNodeMembers
    blnWriteToFile = False
  ElseIf strLineInput Like "*TOTAL ENTRIES ON LIST*" Then
    'Then end of a section has been reached
    blnWriteToFile = False
  ElseIf strLineInput Like "*----------------------------------*" Then
    'The next line should be data
    blnWriteToFile = True
  ElseIf blnWriteToFile Then
  'No other test was passed so write the data to one of the output files
    Select Case intFileToWrite
      Case intFileNestedSystemListMemebers
        'This Splits the current line to determine the fields
        strLinePrint = Chr(34) & strListNumber & Chr(34)
        strLinePrint = strLinePrint & "," & Chr(34) & strListName & Chr(34)
        strRecord = Split(strLineInput, " ")
        For intField = 0 To UBound(strRecord)
          'Since there is an unkown number of spaces, check if there is data
          'in the array field before writting
          If strRecord(intField) <> "" Then
            strLinePrint = strLinePrint & "," & Chr(34) & strRecord(intField) & Chr(34)
          End If
        Next intField
        'Actually write the file
        Print #intFileNestedSystemListMemebers, strLinePrint
      Case intFileMailboxAndNetworkNodeMembers
        'This uses String functions to break up the current line into fields, you may need to adjust
        'these to match the actual file format
        strLinePrint = Chr(34) & strListNumber & Chr(34)
        strLinePrint = strLinePrint & "," & Chr(34) & strListName & Chr(34)
        strLinePrint = strLinePrint & "," & Chr(34) & Trim(Mid(strLineInput, 5, 6)) & Chr(34)
        strLinePrint = strLinePrint & "," & Chr(34) & Trim(Mid(strLineInput, 11, 11)) & Chr(34)
        strLinePrint = strLinePrint & "," & Chr(34) & Trim(Mid(strLineInput, 23, 20)) & Chr(34)
        strLinePrint = strLinePrint & "," & Chr(34) & Trim(Mid(strLineInput, 44, 9)) & Chr(34)
        strLinePrint = strLinePrint & "," & Chr(34) & Trim(Mid(strLineInput, 54)) & Chr(34)
        Print #intFileMailboxAndNetworkNodeMembers, strLinePrint
    End Select
  End If
Loop Until EOF(intFileInput)
'Close all three files
Reset
[b]End Sub[/b]

Hope this helps,
CMP

[small]For the best results do what I'm thinking, not what I'm saying.[/small]
(GMT-07:00) Mountain Time (US & Canada)
 
Lakehouse,

You likely to want to do similar in the future? If so, I would consider buying a product call Monarch. It takes formatted data and returns flat files.

Craig
 
CautionMP - Thanks for the script, it works well. I can edit and tweak a few formatting issues and it will be ready to go.
Craig0201 - I am familiar with Monarch, unfortunately our IT dept does not support it, so I cant put on my PC.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top