With the help of several posts on this site I have put together the code below.
The code extracts text from documents and places it along with the file name and file path into an Access table. The code works fine for Folders and Subfolders.
Currently the code pulls the path to the folders from a text field on a form (FolderList). This code processes thousands of documents very well. However it does take about 10 minutes for every 150 files.
I have found that there are many times when I need to process individual files or a small number of files that may not all be in the same folder or subfolder.
I would like to add another text field to the form for an individual file or files. What I need help with is altering the code to process individual files or a string of individual files.
Thank you for your attention to this matter.
accel45
On Error GoTo ErrHandler
Me.txtBegin = Now()
Set Fs = Application.FileSearch
Dim strFileSpec As String
Dim strFolders As String
strFolders = Me.FolderList
strFileSpec = "*.doc;*.txt;*.rtf"
With Fs
.NewSearch
.LookIn = strFolders 'Path
'.filename = "*.doc" 'Search through all word document
.FileName = strFileSpec
.SearchSubFolders = Me.txtSubFolder
.TextOrProperty = "*" 'Find Word doc(s) contatining this text
If .Execute > 0 Then
If MsgBox("There were " & .FoundFiles.Count & " " & "Files found in the " & .LookIn & " Folder" & vbCrLf & _
"Do you want to continue?", vbYesNo + vbQuestion, "Found Files") = vbYes Then
For i = 1 To .FoundFiles.Count 'enumerate through found files
strItems = .FoundFiles(i) 'put file name in Variable
Debug.Print strItems
'do something to the files
SysCmd acSysCmdSetStatus, "Loading Word documents"
Set wrd = New Word.Application
Set db = CurrentDb()
Set rst = db.OpenRecordset("tblHarvest")
strDocName = Mid$(strItems, InStrRev(strItems, "\") + 1)
strDocLocation = strItems
Set wdoc = wrd.Documents.Open(FileName:=strItems, _
ReadOnly:=True, AddToRecentFiles:=False, _
Format:=wdOpenFormatAuto, Visible:=False)
strDocText = wdoc.Content.FormattedText
rst.AddNew
rst.Fields("Document") = strDocName
rst.Fields("Description") = strDocText
rst.Fields("FileLocation") = strDocLocation
rst.Fields("EDate") = Date
rst.Update
wrd.Quit
Next
'wrd.Quit
Set wrd = Nothing
SysCmd acSysCmdClearStatus
End If
Else
MsgBox "No Files Found"
End If
End With
Me.txtEnd = Now()
MsgBox "Ready To Update List."
ExitHere:
Exit Sub
ErrHandler:
Debug.Print Err, Err.Description
Resume ExitHere
End Sub
The code extracts text from documents and places it along with the file name and file path into an Access table. The code works fine for Folders and Subfolders.
Currently the code pulls the path to the folders from a text field on a form (FolderList). This code processes thousands of documents very well. However it does take about 10 minutes for every 150 files.
I have found that there are many times when I need to process individual files or a small number of files that may not all be in the same folder or subfolder.
I would like to add another text field to the form for an individual file or files. What I need help with is altering the code to process individual files or a string of individual files.
Thank you for your attention to this matter.
accel45
On Error GoTo ErrHandler
Me.txtBegin = Now()
Set Fs = Application.FileSearch
Dim strFileSpec As String
Dim strFolders As String
strFolders = Me.FolderList
strFileSpec = "*.doc;*.txt;*.rtf"
With Fs
.NewSearch
.LookIn = strFolders 'Path
'.filename = "*.doc" 'Search through all word document
.FileName = strFileSpec
.SearchSubFolders = Me.txtSubFolder
.TextOrProperty = "*" 'Find Word doc(s) contatining this text
If .Execute > 0 Then
If MsgBox("There were " & .FoundFiles.Count & " " & "Files found in the " & .LookIn & " Folder" & vbCrLf & _
"Do you want to continue?", vbYesNo + vbQuestion, "Found Files") = vbYes Then
For i = 1 To .FoundFiles.Count 'enumerate through found files
strItems = .FoundFiles(i) 'put file name in Variable
Debug.Print strItems
'do something to the files
SysCmd acSysCmdSetStatus, "Loading Word documents"
Set wrd = New Word.Application
Set db = CurrentDb()
Set rst = db.OpenRecordset("tblHarvest")
strDocName = Mid$(strItems, InStrRev(strItems, "\") + 1)
strDocLocation = strItems
Set wdoc = wrd.Documents.Open(FileName:=strItems, _
ReadOnly:=True, AddToRecentFiles:=False, _
Format:=wdOpenFormatAuto, Visible:=False)
strDocText = wdoc.Content.FormattedText
rst.AddNew
rst.Fields("Document") = strDocName
rst.Fields("Description") = strDocText
rst.Fields("FileLocation") = strDocLocation
rst.Fields("EDate") = Date
rst.Update
wrd.Quit
Next
'wrd.Quit
Set wrd = Nothing
SysCmd acSysCmdClearStatus
End If
Else
MsgBox "No Files Found"
End If
End With
Me.txtEnd = Now()
MsgBox "Ready To Update List."
ExitHere:
Exit Sub
ErrHandler:
Debug.Print Err, Err.Description
Resume ExitHere
End Sub