
Massive File Search


hilbertl3
Technical User
Sep 22, 2004
I have been tasked with collecting the file path, file name, file creation date, file last-modified date, and file size for about 5 million files in several large directories on our corporate network.
I have tried using the Application.FileSearch method, but the program crashes after several hours due to memory overload.
I am now using code that doesn't try to load the details of all the files into memory before putting them in a table; instead it adds the details of each file to a table as it is found. However, doing it this way I am processing only about 185 files per minute. At that rate it will take about 450 hours, or 19 days, to complete. To run the process I am using a workstation with two 3 GHz Xeon processors and 3 GB of memory.
Does anyone know a better way of doing this?

Here is the code I am currently using:

Option Compare Database
Option Explicit

Public Sub Walk_Test()

    ' Walk every top-level path listed in the FilePaths_ table.
    Dim k3 As DAO.Recordset
    Dim ct As Integer

    DoCmd.SetWarnings False
    'On Error Resume Next

    Set k3 = CurrentDb.OpenRecordset("select * from FilePaths_ order by FilePaths_;")

    k3.MoveLast
    k3.MoveFirst

    For ct = 1 To k3.RecordCount
        WalkDirTree k3!FilePaths_, "*.*"
        k3.MoveNext
    Next

    k3.Close

End Sub

Private Sub WalkDirTree(TopDir As String, FileExt As String)

    ' Start a recursive walk of TopDir using the FileSystemObject.
    Dim FSO As Scripting.FileSystemObject
    Dim FileList As New Collection

    DoCmd.SetWarnings False
    'On Error Resume Next

    Set FSO = New Scripting.FileSystemObject
    ProcessFolder FSO.GetFolder(TopDir), FileList, FileExt
    Set FSO = Nothing

End Sub

Private Sub ProcessFolder(FolderName As Scripting.Folder, FileList As Collection, FileExt As String)

    Dim SubFolders As Scripting.Folders
    Dim FileNames As Scripting.Files
    Dim SubFolderName As Scripting.Folder
    Dim FileID As Scripting.File
    Dim k4 As DAO.Recordset

    ' Note: MainCollection_ is re-opened on every recursive call.
    Set k4 = CurrentDb.OpenRecordset("MainCollection_")

    ' Recurse into each subfolder first.
    Set SubFolders = FolderName.SubFolders
    For Each SubFolderName In SubFolders
        'Debug.Print SubFolderName
        ProcessFolder SubFolderName, FileList, FileExt
    Next

    ' Then add one row per file in this folder.
    Set FileNames = FolderName.Files
    For Each FileID In FileNames
        'If (UCase(Right(FileID.Path, Len(FileExt))) = UCase(FileExt)) Then
        k4.AddNew
        k4!File_Path_Name = FileID.Path
        On Error Resume Next
        k4!file_Name = FileID.Name
        k4!LastAccessDate = FileID.DateLastAccessed
        k4!LastModifiedDate = FileID.DateLastModified
        k4!CreateDate = FileID.DateCreated
        k4!size_ = FileID.Size
        k4.Update
        'On Error Resume Next
        ' End If
    Next

    k4.Close

    Set FileID = Nothing
    Set SubFolderName = Nothing
    Set FileNames = Nothing
    Set SubFolders = Nothing

End Sub

 
I think you have a few items there that could be optimized, but at some effort.

This looks like Access VBA. VB6 compiled to native code (a.) may be faster and (b.) certainly could be profiled to track down bottlenecks.

You're also using the FSO to traverse the directory structures. I don't have figures handy, but rumor suggests this can be impractically slow for large-scale efforts like yours. I'd recommend using VB6's built-in directory/file search verbs instead, or preferably direct Windows API calls.
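
Just as a rough illustration (untested, and the table writes are left out), a Dir-based walk might look something like the sketch below. The subfolders get queued up before recursing because Dir keeps internal state and can't be re-entered mid-loop. Also note that Dir/FileDateTime/FileLen only give you the last-modified date and the size; creation and last-access dates would still need the API or FSO.

' Rough sketch only; the sub name and the Debug.Print output are placeholders.
Private Sub WalkWithDir(ByVal TopDir As String)

    Dim Entry As String
    Dim SubDirs As New Collection
    Dim Item As Variant

    If Right$(TopDir, 1) <> "\" Then TopDir = TopDir & "\"

    ' First pass: queue the subdirectories (don't recurse while Dir is active).
    Entry = Dir$(TopDir, vbDirectory)
    Do While Len(Entry) > 0
        If Entry <> "." And Entry <> ".." Then
            If (GetAttr(TopDir & Entry) And vbDirectory) = vbDirectory Then
                SubDirs.Add TopDir & Entry
            End If
        End If
        Entry = Dir$()
    Loop

    ' Second pass: pick up the files in this directory.
    ' FileDateTime = last-modified date, FileLen = size in bytes.
    Entry = Dir$(TopDir & "*.*")
    Do While Len(Entry) > 0
        Debug.Print TopDir & Entry, FileDateTime(TopDir & Entry), FileLen(TopDir & Entry)
        Entry = Dir$()
    Loop

    ' Only now is it safe to recurse.
    For Each Item In SubDirs
        WalkWithDir CStr(Item)
    Next

End Sub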

Barring that, you might look at FSO alternatives like:

File Search and File Property API Classes

I can't vouch for the quality or speed of those classes; they're just one of the first hits I got from Googling. However, they may well be usable from Access VBA and quick enough to assist you in your mission. If not, perhaps there are others... or if you're a VB6 guy you might create your own classes calling the Windows API that can be used from VBA.
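
If you do roll your own, the heart of it is just FindFirstFile/FindNextFile from kernel32. Very roughly (untested, 32-bit declarations, error handling and the FILETIME-to-date conversion left out), a single-directory pass might look like:

' Rough sketch only; the procedure name is a placeholder.
Private Type FILETIME
    dwLowDateTime As Long
    dwHighDateTime As Long
End Type

Private Type WIN32_FIND_DATA
    dwFileAttributes As Long
    ftCreationTime As FILETIME
    ftLastAccessTime As FILETIME
    ftLastWriteTime As FILETIME
    nFileSizeHigh As Long
    nFileSizeLow As Long
    dwReserved0 As Long
    dwReserved1 As Long
    cFileName As String * 260
    cAlternate As String * 14
End Type

Private Declare Function FindFirstFile Lib "kernel32" Alias "FindFirstFileA" _
    (ByVal lpFileName As String, lpFindFileData As WIN32_FIND_DATA) As Long
Private Declare Function FindNextFile Lib "kernel32" Alias "FindNextFileA" _
    (ByVal hFindFile As Long, lpFindFileData As WIN32_FIND_DATA) As Long
Private Declare Function FindClose Lib "kernel32" (ByVal hFindFile As Long) As Long

Private Const INVALID_HANDLE_VALUE As Long = -1
Private Const FILE_ATTRIBUTE_DIRECTORY As Long = &H10

Private Sub ListWithApi(ByVal TopDir As String)

    Dim fd As WIN32_FIND_DATA
    Dim hFind As Long
    Dim FileName As String

    If Right$(TopDir, 1) <> "\" Then TopDir = TopDir & "\"

    hFind = FindFirstFile(TopDir & "*.*", fd)
    If hFind = INVALID_HANDLE_VALUE Then Exit Sub

    Do
        ' cFileName is null-terminated inside a fixed-length buffer.
        FileName = Left$(fd.cFileName, InStr(fd.cFileName, vbNullChar) - 1)
        If FileName <> "." And FileName <> ".." Then
            If (fd.dwFileAttributes And FILE_ATTRIBUTE_DIRECTORY) = 0 Then
                ' Creation/last-access/last-write times are the FILETIME
                ' members; size is in nFileSizeHigh/nFileSizeLow.
                Debug.Print TopDir & FileName, fd.nFileSizeLow
            End If
        End If
    Loop While FindNextFile(hFind, fd) <> 0

    FindClose hFind

End Sub

One nice thing about the find-data structure is that the creation, last-access and last-write times and the size all come back from the same call, rather than as separate property reads per file, which is part of why the API route tends to be faster than FSO.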
 
