Hi. I have a text file that I need help processing faster. I'm running XP Pro on a 3GHz processor with 1GB RAM and an 80GB hard drive.
The script I'm using is based on one from the Hey Scripting Guy Archive, though I've modified it a little:
Basically, the script reads each line of SourceFile.txt. When a line contains the word "apples ", it captures every character after "apples " until it reaches the word "oranges". So if the line reads "I like apples better than oranges", the script returns "better than ".
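The extraction rule described above can be sketched more directly with InStr and Mid; this is just an illustration of the rule on the sample line, not the script itself:

```vbscript
' Sketch: grab the text between "apples " and "oranges" on one line.
strLine  = "I like apples better than oranges"
intAfter = InStr(strLine, "apples ") + Len("apples ")   ' first char after "apples "
intStop  = InStr(intAfter, strLine, "oranges")          ' where "oranges" begins
If intStop > 0 Then
    strValue = Mid(strLine, intAfter, intStop - intAfter)
    WScript.Echo """" & strValue & """"                 ' prints "better than "
End If
```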
After the script reads all the values from SourceFile.txt, it writes those values, one per line, to TargetFile.txt. I use CountFile.txt simply to know how many values have been processed: as each value is obtained, strIncrement is increased by 1 and written to CountFile.txt.
The SourceFile contains over 1,000,000 lines! Yes, that's 1 million. When the script is first run, it processes around 4000 lines per second. However, as time goes on, fewer lines are processed per second; by the time it reached 100,000 lines, fewer than 50 lines were being processed per second. To give the script maximum resources, I manually set the wscript.exe process in Task Manager to Realtime priority, with Affinity set to both CPUs.
The script has been running for over 24 hours with no sign of completing. Any ideas on how to process this faster? Thanks.
Code:
Const ForReading = 1
Const ForWriting = 2
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\SourceFile.txt", ForReading)
'===========
'Contents for CountFile
Set objFSO2 = CreateObject("Scripting.FileSystemObject")
Set objFile2 = objFSO2.OpenTextFile("C:\CountFile.txt", ForWriting, True)
strIncrement = 0
'===========
Do Until objFile.AtEndOfStream
    strSearchString = objFile.ReadLine
    intStart = InStr(strSearchString, "apples ")
    If intStart <> 0 Then
        'skip past "apples " itself
        intStart = intStart + Len("apples ")
        strText = Mid(strSearchString, intStart, 250)
        For i = 1 To Len(strText)
            If Mid(strText, i, Len("oranges")) = "oranges" Then
                'places each entry on a separate line
                strData = strData & vbCrLf
                strIncrement = strIncrement + 1
                objFile2.WriteLine strIncrement
                Exit For
            Else
                strData = strData & Mid(strText, i, 1)
            End If
        Next
    End If
Loop
objFile.Close
Set objFile = objFSO.OpenTextFile("C:\TargetFile.txt", ForWriting, True)
objFile.WriteLine strData
objFile.Close
objFile2.Close
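For comparison, here is a hedged sketch of the same loop that slices each line with InStr instead of a character-by-character Mid loop, and writes each value to the target file as soon as it is found rather than accumulating everything in strData first (file names as in the original; the extraction rule is assumed from the description above, and this is untested against the real data):

```vbscript
Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objIn  = objFSO.OpenTextFile("C:\SourceFile.txt", ForReading)
Set objOut = objFSO.OpenTextFile("C:\TargetFile.txt", ForWriting, True)

strIncrement = 0
Do Until objIn.AtEndOfStream
    strLine  = objIn.ReadLine
    intAfter = InStr(strLine, "apples ")
    If intAfter > 0 Then
        intAfter = intAfter + Len("apples ")
        intStop  = InStr(intAfter, strLine, "oranges")
        If intStop > 0 Then
            ' Write the extracted value straight to the target file
            ' instead of growing one giant strData string in memory.
            objOut.WriteLine Mid(strLine, intAfter, intStop - intAfter)
            strIncrement = strIncrement + 1
        End If
    End If
Loop
objIn.Close
objOut.Close

' Write the final count once at the end, rather than once per value.
Set objCount = objFSO.OpenTextFile("C:\CountFile.txt", ForWriting, True)
objCount.WriteLine strIncrement
objCount.Close
```

Writing each value as it is found avoids repeatedly re-copying an ever-growing string, which is where per-line cost tends to climb as the job runs.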