Tek-Tips is the largest IT community on the Internet today!


how to read 5Mb text file

Status
Not open for further replies.

bvahan5

Programmer
Jun 11, 2005
63
RU
I use Scripting.FileSystemObject.
It works OK for small text files,

but I can't use it with text files of around 5 MB.

Is there any way of buffering?
What is the size limit, in KB, of a text file for the FSO module?

Thanks for any advice.
 
Try looking at this ( and see if it can help. Specifically, look at the section entitled: "What you can't do" (about 2/3 down the page).

------------------------------------------------------------------------------------------------------------------------
"Men occasionally stumble over the truth, but most of them pick themselves up and hurry off as if nothing ever happened."
- Winston Churchill
 
What do you need to do with the text file?

I just ran into an issue where I had to scrub an 18MB text file (comma delimited) looking for a client name.

It was impossibly slow in pure ASP, and I was saying to myself "What I really need is a GREP statement".

Well, that's a *nix command, not a windows/asp command... *BUT* I did manage to find a DOS port of the GREP command. :D

SO..... here's the code.... and how it makes a temporary file from the GREP-ped results ...

Code:
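Set oShell = Server.CreateObject("Wscript.Shell")

' Seed the generator; without Randomize, Rnd yields the same sequence on every run
Randomize
tmpFile = "$grep" & Int(Rnd * 5000) & ".txt"
' strGrep holds the search text; grep's matches are redirected to the temporary file
strCmd = "%ComSpec% /c grep -e """ & strGrep & """ C:\TEMP\INDEXFILE.CSV > C:\temp\" & tmpFile

' The third argument (True) makes Run wait until grep has finished
oShell.Run strCmd, 1, True

Set filesys = CreateObject("Scripting.FileSystemObject")

Set grepFile = filesys.OpenTextFile("C:\temp\" & tmpFile)
'   .... read the file, split it, get what I need....

grepFile.Close
filesys.DeleteFile "C:\temp\" & tmpFile
Set grepFile = Nothing
Set filesys = Nothing
Set oShell = Nothing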

.... then my grepFile has the lines "found" by the grep.

Sifting through that 18 MB file was taking about 4 1/2 minutes in ASP. Using this method, I find the lines that I want in that 18 MB file in under 1/2 second.



Just my 2¢

"In order to start solving a problem, one must first identify its owner." --Me
--Greg
 
That is some nifty work! I might have a need for this in the future so a star for some work saving... Thanks! [thumbsup]

------------------------------------------------------------------------------------------------------------------------
"Men occasionally stumble over the truth, but most of them pick themselves up and hurry off as if nothing ever happened."
- Winston Churchill
 
I enjoy a good Grep as much as the next guy, but I would point out that Windows does have the find command, which lists the lines in a file that contain a specified string.
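
For anyone who wants to try the find route, here is a minimal sketch of how it might look from VBScript. It is not code from this thread: FindLines is a hypothetical helper name, and it assumes the search text contains no double quotes and that the host is allowed to launch external commands. It reads find's output directly instead of going through a temporary file.

```vbscript
Option Explicit

' Returns an array of the lines in filePath that contain needle, using the
' built-in Windows find command. FindLines is a hypothetical helper name.
' Assumption: needle contains no double-quote characters.
Function FindLines(filePath, needle)
    Dim shell, proc, cmd, hits
    Set shell = CreateObject("WScript.Shell")
    Set hits = CreateObject("Scripting.Dictionary")

    ' Feed the file to find on standard input so that only matching lines
    ' are printed, without the "---------- FILENAME" banner.
    ' Add /I after "find" for a case-insensitive match.
    cmd = shell.ExpandEnvironmentStrings("%ComSpec%") & _
          " /c find """ & needle & """ < """ & filePath & """"

    Set proc = shell.Exec(cmd)

    ' Read the command's output one line at a time until it is exhausted.
    Do Until proc.StdOut.AtEndOfStream
        hits.Add hits.Count, proc.StdOut.ReadLine()
    Loop

    FindLines = hits.Items
End Function

' Example call (the search text here is a made-up value):
'   Dim hit
'   For Each hit In FindLines("C:\TEMP\INDEXFILE.CSV", "Some Client")
'       WScript.Echo hit
'   Next
```

Like the grep version, this hands the scanning to a native tool; the difference is only that find ships with Windows, so nothing extra has to be installed on the server.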

[red]"... isn't sanity really just a one trick pony anyway?! I mean, all you get is one trick, rational thinking, but when you are good and crazy, oooh, oooh, oooh, the sky is the limit!" - The Tick[/red]
 
Ok, you guys made me check. I could have sworn I remembered handling in excess of 80MB CSV files with straight VBScript and FSO's, so I just tried one. Basically I opened the file, ran a replace, and wrote it out to a second file. The file consisted of 300,000 lines of:
"This is a line, this is another line, this is a third line, blah blah blah"

and it took 4.75 seconds to process the 23MB file and create a second 23MB file, with a Replace run on each line to replace "this" with "that". It took 4.125 seconds the second time; maybe I should hook up the Benchmark script.


The key is to use the ReadLine method of the file object rather than trying to read it all in one big chunk. Just set up a loop that runs until the AtEndOfStream property is true and zip through the file one line at a time without storing each line locally. It's when you start trying to read the whole file, or keep an array of the entire file contents in memory, that things get ugly.
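
The read-replace-write test described above could be sketched roughly like this. It is a reconstruction under assumptions, not the script that produced the timings: ReplaceInFile is a hypothetical helper name, and the paths in the example call are placeholders.

```vbscript
Option Explicit

Const ForReading = 1

' Copies srcPath to dstPath one line at a time, replacing oldText with
' newText on each line, and returns the number of lines processed.
' ReplaceInFile is a hypothetical helper name.
Function ReplaceInFile(srcPath, dstPath, oldText, newText)
    Dim fso, src, dst, lineCount
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set src = fso.OpenTextFile(srcPath, ForReading)
    Set dst = fso.CreateTextFile(dstPath, True)   ' True = overwrite if present

    lineCount = 0
    Do Until src.AtEndOfStream
        ' Only the current line is ever held in memory.
        ' Replace is case-sensitive by default, so "This" is left alone.
        dst.WriteLine Replace(src.ReadLine(), oldText, newText)
        lineCount = lineCount + 1
    Loop

    src.Close
    dst.Close
    ReplaceInFile = lineCount
End Function

' Example call (placeholder paths):
'   WScript.Echo ReplaceInFile("C:\temp\in.csv", "C:\temp\out.csv", "this", "that")
```

Memory use stays flat no matter how large the input is, which is the point of the line-at-a-time loop.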

-T

 
Code:
Option Explicit


Dim fso, fnfo, fil_1, fil_2, str_line

Dim search_for : search_for = "NetDDE"
Dim path_1, path_2
path_1 = "C:\Programming\VBScript\TestCsv.csv"
path_2 = "C:\Programming\VBScript\FoundCsv.csv"

Set fso = CreateObject("Scripting.FileSystemObject")
Set fnfo = fso.GetFile(path_1)

WScript.StdOut.WriteLine "Original File: " & Round(fnfo.Size/1024/1024,2) & " MB"

Set fil_1 = fso.OpenTextFile(path_1, 1)        ' 1 = ForReading
Set fil_2 = fso.CreateTextFile(path_2, True)   ' True = overwrite; the third argument (Unicode) is omitted so output stays ANSI

Dim start_time : start_time = timer

Do Until fil_1.AtEndOfStream
	str_line = fil_1.ReadLine()
	If InStr(str_line,search_for) Then fil_2.WriteLine(str_line)
Loop

fil_1.Close
fil_2.Close
Set fil_1 = Nothing
Set fil_2 = Nothing

WScript.StdOut.WriteLine "Time Elapsed: " & Round((timer - start_time) * 1000) & "ms"

Set fnfo = fso.GetFile(path_2)
WScript.StdOut.WriteLine "New File is: " &  Round(fnfo.Size/1024/1024,2) & " MB"

Set fnfo = Nothing
Set fso = Nothing

Ran this on a file that had 60,000 lines of service information in CSV format. The resulting file had 1362 rows and the output from 4 runs was:
Code:
Original File: 19.94 MB
Time Elapsed: 1344ms
New File is: 0.97 MB

Original File: 19.94 MB
Time Elapsed: 1473ms
New File is: 0.97 MB

Original File: 19.94 MB
Time Elapsed: 1375ms
New File is: 0.97 MB

Original File: 19.94 MB
Time Elapsed: 1363ms
New File is: 0.97 MB

Granted I am running this from the command-line, but the time to search a text file should be measurable in seconds and milliseconds, not minutes. The above example was averaging less than a second and a half.

Not quite as fast as doing a grep ahead of time, but also not reliant on tools that may or may not be accessible on a hosting server.

-T

 
You know, when I *asked* for a solution to my problem, nobody had an answer... that's why I ended up writing it with an external GREP command.

Probably because I like *nix more than *doze.....



Just my 2¢

"In order to start solving a problem, one must first identify its owner." --Me
--Greg
 
Yeah, I don't even bother with grep on Windows; I have a *nix machine sitting next to my Windows machine at work and at home :)

 