Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Break down large text file into many smaller 1

Status
Not open for further replies.

03Explorer

Technical User
Sep 13, 2005
304
US
Can someone help or direct me to assistance to take an EXTREMELY large comma delimited file of over 650,000 rows of data into (x) number of files (001,002,003,004, ... , xxx) with a set number of rows (ideally 60,0000 with the header as row 1).

This data will eventually be reviewed by Excel and the Large file is beyond the ability to be imported into Excel. Access does not like to import these files as well, not sure why. So that leaves me with breaking the monster into many more manageable files.

VBS or equivalent scripting would be my preference.

Thanks
 
This may not be the best solution...but might be worth a try....

Code:
	Const ForReading = 1
	Const ForWriting = 2
	Const ForAppending = 8
	
	Dim objFSO : Set objFSO = CreateObject("Scripting.FileSystemObject")
	Dim strSourceFile : strSourceFile = "C:\temp\test.txt"
	Dim objSourceFile : Set objSourceFile = objFSO.OpenTextFile(strSourceFile, ForReading)
	Dim strHeader : strHeader = objSourceFile.ReadLine
	Dim strDestFolderPath : strDestFolderPath = "C:\temp\"
	Dim intCount : intCount = 1
	Dim strOutputFileName : strOutputFileName = "Test_" & Right("0000" & intCount, 4) & ".txt"
	Dim objOutputFile : Set objOutputFile = objFSO.OpenTextFile(strDestFolderPath & strOutputFileName, ForWriting, True)
	objOutputFile.WriteLine strHeader
	Do Until objSourceFile.AtEndOfStream
		If objSourceFile.Line Mod 60000 = 0 Then
			objOutputFile.Close
			intCount = intCount + 1
			strOutputFileName = "Test_" & Right("0000" & intCount, 4) & ".txt"
			Set objOutputFile = objFSO.OpenTextFile(strDestFolderPath & strOutputFileName, ForWriting, True)
			objOutputFile.WriteLine strHeader
		End If
		objOutputFile.WriteLine objSourceFile.ReadLine
	Loop

--------------------------------------------------------------------------------
dm4ever
My philosophy: K.I.S.S - Keep It Simple Stupid
 
Using the Excel object it can be done directly...but I have no need for it so....

--------------------------------------------------------------------------------
dm4ever
My philosophy: K.I.S.S - Keep It Simple Stupid
 
What have YOU tried so far and where in YOUR code are you stuck ?

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
I have modified the code from above to work as I expected. Just some minor adjustments, making my life easier to modify and train others to update.

Code:
DIM Lines
DIM SourceFileName
DIM SourceLocation
DIM Destination
DIM FilePrefix
DIM FileSuffix

Lines = 60000
SourceFileName = "IVR calls -2-12 Detail-RW.csv"
SourceLocation = "I:\Reporting\2008-02-22 Request IVR 2-12 with ANI\"
Destination = "C:\temp\"
FilePrefix = "TestRW"
FileSuffix = "txT"

    Const ForReading = 1
    Const ForWriting = 2
    Const ForAppending = 8
    
    Dim objFSO : Set objFSO = CreateObject("Scripting.FileSystemObject")
    Dim strSourceFile : strSourceFile = SourceLocation & SourceFileName
    Dim objSourceFile : Set objSourceFile = objFSO.OpenTextFile(strSourceFile, ForReading)
    Dim strHeader : strHeader = objSourceFile.ReadLine
    Dim strDestFolderPath : strDestFolderPath = Destination
    Dim intCount : intCount = 1
    Dim strOutputFileName : strOutputFileName = FilePrefix & "_" & Right("0000" & intCount, 4) & "." & FileSuffix
    Dim objOutputFile : Set objOutputFile = objFSO.OpenTextFile(strDestFolderPath & strOutputFileName, ForWriting, True)
    objOutputFile.WriteLine strHeader
    Do Until objSourceFile.AtEndOfStream
        If objSourceFile.Line Mod Lines = 0 Then
            objOutputFile.Close
            intCount = intCount + 1
            strOutputFileName = FilePrefix & "_" & Right("0000" & intCount, 4) & "." & FileSuffix
            Set objOutputFile = objFSO.OpenTextFile(strDestFolderPath & strOutputFileName, ForWriting, True)
            objOutputFile.WriteLine strHeader
        End If
        objOutputFile.WriteLine objSourceFile.ReadLine
    Loop
msgbox "Processing File '" & SourceFileName & "' Completed!" , 0+64+262144,"File Multiplier"

My output generates x files with the name (based on code above) TestRW_001.txT, TestRW_002.txT, ... TestRW_00x.txT. Reason for odd file names is for testing purposes. I made them into variables for ease of updating all important values. txT can be renamed for csv for ease of opening in Excel if your data is comma delimited as mine is generated.


I really like the concept that I am not relying on Excel to do the file break multiplying (lack of a better name).

Thoughts?
 
I'm glad to see it helped get you closer to what you needed.

--------------------------------------------------------------------------------
dm4ever
My philosophy: K.I.S.S - Keep It Simple Stupid
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top