Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations John Tel on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Download a large text file to Excel

Status
Not open for further replies.

DonyBoy

MIS
Apr 22, 2005
26
GB
I have a large text file to download in excel... impossible to download directly cos it's more than 65000+ lines. This file contains about 20 chunks of data..Ideally I would somehow like to download each chunk into sepate sheet in the same excel workbook.

Each of these chunk of data has a header and a last line called
EOD

Using this line 'EOD' as my divider..I'm hoping to split the file as I explained above and write each individual part sinto a sheet in Excel..

Has anyone done this before? got a macro for this for any chance?
 
How about:

interrogate the file system to find the file size
create a string variable of that size
open the file in binary
get the string variable from the file.
close the file

- this loads the file to memory pretty quickly

then use "split" on "EOD" to split it to the individual chunks (search this forum on "split" to find equivalent functions if you don't have this available)

- this splits the data to an array of long strings each of which contains one whole chunk of data

then create as many sheets as you need to accomodate the chunks

then, for each chunk

split the chunk into lines based on <crlf> or whatever your carriage return markers are for your data - this will give you an array of all the lines in the chunk

define a range of the right size to take all the lines in the array and assign the array to the range (you may need to transpose the array when you assign it)

job done
 



Hi,

Set up a TEXT Driver in your ODBC Drivers.

Query the text file via Data/Get External Data/New Database Query.... You can have the appropriate criteria in each query to return the subset of data you need.

Skip,

[glasses] [red][/red]
[tongue]
 
DonyBoy,
A couple of approaches. Be sure to update [tt]C:\MyFile.txt[/tt] to match your actual file name.
Code:
[navy]Public Sub[/navy] WriteByChunk()
[navy]On Error GoTo[/navy] WriteByChunk_Error
[navy]Const[/navy] cstMyFile [navy]As String[/navy] = "C:\MyFile.txt"
[navy]Dim[/navy] wksOutput [navy]As[/navy] Worksheet
[navy]Dim[/navy] lngMyRow [navy]As Long[/navy]
[navy]Dim[/navy] intMyChunk [navy]As[/navy] Integer, intMyFile [navy]As Integer[/navy]
[navy]Dim[/navy] strMyLine [navy]As String[/navy]

intMyChunk = 1
intMyFile = FreeFile

[navy]Open[/navy] cstMyFile [navy]For Input As[/navy] #intMyFile
[navy]Set[/navy] wksOutput = Worksheets.Add
wksOutput.Name = "Chunk" & intMyChunk
[navy]Do[/navy]
  [navy]Line Input[/navy] #intMyFile, strMyLine
  [navy]If[/navy] strMyLine = "EOD" [navy]Then[/navy]
    [green]'Create a new worksheet[/green]
    intMyChunk = intMyChunk + 1
    [navy]Set[/navy] wksOutput = Worksheets.Add
    wksOutput.Name = "Chunk" & intMyChunk
    lngMyRow = 1
  [navy]Else[/navy]
    [green]'write the line[/green]
    wksOutput.Cells(lngMyRow, 1) = strMyLine
    [green]'wksOutput.Cells(lngMyRow, 1).TextToColumns...[/green]
    lngMyRow = lngMyRow + 1
  [navy]End If[/navy]
[navy]Loop Until[/navy] EOF(intMyFile)

WriteByChunk_Exit:
[navy]Close[/navy] intMyFile
[navy]Set[/navy] wksOutput = [navy]Nothing[/navy]
[navy]Exit Sub[/navy]

WriteByChunk_Error:
[navy]Debug.Print[/navy] Now, "WriteByChunk", Err.Number, Err.Description
[navy]Stop[/navy]
[navy]Resume[/navy] WriteByChunk_Exit
[navy]End Sub[/navy]

Another approach, because you can never go wrong with three arrays and two loops?

Code:
[b]Option Base 1[/b] '[red]<-- Don't forget this[/red]

[navy]Public Sub[/navy] WriteByParse()
[navy]On Error GoTo[/navy] WriteByParse_Error
[green]'ASSumes comma delimited[/green]
[navy]Const[/navy] cstMyFile [navy]As String[/navy] = "C:\MyFile.txt"
[navy]Dim[/navy] wksOutput [navy]As[/navy] Worksheet
[navy]Dim[/navy] varMyFile() [navy]As[/navy] Variant, varMyChunk() [navy]As[/navy] Variant, varMyLine() [navy]As Variant[/navy]
[navy]Dim[/navy] lngMyChunk [navy]As[/navy] Long, lngMyLine [navy]As Long[/navy]
[navy]Dim[/navy] intMyFile [navy]As Integer[/navy]

[green]'get and spkit the file into chunks[/green]
intMyFile = FreeFile
[navy]Open[/navy] cstMyFile [navy]For Input As[/navy] #intMyFile
varMyFile = Split(Input(LOF(intMyFile), #intMyFile), "EOD")
[navy]Close[/navy] intMyFile

[green]'Cysle through the chunks[/green]
[navy]For[/navy] lngMyChunk = 1 [navy]To[/navy] [navy]UBound[/navy](varMyFile)
  [green]'Create the destination worksheet[/green]
  [navy]Set[/navy] wksOutput = Worksheets.Add
  wksOutput.Name = "Chunk" & lngMyChunk
  [green]'seperate the lines in this chunk[/green]
  varMyChunk = Split(varMyFile(lngMyChunk), vbCrLf)
  [green]'Process the lines in the chunk[/green]
  [navy]For[/navy] lngMyLine = 1 [navy]To[/navy] [navy]UBound[/navy](varMyChunk)
    [green]'Break up the line in 'fields'[/green]
    varMyLine = Split(varMyChunk, ",")
    wksOutput.Range(wksOutput.Cells(lngMyChunk, 1), _
    wksOutput.Cells(lngMyChunk, [navy]UBound[/navy](varMyLine))) = varMyLine
  [navy]Next[/navy] lngMyLine
  [navy]Set[/navy] wksOutput = [navy]Nothing[/navy]
[navy]Next[/navy] lngMyChunk

WriteByParse_Exit:
[navy]Exit Sub[/navy]

WriteByParse_Error:
[navy]Debug.Print[/navy] Now, "WriteByParse", Err.Number, Err.Description
[navy]Stop[/navy]
[navy]Resume[/navy] WriteByParse_Exit
[navy]End Sub[/navy]

Hope this helps,
CMP

[small]P.S. Does the MyThis and MyThat from the help file examples annoy everyone else as much as it does me? CMP[/small]
 
thanks very much. Not quite there yet.
I tried the first approach with one minor change to the code.

If strMyLine = "EOD" Then
'Create a new worksheet
intMyChunk = intMyChunk + 1
Set wksOutput = Worksheets.Add
wksOutput.Name = "Chunk" & intMyChunk
lngMyRow = 1
Else
lngMyRow = 1 '//// I have added this line, because 'lngMyRow is giving an error
'write the line
wksOutput.Cells(lngMyRow, 1) = strMyLine
'wksOutput.Cells(lngMyRow, 1).TextToColumns...
lngMyRow = lngMyRow + 1
End If


The code works but only with part of the data written on Cell "A1".

Here's a sample of the text file (only that the actuall data is 100s of lines)

@File_Version: 3
@Coordinate_Type_is: 3
@Export_Type_is: 1
@Number_of_Projects 1
@Project_Type_Name: 3 loy04_4d
@Project_Unit_is: meters
#File_Version____________-> 3
#Project_Name____________-> loy04_4d
#Project_Type____________-> 3
#Project_Unit____________-> meters
#OpenWorks_Project_______-> WESTOFSHETLAND
#Master_Project_______->
#Coordinate_type_________-> 3
#Number_of_points_in_hzd_-> 1
dcc_t31l_l08_base_t_med_5x5
#Horizon_internal_id_____->_1246
#Horizon_extremes_are____-> 2031.27014 2142.23169
#Horizon_onset_is__Minimum_____-> 1
#Horizon_type_is_Time______-> 1
#Horizon_color_is________-> 255 0 0
1018.0 12352.0 444593.78595 6695135.56350 2031.27014
1018.0 12353.0 444584.49664 6695127.19942 2031.27014
1018.0 12354.0 444575.20733 6695118.83533 2031.27014
1018.0 12355.0 444565.91801 6695110.47125 2031.27014
1018.0 12356.0 444556.62870 6695102.10717 2031.27014
1019.0 12352.0 444585.42181 6695144.85286 2031.27014
1019.0 12353.0 444576.13250 6695136.48878 2031.27014
1019.0 12354.0 444566.84319 6695128.12470 2031.27014
1019.0 12355.0 444557.55388 6695119.76061 2031.27014
1019.0 12356.0 444548.26457 6695111.39653 2031.27014
1020.0 12352.0 444577.05768 6695154.14223 2033.28113
1020.0 12353.0 444567.76837 6695145.77814 2033.28113
1020.0 12354.0 444558.47906 6695137.41406 2033.28113
1233.0 12328.0 443018.44065 6697333.51473 2128.84375
1233.0 12329.0 443009.15134 6697325.15064 2128.84375
1233.0 12330.0 442999.86202 6697316.78656 2128.84375
1233.0 12331.0 442990.57271 6697308.42248 2128.84375
1233.0 12332.0 442981.28340 6697300.05839 2128.84375
1234.0 12328.0 443010.07651 6697342.80409 2128.84375
1234.0 12329.0 443000.78720 6697334.44001 2128.84375
1234.0 12330.0 442991.49789 6697326.07593 2128.84375
1234.0 12331.0 442982.20858 6697317.71184 2128.84375
1234.0 12332.0 442972.91927 6697309.34776 2128.84375
1235.0 12328.0 443001.71238 6697352.09346 2128.84375
1235.0 12329.0 442992.42307 6697343.72937 2128.84375
1235.0 12330.0 442983.13376 6697335.36529 2128.84375
1235.0 12331.0 442973.84445 6697327.00121 2128.84375
1235.0 12332.0 442964.55513 6697318.63712 2128.84375
1236.0 12328.0 442993.34825 6697361.38282 2128.84375
1236.0 12329.0 442984.05893 6697353.01874 2128.84375
1236.0 12330.0 442974.76962 6697344.65465 2128.84375
1236.0 12331.0 442965.48031 6697336.29057 2128.84375
1236.0 12332.0 442956.19100 6697327.92649 2128.84375
EOD
 
Also just to add...

1) The above format of the text file is repeated again and again...

2) The code is creating new sheets Chunk1, Chunk2....correctly.

what appears in A1 is the following:
1236.0 12332.0 442956.19100 6697327.92649 2128.84375
all in one cell...

This all that is there in teh entire sheet.

What I would like is that table tab, delimitted..
and spread out in the entire sheet (both the headers and the data. EOD optional)
 
Your change is the reason it is only writing to the first row. You are incrementing IngMyRow, but your change is resetting it to 1 every time, so the increment makes no difference, therefore you always write to the first row, which is why you only see the last line of data, as it overwrites all the rest.

To spread the data across more than 1 column, you need to uncomment the line:

'wksOutput.Cells(lngMyRow, 1).TextToColumns...

and complete it according to whatever you want to use to delimit the data (tab in this case?)
 
DonyBoy,
My bad. You were on the right path, I forgot to initialize [tt]lngMyRow[/tt] for the first go round.
Code:
Public Sub WriteByChunk()
...
intMyChunk = 1
[b]lngMyRow = 1[/b]
intMyFile = FreeFile

Open cstMyFile For Input As #intMyFile
Set ...

Hope this helps,
CMP

[small]For the best results do what I'm thinking, not what I'm saying.[/small]
(GMT-07:00) Mountain Time (US & Canada)
 
it works (almost)...but when it comes to creating the last spreadsheet i think...it doesn't like the following line

wksOutput.Cells(lngMyRow, 1).TextToColumns

It's giving an eoor...no dat aselected to parse..or somehting like that..

my code at present looks like this..

there is an "EOD" at the very end of my data also.



Public Sub WriteByChunk()
On Error GoTo WriteByChunk_Error
Const cstMyFile As String = "Q:\BP\Loyal prospect development\Geophysics\tks data\For Wesley\test.txt"


Dim wksOutput As Worksheet
Dim lngMyRow As Long
Dim intMyChunk As Integer, intMyFile As Integer
Dim strMyLine As String
intMyChunk = 1
lngMyRow = 1

intMyFile = FreeFile

Open cstMyFile For Input As #intMyFile
Set wksOutput = Worksheets.Add
wksOutput.Name = "Chunk" & intMyChunk
Do
Line Input #intMyFile, strMyLine
If strMyLine = "EOD" Then
'Create a new worksheet
intMyChunk = intMyChunk + 1
Set wksOutput = Worksheets.Add
wksOutput.Name = "Chunk" & intMyChunk
lngMyRow = 1
Else
'lngMyRow = 1 '//// I have added this line, because lngMyRow is giving an error
'write the line
wksOutput.Cells(lngMyRow, 1) = strMyLine
wksOutput.Cells(lngMyRow, 1).TextToColumns
lngMyRow = lngMyRow + 1
End If
Loop Until EOF(intMyFile)

WriteByChunk_Exit:
Close intMyFile
Set wksOutput = Nothing
Exit Sub

WriteByChunk_Error:
Debug.Print Now, "WriteByChunk", Err.Number, Err.Description
Stop
Resume WriteByChunk_Exit
End Sub



 
Replace this:
Else
With this:
ElseIf Trim(strMyLine) <> "" Then

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top