Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Excel csv w/o row terminators

Status
Not open for further replies.

idbr

MIS
May 1, 2003
247
GB
Hi,

I'm trying to process a csv export file but have a bit of a problem. It's been exported in Excel csv format. When I view the file in Excel, it splits into its constituent columns just fine. However, when I look at the file using notepad there dont appear to any row terminators at all.

Reformatting the files as tab delimited makes the problem go away - but I've got many files > 65536 rows (the Excel maximum for 2003, which I'm working on).

Excel must have some way of determining the file structure, does anyone know hat it is?

Thanks, Iain

Example code:

Code:
    ' open the error file
    Open strFolderPath & fl.Name For Input As #1
    
    ' reset the counter
    i = 0
    
    ' loop to the end of the file counting rows
    Do Until EOF(1)
        
        If i = 0 Then
        
            ' output the header row
            [COLOR=red]this fails because the entire file is read in due to lack of row terminators[/color]
            Line Input #1, strTest
            Debug.Print fl.Name & ": " & strTest
            
        Else
            
            ' just count the row
            Input #1, strTest
            
        End If
        
        i = i + 1
        
    Loop
    
    ' output the final result
    Debug.Print fl.Name & " = " & i
    
    Close #1
 
Actually, also worth pointing out that because of the lack of row terminators the file row count is hopelessly out too.

Ta, Iain
 
Fixed it with some help from this thread.

Slooooow running code below, suggestions for improvements welcome.

Code:
intInFileID = FreeFile
Open strFile For Binary As #intInFileID

intOutFileID = FreeFile
Open strFile & ".txt" For Binary As #intOutFileID

Do Until EOF(intInFileID)

    Get #intInFileID, , strChar
    
    If strChar = vbLf Then
    
        Put #intOutFileID, , vbCrLf
        
    Else
    
        Put #intOutFileID, , strChar
        
    End If
    
Loop

Close #intOutFileID
Close #intInFileID
 
Not massively tested the output and no error handling included but this should be faster than sloooow for you (depending on file size, tested against 3Mb csv and runs in less than half a second on my machine):
Code:
Sub ChangeFileData(startfile As String, outfile As String)
Dim indata() As Byte
Dim flen As Long
Dim f As Integer
Dim strTarget As String

Open startfile For Binary Access Read As #1
flen = FileLen(startfile)
ReDim indata(flen)
Get #1, , indata
Close #1

strTarget = Replace(StrConv(indata, vbUnicode), vbLf, vbCrLf)

Debug.Print Len(strTarget)

Open outfile For Output As #1

Print #1, strTarget

Close #1
End Sub
Hope this helps

Andy
---------------------------------
Zebracorn: 50% Zebra, 50% Unicorn = 100% Real.

 
Hi Iain'
When I view the file in Excel, it splits into its constituent columns just fine. However, when I look at the file using notepad there dont appear to any row terminators at all.
Regardless of what Notepad shows, the rows must have something that Excel recognises as a row terminator - most likely ASCII 13, which is a carriage return character Notepad won't process. If so, opening the file in Word will reveal ¶ characters at the end of each line but, although these look like paragraph breaks, they aren't; a paragraph break in Word consists of an ASCII 13 & 10 pair (carriage-return & line feed).


Cheers
[MS MVP - Word]
 
Following the thread I agree with Andy.

Couple of thoughts for speed.[ol][li]I beleive that [tt]Scripting.TextStream.ReadLine[/tt] understands Carriage Return (Cr), Line Feed (Lf), and CrLf terminators for rows and works much the same as VBA.FileSystem operations.[/li][li]Use ADO. You can query a text file directly and it does understand the three different types of line terminators (plus you have a little more contol since ADO will parse the lines (row) into fields.[/li][/ol]

Hope this makes sense and helps,
CMP

[small]For the best results do what I'm thinking, not what I'm saying.[/small]
(GMT-07:00) Mountain Time (US & Canada)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top