NickFort
Technical User
- Jun 10, 2010
- 113
Hi all,
I'd like to read the header line of a CSV file into a character variable and then count the commas in it (i.e. to determine the number of columns), without having to give that variable a fixed length.
The way I'd like to do it is to read the first line character by character, compare each character with the delimiter, count the matches, and stop at the end of the first line. Unfortunately, by default each successive read statement in Fortran moves on to the next line (record) of the file, rather than continuing along the current one.
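For what it's worth, non-advancing input (`advance="no"`, available since Fortran 90) is one way to keep successive reads on the same line. Below is a minimal sketch of the character-by-character count described above. The module and function names are made up for illustration, `newunit=` assumes a Fortran 2008 compiler, and the count assumes that no field contains a quoted delimiter.

```fortran
module csv_header_count
  implicit none
contains
  ! Hypothetical helper: counts the delimiters in the first record of a
  ! file, one character at a time, and returns (delimiters + 1) as the
  ! number of columns. Returns -1 if the file cannot be opened or a read
  ! error occurs. An empty first line counts as one (empty) column.
  function count_header_columns(filename, delimiter) result(ncols)
    character(len=*), intent(in) :: filename
    character(len=1), intent(in) :: delimiter
    integer :: ncols
    character(len=1) :: ch
    integer :: unit_no, ios

    ncols = -1
    open (newunit=unit_no, file=filename, action="read", status="old", &
          iostat=ios)
    if (ios /= 0) return

    ncols = 1
    do
      ! advance="no" leaves the file position on the current record
      read (unit_no, '(A1)', advance="no", iostat=ios) ch
      if (ios /= 0) exit        ! end of record, end of file, or error
      if (ch == delimiter) ncols = ncols + 1
    end do
    close (unit_no)

    ! A positive iostat signals a genuine read error rather than the
    ! end of the record or file.
    if (ios > 0) ncols = -1
  end function count_header_columns
end module csv_header_count
```

From a program, this would be used as `use csv_header_count` followed by `ncols = count_header_columns("mydata.csv", ",")`. No line buffer is declared at all, at the cost of one read statement per character of the header.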
One option is to throw an "overkill" character variable at the read, of length greater than the maximum number of characters in the header; the problem with this is that it's not maintainable. I can't guarantee what the longest line could possibly be, and 99% of the time, it'll be a waste of memory.
To illustrate what I mean, say I want to read a CSV containing:
Code:
HEAD1,HEAD2,HEAD3,HEAD4,HEAD5,HEAD6
3,TEXT,2.3452,6,1.2246,7.64E+12
6,TEXT2,32.235247,12.2,3467.12,1.21E-05
One option is to do the following (which doesn't do the counting of commas yet):
Code:
program test_read_line
  implicit none
  character(len=1000) :: current     ! oversized buffer for the header line
  character :: delimiter             ! intended for the comma count; not used yet
  integer :: read_status

  open (unit=11, file="mydata.csv", action="read", status="old")
  ! Read the first record (the header line) into the fixed-length buffer
  read (11,'(A)',iostat=read_status) current
  print *, "read_status: ", read_status
  print *, "current: ", trim(adjustl(current))
  close (11)
end program test_read_line
So here, I assign a length to "current" which is greater than the number of characters in the header line. I could search through the "current" string to count the number of delimiters, but that's not optimal, in my opinion: I've assigned far more memory to it than the file currently needs, and what happens if columns are added and the header line exceeds 1000 characters?
Is there a more efficient way of doing something like this?
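One possible direction, sketched here rather than offered as a definitive answer: read the line in fixed-size chunks with non-advancing input and append each chunk to a deferred-length allocatable string, so the variable ends up exactly as long as the header. This assumes a Fortran 2003 compiler (allocatable `character(len=:)`, `is_iostat_eor`, `is_iostat_end`). The module name, the routine names and the chunk size of 256 are all hypothetical choices.

```fortran
module csv_header_line
  implicit none
contains
  ! Hypothetical helper: reads the next record from an already opened
  ! unit into a string that grows to fit the record.
  ! On return, ios is 0 on success, negative at end of file, and
  ! positive on a read error.
  subroutine read_whole_line(unit_no, line, ios)
    integer, intent(in) :: unit_no
    character(len=:), allocatable, intent(out) :: line
    integer, intent(out) :: ios
    character(len=256) :: chunk
    integer :: nread

    line = ""
    do
      ! size= reports how many characters this read actually transferred
      read (unit_no, '(A)', advance="no", size=nread, iostat=ios) chunk
      if (ios > 0) return                 ! read error
      if (is_iostat_end(ios)) return      ! end of file
      line = line // chunk(1:nread)       ! grow the string by one chunk
      if (is_iostat_eor(ios)) then        ! reached the end of the record
        ios = 0
        return
      end if
    end do
  end subroutine read_whole_line

  ! Counts how many times a single character occurs in a string.
  pure function count_char(line, delimiter) result(n)
    character(len=*), intent(in) :: line
    character(len=1), intent(in) :: delimiter
    integer :: n, i

    n = 0
    do i = 1, len(line)
      if (line(i:i) == delimiter) n = n + 1
    end do
  end function count_char
end module csv_header_line
```

With the file open on unit 11 as in the program above, `call read_whole_line(11, header, read_status)` followed by `count_char(header, ",") + 1` would give the column count, where `header` is declared as `character(len=:), allocatable`. As with any plain comma count, a delimiter inside a quoted field would be miscounted.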
--------------------------------------
Background: Chemical engineer, familiar mostly with MATLAB, but now branching out into real programming.