
Read a .csv file in Cobol

Status
Not open for further replies.

DaanDL

Programmer
Apr 22, 2008
3
BE
Hi all,

I've been looking for hours on the net but found nothing really suitable.

How can I read a csv file, containing records with structure:
employeeID;type;typeID;startDate;endDate

into this:
FD CsvFile.
01 CsvRecord.

88 EndOfCsvFile VALUE HIGH-VALUES.

03 employeeID pic 9(2).
03 type pic x(4).
03 typeID pic 9(2).
03 startDate pic x(20).
03 endDate pic x(20).


A record in the csv file could look like this :
01;acti;01;2008-04-15 01:42:03;2008-04-15 01:42:03

The problem is that there are multiple records in the csv file, separated by newlines.
How can I program this in COBOL so that every time the program encounters a new line (in the csv), it starts a new record?

Greetz Daan

 
Have a look at the UNSTRING instruction.

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
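
For readers who have not used UNSTRING before, here is a minimal sketch of how it could split one line of the sample data. The program name and the WS- field names are made up for illustration, and the line is hard-coded instead of being read from a file, so treat it as a starting point rather than a finished solution.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVSPLIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * One line of the file as plain text (hypothetical name).
       01  WS-LINE             PIC X(50) VALUE
           "01;acti;01;2008-04-15 01:42:03;2008-04-15 01:42:03".
      * Target fields, sized like the layout in the question.
       01  WS-EMPLOYEE-ID      PIC 9(2).
       01  WS-TYPE             PIC X(4).
       01  WS-TYPE-ID          PIC 9(2).
       01  WS-START-DATE       PIC X(20).
       01  WS-END-DATE         PIC X(20).
       PROCEDURE DIVISION.
      * Split the line at each ";" into the five fields.
           UNSTRING WS-LINE DELIMITED BY ";"
               INTO WS-EMPLOYEE-ID
                    WS-TYPE
                    WS-TYPE-ID
                    WS-START-DATE
                    WS-END-DATE
           END-UNSTRING
           DISPLAY "id=" WS-EMPLOYEE-ID " type=" WS-TYPE
           DISPLAY "start=" WS-START-DATE
           STOP RUN.
```

The source line is kept in a single alphanumeric field and the typed fields live elsewhere, so the delimiters never end up inside the target layout.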
 
Thank you for the quick reply.

Unstring did the trick, but now I have another problem :)
The program reads the last record twice, and the first two digits of that last (duplicate) record are different!
Strange...

So my .csv has two records:

01;acti;01;2008-04-15 01:42:03;2008-04-15 01:42:03
02;acti;02;2008-04-15 01:42:03;2008-04-15 01:42:03

And in my program, I get this output:

01;acti;01;2008-04-15 01:42:03;2008-04-15 01:42:03
02;acti;02;2008-04-15 01:42:03;2008-04-15 01:42:03
78;acti;02;2008-04-15 01:42:03;2008-04-15 01:42:03

(mind the last record starting with 78)

How is this possible? And why are those 2 first numbers specifically 78?

And here's some code to read the records:

OPEN INPUT OphalenBestand.

READ OphalenBestand
    AT END
        SET EOF TO TRUE
END-READ

PERFORM UNTIL EOF

    UNSTRING ophalenRegel DELIMITED BY ";"
        INTO o-medewerkerId, o-type, o-typeId, o-typeStart,
             o-typeEind
    END-UNSTRING

    DISPLAY o-medewerkerId o-type o-typeId o-typeStart
            o-typeEind

    READ OphalenBestand
        AT END SET EOF TO TRUE
    END-READ

END-PERFORM
CLOSE OphalenBestand

Greetz Daan and thanks in advance
 
You may try this:
Code:
OPEN INPUT OphalenBestand
PERFORM UNTIL EOF
    READ OphalenBestand AT END
        SET EOF TO TRUE
        EXIT PERFORM
    END-READ
*> your stuff here
END-PERFORM
CLOSE OphalenBestand

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
The code you have provided doesn't replicate your outcome.

If I use the exact code and data you have provided, the output is correct, i.e.

01acti012008-04-15 01:42:032008-04-15 01:42:03
02acti022008-04-15 01:42:032008-04-15 01:42:03

(I'm not sure how you end up with the delimiter in your display output either.)

The only thing I can think of is that the file you are using does not contain records that are exactly 50 characters long.
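
If the records really are newline-terminated text lines, one thing that might be worth trying is a line-oriented file organization. The sketch below assumes a compiler that supports the LINE SEQUENTIAL extension (Micro Focus and GnuCOBOL do; the vendor has not been stated in this thread, so this is an assumption). The file name "ophalen.csv", the program name and the unnamed switch layout are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVREAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * LINE SEQUENTIAL is a vendor extension, not standard COBOL.
      * "ophalen.csv" is a hypothetical file name.
           SELECT OphalenBestand ASSIGN TO "ophalen.csv"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  OphalenBestand.
       01  ophalenRegel        PIC X(50).
       WORKING-STORAGE SECTION.
      * End-of-file switch kept outside the record area.
       01  EofSwitch           PIC X VALUE "N".
           88  EndOfCsvFile    VALUE "Y".
       PROCEDURE DIVISION.
           OPEN INPUT OphalenBestand
           PERFORM UNTIL EndOfCsvFile
               READ OphalenBestand
                   AT END
                       SET EndOfCsvFile TO TRUE
                   NOT AT END
      * Brackets make padding or stray characters visible.
                       DISPLAY "[" ophalenRegel "]"
               END-READ
           END-PERFORM
           CLOSE OphalenBestand
           STOP RUN.
```

With this organization the runtime uses the line terminator as the record boundary, so the line-end characters are not counted as part of the 50-character record.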
 
A technical point: the described input file is not a .csv file; the fields are separated by semicolons, not commas.
 
One other thing you should be aware of: on IBM mainframes, you cannot validly reference the record area under the FD after AT END has been reached. You are setting an 88 condition on that record when AT END is reached, and you are also testing it in your PERFORM UNTIL. The IBM manual says the results are unpredictable. You may want to check whether that is valid on your system.
 
coboldeke is quite right. At one place I worked, my boss wrote a small program which referenced the FD after the EOF. I told him the code was unnecessary and perilous. He said, "It works!" But after a change to the OS, it didn't.
 
Thanks again for responding.

@PHV:
I tried your code, but unfortunately, the same result :(

@Webrabbit:
You're right, but I'm using Microsoft Excel 2003 to make the .csv file, and it always uses ";" instead of "," to separate fields. But actually, that doesn't bother me.

@Coboldeke:
How can I check whether it's valid on my system? And what should I do if it's not?

Btw, here's my FD:

FD OphalenBestand
record contains 50 characters.
01 OphalenRec.
88 EOF VALUE HIGH-VALUES.
03 ophalenRegel pic x(50).


Greetz Daan
 
You need to check the reference manual for the READ statement to see whether it is valid. Otherwise, just set up a switch in Working-Storage.
 
try this.
Code:
FD OphalenBestand.
01 CsvRecord PIC X(200).

WORKING-STORAGE SECTION.
*> End-of-file switch kept outside the record area
01 PIC X.
   88 EndOfCsvFile   VALUE "Y".
   88 EndOfCsvFile-N VALUE "N".

OPEN INPUT OphalenBestand.
SET EndOfCsvFile-N TO TRUE.

*> Priming read
READ OphalenBestand
    AT END
        SET EndOfCsvFile TO TRUE
END-READ
IF NOT EndOfCsvFile
    DISPLAY "CSV-REC= " CsvRecord "=="
END-IF

PERFORM UNTIL EndOfCsvFile

    UNSTRING CsvRecord DELIMITED BY ";"
        INTO o-medewerkerId, o-type, o-typeId, o-typeStart,
             o-typeEind
    END-UNSTRING

    DISPLAY o-medewerkerId o-type o-typeId o-typeStart
            o-typeEind

    READ OphalenBestand
        AT END SET EndOfCsvFile TO TRUE
    END-READ
    IF NOT EndOfCsvFile
        DISPLAY "CSV-REC= " CsvRecord "=="
    END-IF

END-PERFORM
CLOSE OphalenBestand

The displays above are for debug only.

Also, please state which COBOL vendor and version you are using to run the program and, if Micro Focus, which switches you are using.
Please also show us your SELECT statement.

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
And what about this ?
Code:
OPEN INPUT OphalenBestand
PERFORM UNTIL EOF
    READ OphalenBestand AT END
        SET EOF TO TRUE
    END-READ
    IF EndOfCsvFile OR EOF
        EXIT PERFORM
    END-IF
*> your UNSTRING stuff here
END-PERFORM
CLOSE OphalenBestand

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Hi Greetz,

One other point for you to worry about (the following assumes a fairly recent IBM COBOL Compiler):

As I recall, CSV files can have records of variable length. Yours may not because of the business aspects of the data. But others may want to keep this in mind.

The way I remember processing the recs was to use:

RECORD VARYING FROM 1 TO nnnnn DEPENDING ON WS-LEN in the FD and define the rec 01 IN-REC. Then followed by an 05 IN-DATA PIC X(001) OCCURS nnnnn TIMES DEPENDING ON WS-LEN. Then define a PIC 9(005) WS-LEN field in WS.

The DD stmt must contain BLKSIZE=nnnnn (no LRECL) RECFM=U where nnnnn is the MAX blksize expected.

Regards, Jack.

"A problem well stated is a problem half solved" -- Charles F. Kettering
 
I personally would code it as shown below. It removes the need for the pre-read and places all the necessary code into one tight routine, with a single READ and a single EXIT. If additional code is needed, it is easy to add and perform the necessary paragraphs depending on whether an EOF or a valid record was found.

Code:
FD OphalenBestand.
01 CsvRecord PIC X(200).

WORKING-STORAGE SECTION.
*> End-of-file switch kept outside the record area
01 PIC X.
   88 EndOfCsvFile   VALUE "Y".
   88 EndOfCsvFile-N VALUE "N".

OPEN INPUT OphalenBestand.
SET EndOfCsvFile-N TO TRUE.
PERFORM UNTIL EndOfCsvFile
    READ OphalenBestand
        AT END
            SET EndOfCsvFile TO TRUE
        NOT AT END
            DISPLAY "CSV-REC= " CsvRecord "=="
            UNSTRING CsvRecord DELIMITED BY ";"
                INTO o-medewerkerId,
                     o-type,
                     o-typeId,
                     o-typeStart,
                     o-typeEind
            END-UNSTRING
            DISPLAY o-medewerkerId o-type o-typeId
                    o-typeStart o-typeEind
    END-READ
END-PERFORM.

CLOSE OphalenBestand.

The displays above are for debug only.
 