Create a CSV Record

Status
Not open for further replies.

BrianTyler

IS-IT--Management
Jan 29, 2003
232
GB
I have been using Cobol for over 35 years, and I am still amazed at the short-sightedness of the language developers. (My biggest gripe has always been the handling of dates - by now there should be a field type for dates, and date arithmetic.)

My current problem is the creation of a CSV style of record from a number of data items, many of which have trailing spaces. Obviously, I could STRING all of the items together into a character array, and then have a clever indexing routine to shuffle up the data, removing spaces (I'll live with leading zeros).

Has anyone a good solution to this problem?

I am using Cobol-85 on AIX.

Brian
 
Depending on the vendor, you will have the following options.

1- Use intrinsic functions to determine the size of each string before stringing them into the final record.
2- Use a perform varying

After getting the string size then do

string mystring(1:size) delimited by size
into my_out_rec
pointer my_pointer

Obviously more needs to be done to cater for empty strings, but there isn't an out-of-the-box solution with COBOL.
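For what it's worth, here is a minimal sketch of option 2 combined with the STRING ... POINTER idea above. The program name, field names and sizes are all made up for illustration, and it assumes every source field fits in a 40-byte work area. It finds the last non-space character by scanning backwards, and skips the scan entirely for an empty field so the reference modification never starts at zero.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVTRIM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample fields; the middle one is empty.
       01  WS-NAME        PIC X(20) VALUE "SCOTT WRIGHT".
       01  WS-PHONE       PIC X(12) VALUE SPACES.
       01  WS-STREET      PIC X(30) VALUE "321 WALNUT STREET".
       01  WS-WORK        PIC X(40).
       01  WS-LEN         PIC 9(4) COMP.
       01  WS-OUT-REC     PIC X(120) VALUE SPACES.
       01  WS-PTR         PIC 9(4) COMP VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE WS-NAME TO WS-WORK
           PERFORM APPEND-FIELD
           PERFORM APPEND-COMMA
           MOVE WS-PHONE TO WS-WORK
           PERFORM APPEND-FIELD
           PERFORM APPEND-COMMA
           MOVE WS-STREET TO WS-WORK
           PERFORM APPEND-FIELD
      * WS-PTR is one past the last character written.
           DISPLAY WS-OUT-REC (1:WS-PTR - 1)
           STOP RUN.
       APPEND-FIELD.
      * An all-space field contributes nothing to the record.
           IF WS-WORK NOT = SPACES
               PERFORM VARYING WS-LEN FROM 40 BY -1
                 UNTIL WS-WORK (WS-LEN:1) NOT = SPACE
                   CONTINUE
               END-PERFORM
               STRING WS-WORK (1:WS-LEN) DELIMITED BY SIZE
                   INTO WS-OUT-REC
                   WITH POINTER WS-PTR
               END-STRING
           END-IF.
       APPEND-COMMA.
           STRING "," DELIMITED BY SIZE
               INTO WS-OUT-REC
               WITH POINTER WS-PTR
           END-STRING.
```

With the sample values this should display SCOTT WRIGHT,,321 WALNUT STREET, with the embedded single spaces left alone. There is no ON OVERFLOW handling here, so a real version would want that added.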

Regards

Frederico Fonseca
SysSoft Integrated Ltd
 
This really isn't a problem that's specific to COBOL; it has to be dealt with in any language (it comes down to fixed-length text values versus dynamic text values).

Anyway, here's the basic idea:
Code:
 IDENTIFICATION DIVISION.
 PROGRAM-ID. TESTDATA.
 AUTHOR. GLENN9999ATTEK-TIPSDOTCOM.

 DATA DIVISION.
 WORKING-STORAGE SECTION.
 01  INPUT-DATA1      PIC X(20) VALUE "SCOTT WRIGHT".
 01  INPUT-DATA2      PIC X(20) VALUE "321 WALNUT STREET".
 01  OUTPUT-DATA      PIC X(90).
 01  PROCESSING-VARIABLES.
     05  PROC-NDX     PIC S9(4) COMP-5.
     05  CHAR-CNT     PIC S9(4) COMP-5.

 PROCEDURE DIVISION.
 0000-MAIN SECTION.
     MOVE SPACES TO OUTPUT-DATA.
     MOVE 1 TO CHAR-CNT.
*
     MOVE 20 TO PROC-NDX.
     PERFORM UNTIL INPUT-DATA1 (PROC-NDX:1) NOT = SPACES
        SUBTRACT 1 FROM PROC-NDX
     END-PERFORM.
     MOVE INPUT-DATA1 (1:PROC-NDX) TO OUTPUT-DATA (CHAR-CNT:).
     ADD PROC-NDX TO CHAR-CNT.
     MOVE ',' TO OUTPUT-DATA (CHAR-CNT:).
     ADD 1 TO CHAR-CNT.
*
     MOVE 20 TO PROC-NDX.
     PERFORM UNTIL INPUT-DATA2 (PROC-NDX:1) NOT = SPACES
        SUBTRACT 1 FROM PROC-NDX
     END-PERFORM.
     MOVE INPUT-DATA2 (1:PROC-NDX) TO OUTPUT-DATA (CHAR-CNT:).
     ADD PROC-NDX TO CHAR-CNT.
*
     DISPLAY OUTPUT-DATA.
     GOBACK.

I'm sure you can see a pattern emerging that could be factored into some subroutines (which is how most languages solve this problem) if you plan on doing a lot of this stuff.
 
Thanks for your input.

It just shows how out-of-date Cobol has become. I remember coming across the problem in 1967 when trying to write to paper-tape. It is a little easier now due to being able to access characters within a field, but I still feel that the Cobol designers have been negligent.

Many times, Cobol has to create structures to be exported to routines in string-based languages (C, VB etc) and to terminals / comms, yet the tools are not provided.

I accept that it is not difficult to code the routines, but it should be totally unnecessary by now.

Sorry, I've started ranting again, I'd better take a tablet.

brian
 
Not entirely unnecessary...the only difference between COBOL and other languages is that the other languages provide this functionality as a prewritten subprogram.

In fact, I thought about it and it shouldn't be difficult to create a subprogram to concatenate two text items with a comma or whatever your CSV separator is. If you want more than just two items, keep passing your CSV line to the subroutine with new values.
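Something along these lines is what I have in mind. Treat it as a rough sketch rather than tested production code: the program name CSVADD, the parameter names and the sizes (a 200-byte line, a 40-byte field) are all assumptions, and the caller has to pass items of exactly those sizes. Each call appends the trimmed field followed by a comma and advances the pointer.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVADD.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LEN          PIC 9(4) COMP.
       LINKAGE SECTION.
      * Sizes are assumptions; caller and subprogram must agree.
       01  LK-CSV-LINE     PIC X(200).
       01  LK-CSV-PTR      PIC 9(4) COMP.
       01  LK-FIELD        PIC X(40).
       PROCEDURE DIVISION USING LK-CSV-LINE LK-CSV-PTR LK-FIELD.
       MAIN-PARA.
           IF LK-FIELD NOT = SPACES
      * Scan back to the last non-space character.
               PERFORM VARYING WS-LEN FROM 40 BY -1
                 UNTIL LK-FIELD (WS-LEN:1) NOT = SPACE
                   CONTINUE
               END-PERFORM
               STRING LK-FIELD (1:WS-LEN) DELIMITED BY SIZE
                      ","                 DELIMITED BY SIZE
                   INTO LK-CSV-LINE
                   WITH POINTER LK-CSV-PTR
               END-STRING
           ELSE
      * Empty field: emit only the separator.
               STRING "," DELIMITED BY SIZE
                   INTO LK-CSV-LINE
                   WITH POINTER LK-CSV-PTR
               END-STRING
           END-IF
           GOBACK.
```

The caller would move SPACES to its line and 1 to its pointer, move each value into a PIC X(40) work field, and CALL "CSVADD" USING the line, the pointer and the work field once per value. After the last call the pointer is two past the final data character, so the trailing comma can be dropped by writing the line from position 1 for a length of pointer minus 2.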

Not out of date, just a little more bare bones. But it still can do the job.
 
Brian,

I hope you have taken the tablet, and that it has had the desired effect. :-D

I cannot help but be amused that you are (to use your word) "ranting" about the inability easily to create, using antiquated COBOL, a CSV file! Again... :-D

If you are using RM/COBOL or Micro Focus COBOL on your AIX box, then you could use Relativity to provide SQL access to your COBOL data so that VB, et cetera, may use the data directly without using an antiquated CSV file.

Better yet, why not use the modern approach to data exchange: XML?

Tom Morrison
 
A) CSV files "tend" to be limited to "workstation" environments (Windows, Unix, Linux, etc) - which was certainly NOT where "COBOL" was originally targeted.

B) "CSV" is certainly NOT "state of the art". XML files (input and output) are supported by a number of COBOL compilers (as an extension) today. There is a TR (Technical Report) currently out for review for adding this facility to ISO conforming COBOL.

C) The next (after 2002) COBOL Standard (currently scheduled for 2009) has proposals for "variable" length fields and group items. (Unlike current Occurs Depending On - which actually has a "fixed" amount of storage allocated).

***

In general, when talking about "how out of date" COBOL is, make certain that you are aware of what is actually in the 2002 COBOL Standard *and* what is currently under development for the next Standard.

THEN check with your "vendor of choice" on
- their implementation of the '02 Standard
- their involvement in the development of the next COBOL Standard.

For an overview of current Standards work, see:

For papers currently under development and/or review see:

P.S. IBM introduced "date" type data types in 1996 (or so - before Y2K) and these have been VERY little used.

Check out

and related pages for how these "date" features were implemented.


Bill Klein
 
Code:
       IDENTIFICATION DIVISION.
       PROGRAM-ID.    CSV.
       AUTHOR.        CLIVE CUMMINS.
       INSTALLATION.  TUBULARITY.
       DATE-WRITTEN.  MAR 23, 2005.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WORK-AREAS.
         05  A-SUB        PIC 9(4) COMP.
         05  B-SUB        PIC 9(4) COMP VALUE 1.
         05  A-STRING.
           10  FILLER     PIC X(02) VALUE '"'.
           10  FILLER     PIC X(10) VALUE 'WILL'.
           10  FILLER     PIC X(03) VALUE '","'.
           10  FILLER     PIC X(10) VALUE 'THIS'.
           10  FILLER     PIC X(03) VALUE '","'.
           10  FILLER     PIC X(10) VALUE 'DO'.
           10  FILLER     PIC X(02) VALUE '",'.
         05  FILLER REDEFINES A-STRING.
           10  A-CHAR     PIC X(01) OCCURS 40 TIMES.
         05  B-STRING.
           10  B-CHAR     PIC X(01) OCCURS 40 TIMES.
       PROCEDURE DIVISION.
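      * Clear the target so DISPLAY shows no leftover storage.
           MOVE SPACES TO B-STRING.
           DISPLAY A-STRING.
      * A-CHAR occurs 40 times, so stop after subscript 40.
           PERFORM VARYING A-SUB FROM 1 BY 1
             UNTIL A-SUB GREATER THAN 40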
             IF A-CHAR (A-SUB) NOT EQUAL SPACE
                MOVE A-CHAR (A-SUB) TO B-CHAR (B-SUB)
                ADD +1 TO B-SUB
             END-IF
           END-PERFORM.
           DISPLAY B-STRING.
           GOBACK.

Clive
 
Brian Tyler said:
Obviously, I could STRING all of the items together into a character array, and then have a clever indexing routine to shuffle up the data, removing spaces (I'll live with leading zeros).

You don't need any "clever indexing routine"! STRING can do the whole job for you! Or did I miss something?

Code:
01 WS-FIELDS.
   03 WS-FIELD-DATA.
      05 FILLER PIC X(40) VALUE "Will".
      05 FILLER PIC X(40) VALUE "this".
      05 FILLER PIC X(40) VALUE "do".
      05 FILLER PIC X(40) VALUE "Brian Tyler".
      05 FILLER PIC X(40) VALUE SPACES.
   03 WS-FIELD-TABLE  REDEFINES WS-FIELD-DATA.
      05 WS-FIELD-ENTRY   OCCURS 5 TIMES
                          INDEXED BY X-WS-FE.
         07 WS-FIELD  PIC X(40).

01 WS-FIELD-COUNT     PIC S9(4) COMP
                                VALUE +5.
01 WS-STRING          PIC X(80) VALUE SPACES.
01 WS-STRING-LENGTH   PIC S9(4) COMP
                                VALUE ZERO.

MOVE 1 TO WS-STRING-LENGTH
PERFORM
    VARYING X-WS-FE FROM 1 BY 1
    UNTIL   X-WS-FE > WS-FIELD-COUNT
    OR      WS-STRING-LENGTH
                    >= LENGTH OF WS-STRING
    OR      WS-FIELD (X-WS-FE) = SPACES
       STRING
           QUOTE      DELIMITED BY SIZE
           WS-FIELD (X-WS-FE)
                      DELIMITED BY "  "
           QUOTE      DELIMITED BY SIZE
           ","        DELIMITED BY SIZE
                INTO WS-STRING
                POINTER WS-STRING-LENGTH
       END-STRING
END-PERFORM
SUBTRACT 1 FROM WS-STRING-LENGTH
DISPLAY WS-STRING (1:WS-STRING-LENGTH)

STRING and UNSTRING are my best friends!

Code what you mean,
and mean what you code!
But by all means post your code!

Razalas
 
Yes, if I change your 4th filler line to PIC X(12), I break your code, razalas....
 
Razalas,

Yes, this is the basis for a solution, but it does not cater for long fields (though the extra spaces in these cases are not going to change the total file size very much), and unless the data is held in an array of same-length fields, subscripting/indexing cannot be used.

The other problems are that there could be an embedded string of three or four spaces within a field, and numeric fields are not catered for.

The 'old' solution was to use EXAMINE WS-FIELD (X-WS-FE) REPLACING TRAILING SPACES BY LOW-VALUES and then to use LOW-VALUES as the delimiter, but we have lost this verb.


The other problem is Packed or Binary fields, which must be converted before the STRING command.
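A sketch of the sort of conversion I mean, with invented names and an assumed picture of S9(5)V99: the packed value is moved to a numeric-edited field with a floating minus sign, the leading spaces are counted with INSPECT, and only the significant part is passed to STRING.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVNUM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical packed amount to be exported.
       01  WS-AMOUNT      PIC S9(5)V99 PACKED-DECIMAL VALUE -123.45.
      * Floating minus sign; the 9 forces at least one digit.
       01  WS-EDIT        PIC -(5)9.99.
       01  WS-LEAD        PIC 9(4) COMP.
       01  WS-OUT-REC     PIC X(80) VALUE SPACES.
       01  WS-PTR         PIC 9(4) COMP VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * De-edit to display form, then count the leading spaces.
           MOVE WS-AMOUNT TO WS-EDIT
           MOVE 0 TO WS-LEAD
           INSPECT WS-EDIT TALLYING WS-LEAD FOR LEADING SPACE
      * Start just past the leading spaces.
           STRING "AMOUNT,"              DELIMITED BY SIZE
                  WS-EDIT (WS-LEAD + 1:) DELIMITED BY SIZE
               INTO WS-OUT-REC
               WITH POINTER WS-PTR
           END-STRING
           DISPLAY WS-OUT-REC (1:WS-PTR - 1)
           STOP RUN.
```

With the sample value this should display AMOUNT,-123.45. It works, but it is three statements per numeric field where one ought to do.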

My general point is still true - in Cobol there is too much code required to do these 'techy' things when the developer's time should be spent solving the business problem.

If a version of the STRING command allowed a 'DELIMITED BY TRAILING char', and automatically converted numeric fields to strings, a lot of code would be saved.

Thanks again for your input.

Brian

 
Glenn9999,

If you "change your 4th filler line to PIC X(12)," the code would not compile cleanly anyway. What's your point?

I used a table just to show that you could process an arbitrary # of fields in a simple manner. There wasn't any description/requirements given for how the fields were being stored.

The heart of the matter is the use of the STRING command, not how the fields were stored. You could use the same concept with individual fields of varying lengths and pictures, rather than fields in a table. In fact, a more robust solution would expand the entries in the table to include additional descriptor info for each field entry (i.e. field type, field length, field delimiter, etc.). I was purposely trying to keep it simple.


Code what you mean,
and mean what you code!
But by all means post your code!

Razalas
 
BrianTyler said:
Many times, cobol has to create structures to be exported to routines in string-based languages (C, VB etc) and to terminals / comms yet the tools are not provided.
emphasis added


Brian, there are tools that can help, as I stated earlier (perhaps it was lost in all that humor!). Rather than continue to opine about what COBOL is lacking (and, if the truth be told, we could be having a flame war about the lack of COBOL data types in these other languages), please consider using a tool more suited to the job.

Tom Morrison
 
CliveC,

I believe you mean fewer than 2 trailing spaces. But in the example I made the fields rather large specifically for this reason. Naturally, this could be a problem if you have extremely long fields and have no way of knowing the upper bound of their length in advance.

A bigger problem is the one identified by Brian himself: embedded spaces. I traditionally delimit alphanumeric strings with 2 spaces. In my experience, this covers 99% of the cases of embedded spaces (i.e. my experience is that most fields typically have at most single embedded spaces). If you don't know how many embedded spaces you will encounter, then you have a different problem that requires a different solution.


Code what you mean,
and mean what you code!
But by all means post your code!

Razalas
 
Brian,

Surely if you are writing output to a CSV file, you are converting your packed and binary fields anyway.

So it really boils down to the lack of a simple way to find the length of a string. I know that topic has been discussed in this forum before. Unfortunately I don't remember any solutions that didn't require multiple verbs/steps. I would have to agree that this is one area in which other languages with built-in string support are better than COBOL. Still, the issue of finding the length of the string need not require an excessive amount of code.
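As an illustration of "not excessive", here is one commonly used idiom, sketched with made-up names. It assumes your compiler supports the intrinsic functions REVERSE and LENGTH, which not every COBOL-85 compiler does. The field is reversed, the leading spaces of the reversed copy are counted, and that count is subtracted from the field size. Embedded runs of spaces do not matter.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRIMLEN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sample value with two embedded spaces after "321".
       01  WS-FIELD       PIC X(40) VALUE "321  WALNUT STREET".
       01  WS-REV         PIC X(40).
       01  WS-TRAIL       PIC 9(4) COMP.
       01  WS-LEN         PIC 9(4) COMP.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Trailing spaces become leading spaces once reversed.
           MOVE FUNCTION REVERSE (WS-FIELD) TO WS-REV
           MOVE 0 TO WS-TRAIL
           INSPECT WS-REV TALLYING WS-TRAIL FOR LEADING SPACE
           COMPUTE WS-LEN = FUNCTION LENGTH (WS-FIELD) - WS-TRAIL
      * A zero length cannot be used in reference modification.
           IF WS-LEN > 0
               DISPLAY "[" WS-FIELD (1:WS-LEN) "]"
           ELSE
               DISPLAY "[]"
           END-IF
           STOP RUN.
```

With the sample value WS-LEN comes out as 18 and the display should be [321  WALNUT STREET]. That is still three statements, which rather proves the point, but it is hardly a large amount of code.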

Frankly, I like your suggestion of the "DELIMITED BY TRAILING char" option for the STRING verb. As has been suggested under other topics in this forum, perhaps if enough of us contact the COBOL standards committee it could be considered.

Code what you mean,
and mean what you code!
But by all means post your code!

Razalas
 
Tom,

Don't get me wrong. I still believe that Cobol is by far the most appropriate language for data processing. Other languages such as C and VB are fine in their own fields - quick subroutines, presentation layers etc.

The most amazing thing about Cobol is the ease of reading someone else's code. It seems that the architecture of a program being based very much on the data structures, and most site standards requiring meaningful field names, encourages discipline in programs.

Although I have written C and VB, I still find it incredibly difficult to read strange code and they are certainly not designed for the reading / writing of files. They are almost 'write-only' languages, and I do not think them as suitable for large scale batch applications as Cobol.

The biggest difficulty Cobol faces is one of image. Too many consultants etc seem to be afraid of promoting Cobol for new projects though it is normally the best tool for the back-end of most applications.

Brian

 
Brian,

Okay...we are agreed on that point. [cheers]

Now...what are you really trying to accomplish? (I presume that the CSV file is a means to an end. Knowing what your goal is might elicit better suggestions.)

Tom Morrison
 
If you "change your 4th filler line to PIC X(12)," the code would not compile cleanly anyway. What's your point?

The point I was trying to make is that your use of two spaces breaks the code if something happens to cause only one space. Perhaps a better example is if I come up with a 39-byte value for your PIC X(40).

Basically, your code presupposes too much. While it might work, you have to be sure that the things you assume will never happen really never will. Of course, we know what ol' Murphy had to say about that one. I don't like 3am calls or being called on the carpet for an inaccurate run, and I like to design code so that I severely limit the chances of those happening.
 
Tom,

My actual requirement is as simple as I said. I need a CSV file to FTP to a customer, who will import it into a statistical package.

I have done the job using string to combine the fields and the result is fine. My only gripe is that I had to do some work, when I would prefer the compiler to do it for me.

I don't like the way I have done it; it is not generalised. I will possibly write a subroutine to do it if there appears to be a future need.

Thanks to everyone.

Perhaps we should close this thread now.

Brian
 