
COBOL and .NET Data Types

Status
Not open for further replies.

montons

Programmer
May 19, 2003
2
IT
Dear Sir,
We have a procedure that uses 127 programs written in COBOL that run on the HOST,
and we are using the same sources on PC-Windows2000 with some minor modifications.

We have a problem when the data is passed from PIC X -> System.String and
from System.String -> PIC X. We can't change PIC X to PIC N(ational) that
manages Unicode. The modify

To reproduce the problem we have written this small sample.
The test program reads a plain single-byte text file (ANSI, not Unicode) called
CIRILIN.txt and writes the data it reads to the output file.

The source follows:

IDENTIFICATION DIVISION.
PROGRAM-ID. Program1 AS "CirilConsole.Program1".
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
DECIMAL-POINT IS COMMA.

REPOSITORY.
CLASS SYS-STRING AS "System.String".

INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT CIRILIN ASSIGN TO 'c:\CirilConsole\CIRILIN.txt'.
SELECT CIRILOUT ASSIGN TO 'c:\CirilConsole\CIRILOUT.txt'.

DATA DIVISION.
FILE SECTION.
FD CIRILIN LABEL RECORD IS STANDARD
BLOCK CONTAINS 1 RECORDS
DATA RECORD IS REC-CIRILIN.
01 REC-CIRILIN PIC X(80).

FD CIRILOUT LABEL RECORD IS STANDARD
BLOCK CONTAINS 1 RECORDS
DATA RECORD IS REC-CIRILOUT.
01 REC-CIRILOUT PIC X(80).

WORKING-STORAGE SECTION.
01 AREA-STRING OBJECT REFERENCE SYS-STRING.
01 WRK-FINE-PGM PIC X.

PROCEDURE DIVISION.

OPEN INPUT CIRILIN.
OPEN OUTPUT CIRILOUT.

READ CIRILIN.

DISPLAY 'READ CIRILLICO ' REC-CIRILIN.

* -----------------> begin: THE CONVERSION GOES WRONG HERE <-----------------
SET AREA-STRING TO REC-CIRILIN.
* -----------------> end: THE CONVERSION GOES WRONG HERE <-------------------

SET REC-CIRILOUT TO AREA-STRING.

WRITE REC-CIRILOUT.

DISPLAY 'WRITE CIRILLICO ' REC-CIRILOUT.

CLOSE CIRILIN
CIRILOUT.

DISPLAY 'CLOSE > '.

ACCEPT WRK-FINE-PGM.

STOP RUN.


At this point the program writes the wrong data to the output file.

CIRILIN.txt, the input file:
[îààÿîþþúþéýéúéýó3îîæîæ5467îýÿüàÿüúàÀóÿÜß122344ßÎÆÆÎÀÃÆ]


CIRILOUT.txt, the output file:
[35467122344 ]

How can we convert System.String to PIC X correctly?
Do we have to go through binary data?
Is there a solution?

Thanks in advance.
 
Attached is a sample routine that converts the values. From it you can
work out how to convert back to ASCII as well. Here is also a response
from the tech who wrote the utility:


Our .NET product represents all strings as Unicode (UTF-8) encoding, and
only provides limited support for ACP (ANSI Code Page) characters in
terms of reading and writing files. In order to use ACP encoded
characters in your source file, the PIC X fields must be large enough to
accommodate the UTF-8 encoded version of the string. We can provide
functions to help read in the ACP encoded data, and convert it to its
UTF-8 encoded counterpart (e.g. for displaying purposes), as well as
functions to convert the UTF-8 encoded string back to ACP encoding for
writing to the file.
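
As an illustration of why this matters, the symptom in the original post (only the digits reach CIRILOUT.txt) is what one would expect if single-byte ANSI data were interpreted as UTF-8 and the invalid bytes discarded. The sketch below reproduces that effect in Python, used here only because it is compact. It is not the Fujitsu runtime: the code page (1252) and the discard-on-error behaviour are both assumptions made for the demonstration.

```python
# The input line from the question, as it displays under code page 1252
# (assumption: the real file may use a different ANSI code page).
SAMPLE = "îààÿîþþúþéýéúéýó3îîæîæ5467îýÿüàÿüúàÀóÿÜß122344ßÎÆÆÎÀÃÆ"


def read_as_utf8_dropping_invalid(acp_bytes: bytes) -> str:
    """Interpret single-byte ANSI data as UTF-8, discarding invalid bytes."""
    return acp_bytes.decode("utf-8", errors="ignore")


# One byte per character, which is how the ANSI file is stored on disk.
raw = SAMPLE.encode("cp1252")

# Every byte above 0x7F in this sample is an invalid UTF-8 sequence on its
# own, so only the ASCII digits survive.
survivors = read_as_utf8_dropping_invalid(raw)
print(survivors)
```

The surviving characters match the digits shown in the posted output file, which is consistent with (though not proof of) the runtime treating the PIC X contents as UTF-8.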

Note, these functions are not intrinsic, and are contained in a separate
assembly that you would reference in the project needing this
functionality.

The .Net framework is centered on using UTF-8 encoding, because it is
highly portable (as opposed to ACP encoding). We are working on
solutions to facilitate handling of ACP encoded data better. Note that
with the solution I have described above, you will need to ensure that
the PIC X fields have sufficient room for the extra bytes needed by some
ACP encoded characters to be represented by their UTF-8 counterparts.
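
To make the "sufficient room" point concrete, here is a small Python sketch (illustrative only, not part of the Fujitsu tooling) that measures how many bytes a piece of ANSI text needs once it is re-encoded as UTF-8. The helper name utf8_size and the sample strings are made up for the example.

```python
def utf8_size(acp_bytes: bytes, codepage: str) -> int:
    """Return the number of bytes the same text occupies as UTF-8."""
    return len(acp_bytes.decode(codepage).encode("utf-8"))


# Six Cyrillic letters: one byte each in code page 1251, two each in UTF-8.
cyrillic = "Привет".encode("cp1251")
print(len(cyrillic), utf8_size(cyrillic, "cp1251"))

# The euro sign: one byte (0x80) in code page 1252, three bytes in UTF-8.
euro = "€".encode("cp1252")
print(len(euro), utf8_size(euro, "cp1252"))

# Plain ASCII text does not grow at all.
ascii_text = b"122344"
print(len(ascii_text), utf8_size(ascii_text, "cp1252"))
```

Doubling the PIC X size covers letters such as Cyrillic or accented Latin, but a few code-page characters (the euro sign, typographic quotes) need three bytes, so a field full of those would need more than twice the space.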

The best solution to the problem is to convert the data to UTF-8
encoding, and change the PIC X fields to PIC N. This will eliminate the
need for special routines to be used to read the data, and will ensure
that all .NET framework calls using string values will behave as
expected. Regardless of the solution used, it will require code change
to accommodate either method.

Thanks to Rick Malek a Systems Engineer in Fujitsu Software (COBOL).

/*
* Copyright 2003 Fujitsu Software Corporation.
* Author: Kelly Hollis
* Last Modified: 5/19/03
*/
using System;
using Fujitsu.COBOL;
using Fujitsu.COBOL.Runtime.OmeLib;
using System.Text;
using System.Globalization;

namespace CobolCodepageConverter
{
/// <summary>
/// Encoder class providing a method for converting an ANSI Code Page string
/// to its UTF-8 encoding. It performs all the operations in place, in the
/// COBOLData object that is passed in via a pointer. It is important to make
/// sure that the PIC X string that is used has enough space for the resulting
/// string. As a rule, use at least 2X the size of the original PIC X: ACP
/// strings are stored one byte per character, whereas UTF-8 needs two bytes
/// (for some characters, three) for each non-ASCII character.
/// </summary>
public class Encoder
{

/// <summary>
/// Encodes a COBOL PIC X field containing ANSI Code Page characters to
/// a UTF-8 encoding.
/// </summary>
/// <param name="CobolDataPointer">A COBOL pointer to the PIC X field to convert</param>
/// <param name="ACPLength">The length of the PIC X field that CobolDataPointer is pointing at</param>
public static void EncodeAcpToUtf8( int CobolDataPointer, int ACPLength ) {
// Resolve the COBOL pointer to the managed buffer behind the PIC X field.
COBOLData src_data = hPointerManager.hPointerToCOBOLData((uint)CobolDataPointer);

// The source encoding is the ANSI code page of the current culture.
Encoding ACP = Encoding.GetEncoding( CultureInfo.CurrentCulture.TextInfo.ANSICodePage );
Encoding UTF8 = Encoding.UTF8;

// Convert the field contents, then copy the result back in place. Only
// ACPLength bytes are copied, so a longer result is truncated.
byte [] UTF8Bytes = Encoding.Convert(ACP, UTF8, src_data.Block, src_data.Offset, ACPLength);
Array.Copy(UTF8Bytes, 0, src_data.Block, src_data.Offset, ACPLength);
}

}
}
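
Because the routine above copies back a fixed number of bytes, a result that does not fit could in principle be cut in the middle of a multi-byte character. The following Python sketch shows one way to fit converted text into a fixed-length field without splitting a character. It is an illustration of the idea rather than a port of the C# code; the function name, the default code page (1251) and the space padding are assumptions.

```python
def encode_acp_to_utf8_field(acp_bytes: bytes, field_size: int,
                             codepage: str = "cp1251") -> bytes:
    """Re-encode ANSI text as UTF-8 and fit it into a fixed-size field.

    Trailing spaces of the source record are dropped, whole characters are
    added while they still fit, and the remainder is padded with spaces.
    """
    text = acp_bytes.decode(codepage).rstrip(" ")
    out = bytearray()
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(out) + len(encoded) > field_size:
            break  # stop before a character that would not fit completely
        out += encoded
    return bytes(out).ljust(field_size, b" ")


# A 10-byte ANSI record moved into a 20-byte field, as in a PIC X(10)
# record copied to a PIC X(20) work area.
record = "Привет".encode("cp1251").ljust(10, b" ")
field = encode_acp_to_utf8_field(record, 20)
print(len(field), field.decode("utf-8").rstrip(" "))

# With too small a field, only whole characters are kept.
short = encode_acp_to_utf8_field(record, 5)
print(short)
```

Truncating at a character boundary keeps the field valid UTF-8 even when the text does not fit; whether silent truncation is acceptable at all depends on the application.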



IDENTIFICATION DIVISION.
PROGRAM-ID. Program1 AS "TestApplication.Program1".
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
REPOSITORY.
CLASS ENCODER AS "CobolCodepageConverter.Encoder"
.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT STR-FILE ASSIGN TO "Input.txt"
ORGANIZATION IS LINE SEQUENTIAL.
DATA DIVISION.
FILE SECTION.
FD STR-FILE
RECORD CONTAINS 10 CHARACTERS.
01 FILE-RECORD-STRING PIC X(10).
WORKING-STORAGE SECTION.
01 TEMP-STRING PIC X(20).
01 STRING-LENGTH PIC S9(9) comp-5 value 20.
01 MEMORY-POINTER USAGE POINTER.
01 INT-POINTER REDEFINES MEMORY-POINTER PIC S9(9) comp-5.
PROCEDURE DIVISION.
* First, read the line of text from the input file.
* When read, FILE-RECORD-STRING will contain an ANSI
* Code Page string
OPEN INPUT STR-FILE
READ STR-FILE
CLOSE STR-FILE.
* Move the read string into a larger buffer (so it can be converted
* to a UTF8 encoded representation).
MOVE FILE-RECORD-STRING TO TEMP-STRING
SET MEMORY-POINTER TO ADDRESS OF TEMP-STRING
* Pass the integer containing the address of the temporary string
* to the EncodeAcpToUtf8 method, along with the length of the temporary
* string.
INVOKE ENCODER "EncodeAcpToUtf8" USING INT-POINTER STRING-LENGTH
* The displayed result should be the UTF8 encoded string.
DISPLAY TEMP-STRING.

END PROGRAM Program1.
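
The sample only covers the ACP to UTF-8 direction. For writing data back to an ANSI file, the reverse step could look like the Python sketch below. This is a guess at the shape of such a routine, not the Fujitsu utility: the function name, the default code page (1251) and the decision to raise an error instead of truncating are all assumptions.

```python
def decode_utf8_field_to_acp(utf8_field: bytes, record_size: int,
                             codepage: str = "cp1251") -> bytes:
    """Convert a space-padded UTF-8 field back to a fixed-size ANSI record."""
    text = utf8_field.decode("utf-8").rstrip(" ")
    # Raises UnicodeEncodeError if a character does not exist in the code page.
    acp = text.encode(codepage)
    if len(acp) > record_size:
        raise ValueError("text does not fit in the ANSI record")
    return acp.ljust(record_size, b" ")


# A 20-byte UTF-8 work field converted back to a 10-byte ANSI record.
work_field = "Привет".encode("utf-8").ljust(20, b" ")
record_out = decode_utf8_field_to_acp(work_field, 10)
print(len(record_out), record_out)
```

Each Cyrillic letter shrinks from two bytes back to one, so the 12 bytes of text in the work field fit the 10-byte record again with room for padding.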
 