export data from Cobol data files 1

Status
Not open for further replies.

piterb

Programmer
Oct 14, 2008
11
PL
Hello
I have a problem. I need to export data from Cobol data files (dat, inx) to anything else (txt, csv, Excel, Access, whatever). I need to do this using C# on the .NET platform, but I cannot open or read these files. I tried treating them as flat files with a fixed row size and separator, and I tried different data readers such as SiberDataViewer and even plain Notepad, but when I open a file I see only strange characters. Does anyone know of an ODBC driver or data reader that can open and read these files (free, or with a demo or trial, so I can check that it works with my dat/inx files before I buy it)? About the files: I know they contain information about daily incomes/outcomes, guests, rooms, services etc. in a hotel chain. The hotels use a Cobol application called ReHot, which stores data in dat (data) and inx (index) files. The software vendor of course won't help, and the application cannot export data to other files, only print it on a fiscal printer.
I will be very glad for any help/advice/suggestion you can offer me to solve this problem.
Piter

 
some guesses:
BP081005.DAT: lrl=774, key=X(10)=supplier or customer code
KR081005.DAT: lrl=857, key=X(02)=country code
 
What do you mean by 'key'?
Consider it the record ID, e.g. BP081005.DAT is indexed by the first 10 characters of the record.
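
To make the idea of a fixed record length with a leading key concrete, here is a minimal Python sketch. It assumes the guessed record length of 774 bytes and the 10-character key from above. The `header_size` parameter and the demo data are assumptions for illustration: a real indexed .DAT file may carry a file header or per-record control bytes that would have to be skipped first.

```python
def iter_records(data: bytes, record_length: int, header_size: int = 0):
    """Yield consecutive fixed-length records from raw file bytes."""
    body = data[header_size:]  # header_size is an assumption, 0 by default
    for start in range(0, len(body) - record_length + 1, record_length):
        yield body[start:start + record_length]


def record_key(record: bytes, key_length: int) -> str:
    """Return the leading key bytes as text (latin-1 decoding never fails)."""
    return record[:key_length].decode("latin-1").rstrip()


# Synthetic demo data: two 774-byte records padded with zero bytes.
demo = b"PKE SA    ".ljust(774, b"\x00") + b"KPMG AUDYT".ljust(774, b"\x00")
keys = [record_key(r, 10) for r in iter_records(demo, 774)]
print(keys)
```

With a real file you would read the bytes with `open(path, "rb").read()` and try different `header_size` values until the keys line up.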

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
piterb,
COBOL packed (decimal) data types, COMP-3, are BCD-encoded, i.e. every digit is encoded in a nibble (4 bits) and the last nibble holds the sign.
For example (taken from a COBOL book):
Code:
PIC 9(4) VALUE 5678 COMP-3
is stored as: 05 67 8F (Hex)

PIC S9(4) VALUE 5678 COMP-3
is stored as: 05 67 8C (Hex)

PIC S9(4) VALUE -5678 COMP-3
is stored as: 05 67 8D (Hex)

So for example, if I look at the first 2 records of your file Bp081005.dat in HEX-view, I see this (... means continuation)
Code:
1 record
'PKE SA    
...
00 36 00 0F
...

2 record
'KPMG AUDYT
...
00 19 50 0F
...
The 1st record seems to start with a character field containing 'PKE SA' and to have a packed numeric field with the value 36000.
The 2nd record seems to start with a character field containing 'KPMG AUDYT' and to have a packed numeric field with the value 19500.

The problem is that all fields (character and numeric) are merged together into one fixed-length record of 774 bytes.

Also, I don't really know how packed values with a decimal point are represented, i.e. I am not sure whether the above packed field
00 36 00 0F
has a value of 36000 or 360.00.

If you want to decode the packed decimals, you also need to unpack the binary representation into nibbles and then process them.
You can find some inspiration how to do it e.g. in Python here
and how to do it in Perl here
I'm sure you will find something for Java or C# too.
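
As a rough illustration of the nibble unpacking described above, here is a minimal Python sketch. The function name `unpack_comp3` is made up for this example. It treats sign nibbles B and D as negative and any other value from A to F as positive or unsigned, which is the usual convention; your COBOL runtime may differ.

```python
def unpack_comp3(raw: bytes) -> int:
    """Decode a packed-decimal (COMP-3) field into a signed integer."""
    if not raw:
        raise ValueError("empty field")
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)    # high nibble
        nibbles.append(byte & 0x0F)  # low nibble
    sign = nibbles.pop()             # the last nibble carries the sign
    if sign < 0x0A:
        raise ValueError("last nibble is not a sign nibble")
    if any(d > 9 for d in nibbles):
        raise ValueError("digit nibble out of range 0-9")
    value = int("".join(str(d) for d in nibbles))
    return -value if sign in (0x0B, 0x0D) else value


print(unpack_comp3(bytes.fromhex("05678F")))    # 5678
print(unpack_comp3(bytes.fromhex("05678D")))    # -5678
print(unpack_comp3(bytes.fromhex("0036000F")))  # 36000
```

The result is the raw digit string as an integer. Where the decimal point belongs is not stored in the data, so 36000 could still mean 360.00.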
 
BP081005.DAT in fact has 277 records, but either the records don't all have the same columns or 0F is not a column delimiter.
In the lists below, "," stands for a 0F byte:
1record – 259b, 5b, 5b, 5b, 5b, 5b, 453b, 5b, 5b, 5b, 5b, 5b, = 774b
2record – 259b, 5b, 5b, 5b, 5b, 5b, 25b, 5b, 5b, 5b, 5b, 5b, 25b, 5b, 5b, 5b, 5b, 5b, 25b, 5b, 5b, 5b, 5b, 5b, 283b, 5b, 5b, 5b, 5b, 5b, = 744b
3,4,5,6,8,13,18… like 1
7,9,10,11,12,14,16,18,19… like 2
15record – 259b, 5b, 5b, 5b, 5b, 5b, 81b, 5b, 5b, 5b, 5b, 5b, 341b, 5b, 5b, 5b, 5b, 5b, = 774
There are a few more record types, and the different record types appear without any regularity or repeating structure, but all records begin with 259b, 5b, 5b, 5b, 5b, 5b,
OK, I will continue digging and let you know if something comes up.

mikrom
I will check your idea too. Thx
 
You are right: hex F is not a general field separator.
That is because only an unsigned COMP-3 field ends with F.
But the record can contain several kinds of data: character fields, unpacked numeric fields and packed numeric fields.
The only thing I can see in the record is that the first field is a character field of length 10. Then probably come some numeric fields with zero values (these could be unpacked numeric fields), and then come some numeric fields terminated by hex F, which would be the unsigned packed decimals. And all of that is merged together.
E.g. the structure can have a form:
Code:
01 Bp081005-RECORD PIC X(774).
01 Bp081005-RECORD-RED REDEFINES Bp081005-RECORD.
   05 FLD01  PIC X(10).
   05 FLD02  PIC 9(10).
   ...
   05 FLD10  PIC 9(10) COMP-3.
   ...

Here PIC X(10) is a character field, where each character takes 1 byte; PIC 9(10) is a numeric field, where each digit takes 1 byte; and PIC 9(10) COMP-3 is a packed numeric field, where each digit takes 1/2 byte.
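
The storage sizes can be turned into byte offsets with a few lines of Python. This is only a sketch: `field_size` and `layout_offsets` are hypothetical helper names, and the three-field layout leaves out the fields elided by "..." above, so the offsets it prints are not the real ones for Bp081005. A packed field needs one extra nibble for the sign, which is why a 10-digit COMP-3 field takes 6 bytes.

```python
def field_size(kind: str, digits: int) -> int:
    """Storage size in bytes for a PIC X, PIC 9 or PIC 9 COMP-3 field."""
    if kind in ("X", "9"):
        return digits              # one byte per character or digit
    if kind == "COMP-3":
        return digits // 2 + 1     # two digits per byte plus a sign nibble
    raise ValueError(f"unknown field kind: {kind}")


def layout_offsets(fields):
    """Return (name, offset, size) for each field of a record layout."""
    offset = 0
    result = []
    for name, kind, digits in fields:
        size = field_size(kind, digits)
        result.append((name, offset, size))
        offset += size
    return result


# Illustrative layout only: the fields between FLD02 and FLD10 are unknown.
layout = [("FLD01", "X", 10), ("FLD02", "9", 10), ("FLD10", "COMP-3", 10)]
for name, offset, size in layout_offsets(layout):
    print(name, offset, size)
```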

Therefore, IMHO, the only way to decode the fields in such a record exactly is to know the COBOL source structure of the record. The record structures are stored in COBOL source files, the so-called "copybooks".
Try to find something like Bp081005.cbl or Bp081005.cpy


If you don't have the source, the only way left is to experiment.
 
First, break the data into columns based on having each packed decimal field in its own column and the surrounding ascii strings each in its own column.

Compare the results above with reports generated by the software to further split the ascii string columns and to find the decimal point for the packed decimal strings.

If there are any columns which are neither ascii strings nor packed decimal fields, try interpreting them as 1) fixed-point binary of two, four or eight bytes (other byte lengths are possible but less common) or 2) floating-point binary of four or eight bytes.

Ascii strings may contain text data or numeric data. If an ascii numeric string is preceded or followed by a "+" or "-", that is part of the string and is the sign. An ascii numeric string may begin or end with a character which is a digit with a sign "overpunch". This "overpunch" converts a non-zero digit to a letter and a zero to "{" or "}".
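
A sketch of how such signed ascii numeric strings could be decoded in Python follows. The mapping used is the common ASCII overpunch convention ("{" and A-I for positive 0-9, "}" and J-R for negative 0-9). That mapping is an assumption here, so check it against a value you know from a printed report before relying on it. `decode_zoned` is a name invented for this example.

```python
# Assumed ASCII overpunch tables: "{"/A-I positive, "}"/J-R negative.
POSITIVE = {"{": 0, **{chr(ord("A") + i): i + 1 for i in range(9)}}
NEGATIVE = {"}": 0, **{chr(ord("J") + i): i + 1 for i in range(9)}}


def decode_zoned(text: str) -> int:
    """Decode an ascii numeric string with a separate or overpunched sign."""
    text = text.strip()
    sign = 1
    if text[:1] in ("+", "-"):              # leading separate sign
        sign, text = (-1 if text[0] == "-" else 1), text[1:]
    elif text[-1:] in ("+", "-"):           # trailing separate sign
        sign, text = (-1 if text[-1] == "-" else 1), text[:-1]
    elif text:
        for pos in (len(text) - 1, 0):      # overpunch at the end or start
            ch = text[pos]
            if ch in POSITIVE or ch in NEGATIVE:
                sign = 1 if ch in POSITIVE else -1
                digit = POSITIVE[ch] if ch in POSITIVE else NEGATIVE[ch]
                text = text[:pos] + str(digit) + text[pos + 1:]
                break
    if not (text.isascii() and text.isdigit()):
        raise ValueError("not a zoned numeric string")
    return sign * int(text)


print(decode_zoned("0012C"))  # 123
print(decode_zoned("0012L"))  # -123
print(decode_zoned("-150"))   # -150
```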
 
mikrom
I know about copybooks, but unfortunately I don't have any .cbl or .cpy files, only a lot of .cob files. I am wondering whether the structure could be stored in the application source code.

webrabbit
Nice idea, but the problem is that the application generates its reports from several files, not just one, so I cannot compare a single file to a report to guess the file content :/
 
piterb,
OK, .COB is a suffix for COBOL source files too.
The table structure is either hardcoded in the COBOL program which reads the table or, more often, copied into the program from a copybook.

So first search your *.COB files for something like this:
Code:
       INPUT-OUTPUT SECTION.
       ...
       FILE-CONTROL.
       ...
           SELECT BP081005
               ASSIGN TO DATABASE-BP081005
                   ORGANIZATION IS INDEXED
       ...
or, as an alternative to DATABASE-BP081005, it could be DISK-BP081005 etc.

If you find that, then search in the same source for the FD (File Description):
Code:
       DATA DIVISION.
       ...
       FILE SECTION.
       ...
       FD  BP081005.
You will probably find something like this
Code:
       FD  BP081005.
       01  BP081005-REC.
           COPY DDS-ALL-FORMATS OF BP081005.
(note: DDS-ALL-FORMATS is AS/400 specific; in your case it will be just COPY something)

Then search your *.COB files for the copybook, named e.g. BP081005.COB
In that source file you will find the COBOL record structure of the table BP081005.
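
If the .COB files do turn out to be readable source text, the search could be automated along these lines. This is a sketch only: `find_file_clues` is a hypothetical helper, and the sample source is stitched together from the fragments above.

```python
import re


def find_file_clues(source: str, file_name: str):
    """Return (line_number, text) for SELECT/ASSIGN/FD/COPY lines naming the file."""
    pattern = re.compile(
        r"\b(SELECT|ASSIGN|FD|COPY)\b.*" + re.escape(file_name),
        re.IGNORECASE,
    )
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), 1)
        if pattern.search(line)
    ]


sample = "\n".join([
    "       FILE-CONTROL.",
    "           SELECT BP081005",
    "               ASSIGN TO DATABASE-BP081005",
    "                   ORGANIZATION IS INDEXED",
    "       FD  BP081005.",
    "       01  BP081005-REC.",
    "           COPY DDS-ALL-FORMATS OF BP081005.",
])
hits = find_file_clues(sample, "BP081005")
for number, text in hits:
    print(number, text)
```

To scan a whole directory you could loop over `pathlib.Path(folder).glob("*.COB")` and read each file with `errors="replace"`, since some of the files may not be plain text.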
 
For R/M Cobol (Liant), .COB files are the compiled code, not the source code.
 
Ok webrabbit,
I don't use RM COBOL, so I (wrongly) thought that it compiles to EXE like other languages (on Windows).
 
Piterb,
Your problem interested me, so I tried to decode the COBOL binary.
As a tool I used Python, so I posted the program details and the results in this Python thread:

They seem to frequently store sums of money as packed decimals 11 digits long, identified in cbl_data.cfg as P:11. However, the binary doesn't tell us where the decimal point is; only the original COBOL program knows that. But I guess that if these are sums of money, they have 2 decimal places. So if you set the optional argument format_packed=0 in the function parse_cbl_data(), you will get the packed numbers in raw format with 11 digits: 99999999999. The default is format_packed=1, which gives you 999999999.99
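
The effect of assuming 2 decimal places can be sketched as follows. This is not the code behind parse_cbl_data() or format_packed, which is posted in the other thread. `apply_scale` is a hypothetical stand-in that only shows the arithmetic, using Decimal so that the trailing zeros are kept.

```python
from decimal import Decimal


def apply_scale(raw_digits: int, places: int = 2) -> Decimal:
    """Insert an implied decimal point `places` digits from the right."""
    return Decimal(raw_digits).scaleb(-places)


print(apply_scale(36000))        # 360.00
print(apply_scale(99999999999))  # 999999999.99
print(apply_scale(36000, 0))     # 36000
```

If a column is a count rather than a sum of money, pass `places=0` to get the raw digits back unchanged.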

The resulting data seems to look like this (compare it with your reports):
Code:
ABB           0    600      0     500 000 000 000        982.00    300.00    65.00
ENERGO        0    500    500     300 000 000 000        960.00    270.00    39.00
              2    404    803     430 000 000 000        28003.00    2730.00    2760.30
WASA          0    100    300     300 000 000 000        540.00    0.00    39.00
KWB KONIN     0    400    400     400 000 000 000        480.00    0.00    13.00
UNICO         0    500    500     500 000 000 000        1660.00    420.00    195.20
ELEKTROWN     0    500    700     400 000 000 000        930.00    0.00    52.00
INST.CHEM.    0    100    100     100 000 000 000        160.00    0.00    0.00
GMINA         0    401    300     880 000 000 000        120.00    0.00    26.00
BIOENERGIA    0    500    500     500 000 000 000        900.00    0.00    1434.90
DPK           0    700    900     390 000 000 000        750.00    270.00    26.00
ECOLAB        0    100    100     100 000 000 000        180.00    0.00    13.00
PRUF          0    800    200                   -        2895.00    750.00    0.00
ASKOM         0    300    300     300 000 000 000        380.00    30.00    26.00
FOSTER        0      0      0                   -        0.00    0.00    39.00
KLIMATECH     0      0      0                   -        0.00    0.00    0.00
STAL-KRAFT    0    100    100     100 000 000 000        180.00    0.00    13.00
But I left some zoned numbers in big chunks, because I cannot interpret them. This will need further effort from you: manually adjusting the config file cbl_data.cfg according to the printed reports from the original COBOL system.

It's only a first shot - maybe I made a mistake somewhere...
 
Thank you all for the help. I stopped working on this problem because of a deadline: we had to make a deal with the provider, and they gave us a piece of software which converts these dat files into txt reports, which we then process.
But when I finish this I will probably come back here :)
Thank you once again
 
I haven't looked at the data file as yet, but you might try using "Record Edit" which is a data file editor that can accept COBOL records and data types. You can also build a record description within the application which will give you a way to browse and export the file.
 