Uncompressing data

Status
Not open for further replies.

actuary

Technical User
Mar 18, 2003
3
US
I have a text file (something.txt) that was created from our mainframe with the following copybook:

var1-A 9(4)
var2-B X(02)
var3-C S9(7)V9(2) COMP-3
var4-C S9(11)V9(2) COMP-3
var5-C S9(13)V9(2) COMP-3

I know that VAR1-A is a 4-digit number with no decimals.
VAR2-B is a two-character string.
Question 1) What are the remaining variable types?
Question 2) How do I know how many bytes they take up on the record? That is, their starting and ending positions (field width).
Question 3) How can I uncompress them using VBA?

Thanks for the help.
 
Hi Actuary,

Re. VAR3 thru 5:

Each has 2 decimal places. The decimal point is implied (it is only conceptual; you won't find it in the data). The number in parentheses before the "V" gives the number of integer digits in the field, and the number after the "V" gives the number of decimal digits. The "S" indicates the presence of a sign (positive or negative). The "COMP-3" indicates that the field is packed decimal; that is, each digit in the field is represented by 4 bits (a half byte, or nibble). The sign also occupies a nibble.
Code:
To calculate the size in bytes of a positive amount of 123.45:

Count the digits in the amt = 5
Add 1 for the sign          = 1
Total them                  = 6
Divide by 2                 = 3

Let's do the same for a negative 1234.56:

Count the digits in the amt = 6
Add 1 for the sign          = 1
Total them                  = 7
Divide by 2                 = 3.5

Since the system allocates storage in whole bytes only, round up to 4.

Here's what each amount looks like in memory:

positive 123.45  = 12345C
negative 1234.56 = 0123456D (notice the sys added a leading zero to round up)

The usual valid signs for positive are "C" & "F"; for negative it's "D".
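The arithmetic above can be sketched in a few lines. This is only an illustration in Python (not VBA), and the names packed_length and unpack_comp3 are made up for the example. It assumes the bytes were transferred untouched (binary mode) and that the sign nibble is one of C, F or D as described above.

```python
from decimal import Decimal

def packed_length(digits: int) -> int:
    """Bytes used by a COMP-3 field holding `digits` digits.

    One nibble per digit plus one sign nibble, rounded up to whole bytes.
    """
    return (digits + 2) // 2

def unpack_comp3(raw: bytes, scale: int = 2) -> Decimal:
    """Decode packed-decimal bytes; `scale` is the number of implied decimals."""
    if not raw:
        raise ValueError("empty field")
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)      # high nibble
        nibbles.append(byte & 0x0F)    # low nibble
    sign = nibbles.pop()               # the last nibble carries the sign
    if sign not in (0x0C, 0x0D, 0x0F):
        raise ValueError("unexpected sign nibble")
    if any(n > 9 for n in nibbles):
        raise ValueError("non-decimal digit nibble")
    value = int("".join(str(n) for n in nibbles))
    if sign == 0x0D:                   # D marks a negative value
        value = -value
    return Decimal(value).scaleb(-scale)

print(packed_length(5), unpack_comp3(bytes.fromhex("12345C")))
print(packed_length(6), unpack_comp3(bytes.fromhex("0123456D")))
```

Decimal is used instead of a float so that amounts such as 123.45 are not subject to binary rounding.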
Not sure I know what you mean by "How can I uncompress them using VBA?". What's VBA?

Regards, Jack.

HTH, Jack.
 
This was great.

In regard to your follow-up question on uncompressing via VBA, I am assuming (perhaps incorrectly) that if I look at this file, the record length will be only 26 bytes.

My thought process now, albeit flawed perhaps, is that I really need my mainframe guys to pass me this file uncompressed so that I get 46 bytes.

I wanted to use VBA (Visual Basic for Applications, built into MS Access and MS Excel) to read this text file and convert it to a comma-delimited file, handling the computational (COMP-3) fields on my end.
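For what it's worth, the two record lengths can be checked with a short calculation. This is a Python sketch, not VBA; the FIELDS table and the layout function are invented for the illustration. The 46-byte figure only comes out if each signed field carries its own separate sign byte, which is an assumption; with the sign overpunched on the last digit the unpacked record would be 43 bytes.

```python
# (name, number of digits or characters, signed?, packed?) taken from the copybook
FIELDS = [
    ("var1-A", 4, False, False),   # 9(4)
    ("var2-B", 2, False, False),   # X(02)
    ("var3-C", 9, True, True),     # S9(7)V9(2) COMP-3
    ("var4-C", 13, True, True),    # S9(11)V9(2) COMP-3
    ("var5-C", 15, True, True),    # S9(13)V9(2) COMP-3
]

def layout(fields, unpacked=False):
    """Return (name, start, end, width) rows with 1-based inclusive positions."""
    rows, start = [], 1
    for name, size, signed, packed in fields:
        if packed and not unpacked:
            # one nibble per digit plus a sign nibble, rounded up to whole bytes
            width = (size + 2) // 2
        else:
            # one byte per digit, plus a separate sign byte (assumption)
            width = size + (1 if signed else 0)
        rows.append((name, start, start + width - 1, width))
        start += width
    return rows

for title, flag in (("packed", False), ("unpacked", True)):
    rows = layout(FIELDS, unpacked=flag)
    print(title, "record length:", rows[-1][2])
    for row in rows:
        print("  %-7s start=%2d end=%2d width=%2d" % row)
```

The packed layout gives field widths of 4, 2, 5, 7 and 8 bytes, which also answers the start/end position question for the file as it stands.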
 
actuary:
Just a little help in terminology that may aid you in talking with your mainframe guys who create the file for you.

COMP-3 is an abbreviation for computational-3. It means packed-decimal notation as Jack was describing. When you were calling it compressed it confused me a little and might also confuse the programmers you will be talking to.

The first variable is stored in what is referred to as zoned-decimal. This format has one digit stored in each byte. Your programmers could easily define their output to have zoned decimal values rather than packed-decimal.

If you have a zoned-decimal field defined as S9(7)V9(2), it would take 9 bytes of storage. The sign (C, D, or F) is stored in the leftmost 4 bits of the rightmost byte. The decimal point is assumed, just as with packed decimal.

Using one of Jack's examples, a positive 123.45 stored in that 9 byte field would be represented as:

F0 F0 F0 F0 F1 F2 F3 F4 C5

Each digit except the rightmost is represented as an unsigned digit. The decimal point is assumed to be between the 3 and the 4.
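A small sketch of how that zoned-decimal layout could be decoded, written in Python for illustration (unpack_zoned is a made-up name). It assumes the raw EBCDIC bytes are available, i.e. the file was transferred in binary, and that the sign zone is C, F or D.

```python
from decimal import Decimal

def unpack_zoned(raw: bytes, scale: int = 2) -> Decimal:
    """Decode an EBCDIC zoned-decimal field with an overpunched sign."""
    if not raw:
        raise ValueError("empty field")
    digits = [b & 0x0F for b in raw]   # low nibble of every byte is a digit
    if any(d > 9 for d in digits):
        raise ValueError("non-decimal digit nibble")
    sign_zone = raw[-1] >> 4           # high nibble of the rightmost byte
    if sign_zone not in (0x0C, 0x0D, 0x0F):
        raise ValueError("unexpected sign zone")
    value = int("".join(str(d) for d in digits))
    if sign_zone == 0x0D:              # D marks a negative value
        value = -value
    return Decimal(value).scaleb(-scale)

print(unpack_zoned(bytes.fromhex("F0F0F0F0F1F2F3F4C5")))
```

The zone nibbles of the leading bytes are not checked here; a stricter version might insist that they are all F.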

You could even get the mainframe guys to insert the commas between each of your data fields if that would make it easier....and even put in the decimal point rather than leaving it assumed. If you prefer, they could take off the 'overpunch' sign and send you an extra '+' or '-' that trails the field.
 
Thank you all. This was the information I needed. To any "mainframe guys" out there, please note that that is a term of endearment!
 
actuary -

You might also face a little problem reading this natively on the PC. An EBCDIC to ASCII conversion will normally be done on the file when it is downloaded to the PC (unless BINARY mode is specified). That does funny things to packed data.

So, you should download this in binary. However, that would then force you to do the EBCDIC to ASCII conversion on the non-COMP-3 fields in your VBA and then unpack the COMP-3 fields. That would be UGLY, but doable.
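To give an idea of what "ugly, but doable" might look like, here is a rough sketch in Python rather than VBA. Everything in it is an assumption made for illustration: the field widths (4, 2, 5, 7, 8) are derived from the copybook at the top of the thread, the records are assumed to be fixed-length with no line terminators, and the mainframe code page is assumed to be cp037 (US EBCDIC). The function names are invented.

```python
import csv
import io
from decimal import Decimal

# (name, width in bytes, kind): widths assumed from the copybook
LAYOUT = [
    ("var1", 4, "text"),
    ("var2", 2, "text"),
    ("var3", 5, "comp3"),
    ("var4", 7, "comp3"),
    ("var5", 8, "comp3"),
]
RECORD_LENGTH = sum(width for _, width, _ in LAYOUT)   # 26 bytes

def unpack_comp3(raw: bytes, scale: int = 2) -> Decimal:
    """Decode packed decimal: one hex character per nibble, sign nibble last."""
    nibbles = raw.hex()
    body, sign = nibbles[:-1], nibbles[-1]
    if sign not in "cdf" or not body.isdigit():
        raise ValueError("not a valid packed-decimal field: " + nibbles)
    value = -int(body) if sign == "d" else int(body)
    return Decimal(value).scaleb(-scale)

def decode_record(record: bytes) -> list:
    """Turn one binary record into a list of printable field values."""
    if len(record) != RECORD_LENGTH:
        raise ValueError("unexpected record length")
    out, pos = [], 0
    for _name, width, kind in LAYOUT:
        chunk = record[pos:pos + width]
        pos += width
        if kind == "text":
            out.append(chunk.decode("cp037"))    # EBCDIC to text, these fields only
        else:
            out.append(str(unpack_comp3(chunk)))
    return out

def records_to_csv(data: bytes) -> str:
    """Convert a buffer of back-to-back fixed-length records to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for i in range(0, len(data), RECORD_LENGTH):
        writer.writerow(decode_record(data[i:i + RECORD_LENGTH]))
    return buf.getvalue()

# A hand-built sample record, not real data
sample = "1234AB".encode("cp037") + bytes.fromhex(
    "000012345C" "0000000001234D" "000000000000100F")
print(records_to_csv(sample))
```

The key point is that the character conversion is applied only to the text fields; the packed bytes are decoded directly and never pass through an EBCDIC to ASCII table.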

The easiest solution from a PC standpoint is to have all of the mainframe output be in DISPLAY format with SEPARATE signs on the numeric values. Whether you also choose to get a CSV format, fixed record length, or whatever depends on how you can best handle the data and what your mainframe guys are able/willing to do for you.
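If the file does arrive in DISPLAY format with separate signs, reading it on the PC becomes plain text handling. The sketch below is Python for illustration only. It assumes fixed-width fields of 4, 2, 10, 14 and 16 characters (digits plus one sign character), two implied decimals, and a sign that may be either leading or trailing, since that depends on how the mainframe side defines the fields. The function names and the sample line are invented.

```python
from decimal import Decimal

WIDTHS = [4, 2, 10, 14, 16]   # assumed: digits plus one separate sign character

def split_fixed(line: str, widths=WIDTHS) -> list:
    """Cut a fixed-width line into its fields."""
    fields, pos = [], 0
    for width in widths:
        fields.append(line[pos:pos + width])
        pos += width
    return fields

def parse_display(text: str, scale: int = 2) -> Decimal:
    """Parse digits with a separate '+'/'-' sign and an implied decimal point."""
    if text and text[0] in "+-":          # leading separate sign
        sign, digits = text[0], text[1:]
    elif text and text[-1] in "+-":       # trailing separate sign
        sign, digits = text[-1], text[:-1]
    else:                                 # no sign character present
        sign, digits = "+", text
    if not digits.isdigit():
        raise ValueError("not a numeric field: " + repr(text))
    value = -int(digits) if sign == "-" else int(digits)
    return Decimal(value).scaleb(-scale)

# A made-up 46-character line
line = "1234AB" + "000012345+" + "0000000001234-" + "000000000000100+"
var1, var2, *amounts = split_fixed(line)
print(var1, var2, [str(parse_display(a)) for a in amounts])
```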

Glenn
 
actuary,

The purpose of storing data in COMP-3 fields on the mainframe is twofold. One, the obvious, is less storage space; two, decimal arithmetic is performed on packed-decimal (COMP-3) fields. Zoned-decimal fields must be "packed" before they take part in a calculation. Every new mainframe programmer learns about x'40 00 00 00' (packed spaces) sooner rather than later.

EdP
 