ALT KEY W/ DUPS RESULTS IN WHAT RECORD SEQ?

AustinOne · Mar 22, 2002

I am maintaining an indexed file using an antiquated compiler, etc. and terse doc, on a Unix system. The indexed file has a primary record key and a couple of alternate record keys with dups allowed. The alt keys DO NOT contain the primary key. There are "parent" records and "child" records, guaranteed by the appropriate values within the primary key structure.

My problem is that a FEW records are not in the sequence I expected when STARTing the file on an alt key and using READ NEXT to go sequentially thru the file based on the alt key (some of the children preceded the parent, even though the parent was originally written first, followed by the children). So I tried a loop using primary key sequence and rewriting each record, etc. I can't seem to force/get the few records realigned in the alt key seq I want (parent followed by children), although 90% are in the seq I want. The file is heavily used with many adds and updates.

1) What ARE the rules regarding the alt key sequence of records in this type of situation?
2) And are they standard rules or compiler vendor dependent?
3) Do the rules vary?
4) And do you have any suggestions to solve my problem, other than restructuring the file so that the alt key contains the pri key?

Thanks!

mrregan · Mar 23, 2002

Hi,
There must be some scheme which will allow you to pass the records to a sort and produce the desired results. What is in the alternate key of the 'parent record'? There is probably something in the parent record that points to the alternate key for the child which would require you to jump to the child record with a start command. So to read the file in order you would:

1) read until you find a parent record
2) save the primary key of the parent record
3) use a start command to read the associated child records via the alternate keys
4) restore the key of the parent record and read it directly with the primary key
5) return to step 1

slade · Mar 23, 2002

Hi A1,

I would think that within any unique value of an alt idx w/dups, the dups would be retrieved in the chronological order of their creation since the last time the index was (re)built.

If the alt index were accessed immediately after the (re)build process and before any subsequent alt indexes were added, the alt index dups are retrieved in the order of their "parents'" primary key.

This dichotomy may account for your problem.

BTW, this is a mainframe view and may not apply to other platforms, though it seems to explain it.

HTH, Jack.

StephenJSpiro · Mar 23, 2002

The whole point of duplicate keys is that they are NOT ordered. The actual sequence may depend on the primary key, the creation sequence, control interval splits, buffer allocations... almost anything. If you want an alternate key to be in a particular sequence, you have to make the individual keys unique.

Stephen J Spiro

slade · Mar 24, 2002

Hi Stephen,

I disagree; the whole point of duplicate keys is that they are DUPLICATES. The associated "parent" key for the alt index is carried in the data portion of the alt index cluster. If the alt index process appends the new (updated) parent key reference to the end of the data portion of the alt index record (and I suspect this to be the case), then the order is chronological. If the access method does some fancy in-core sorting, they could make it primary key order within alt index value (but I doubt it).

I don't see where CI splits or buffers come into play. A split would occur only when a new alt index value is introduced. The buffer comment I don't understand at all.

A1 can get his wish for "parent" key order if he accesses the file immediately after it's reorganized and has not been updated, since the alt index is built from the base cluster. For example, prime key 1 contains alt key 5; prime key 1 is written to alt key 5 rec. Prime key 20 contains alt key 5; prime key 20 is written to alt key 5 rec; etc.
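A minimal sketch of the rebuild scenario described above, in Python rather than COBOL. It is a toy model only: the dictionary layout and the function name `rebuild_alt_index` are hypothetical, and it assumes (as the post suspects, without confirmation) that later additions are appended to the end of the list for their alt key value.

```python
from collections import defaultdict

def rebuild_alt_index(base_records):
    """Rebuild a toy alternate index by scanning the base file in primary-key order.

    base_records maps primary key -> alternate key value (hypothetical layout).
    """
    index = defaultdict(list)
    for pri_key in sorted(base_records):
        index[base_records[pri_key]].append(pri_key)
    return index

# Prime keys 1, 7 and 20 all carry alt key 5.
base = {1: 5, 20: 5, 7: 5}
index = rebuild_alt_index(base)
rebuilt_order = list(index[5])      # primary-key order right after the rebuild

# A later add is assumed to be appended, so it lands after the rebuilt entries.
base[3] = 5
index[5].append(3)
after_add_order = list(index[5])
```

Right after the rebuild the duplicates come back in primary key order; the later add with primary key 3 comes back last, which is where the two orderings diverge.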

That's probably why he's getting his wish 90% of the time.

Regards, Jack.

StephenJSpiro · Mar 24, 2002

Jack, you are assuming (I think) that he is using VSAM. Since he said he's on a UNIX system, I doubt that he is using VSAM. In fact, it sounds as if he is using some sort of hierarchical database. Non-VSAM databases are implemented differently.
AustinOne: Have you noticed any relationships with the records in the "wrong" sequence? For instance, are they always the same combination of child and parent keys, or does it happen randomly in different builds of the test database? Is the child key always lower than the parent key, when they are "out of sequence", for instance? Is there ANYTHING you have noticed about the records in the problem?

Stephen J Spiro

k5tm · Mar 25, 2002

AustinOne,

The standard (ANSI X3.23-1985) is very explicit:

The order of retrieval from a set of records which have duplicate key of reference values is the original order of arrival of those records into that set. The START statement may be used to establish a starting point within an indexed file for a series of subsequent sequential retrievals.

So that should answer question (1). As to (2), there may be some existing indexed file implementations that are noncompliant. The rules don't vary (3), but there is no punishment for breaking the rules other than the market. Others have offered their opinions about the best way for (4), but it seems clear that if you prefer the parent record to be the first record retrieved on a duplicate key, then the parent record should be written first, followed by the child record(s).
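To illustrate the order-of-arrival rule quoted above, here is a small Python model (not COBOL, and not any vendor's implementation). The key values and the helper names `write`, `delete` and `read_next_from` are made up for the example. It shows how a parent that is deleted and written again later would arrive after its children.

```python
# Toy model: alt key value -> primary keys in order of arrival into that set.
alt_index = {}

def write(pri_key, alt_key):
    """A new record joins the end of the set of duplicates for its alt key."""
    alt_index.setdefault(alt_key, []).append(pri_key)

def delete(pri_key, alt_key):
    """Remove the record from its set of duplicates."""
    alt_index[alt_key].remove(pri_key)

def read_next_from(alt_key):
    """Stand-in for START on the alt key followed by READ NEXT until the key changes."""
    return list(alt_index.get(alt_key, []))

write("A-00", "CUST5")   # parent
write("A-01", "CUST5")   # child
write("A-02", "CUST5")   # child
original_order = read_next_from("CUST5")

# Some later program deletes the parent and writes it again.
delete("A-00", "CUST5")
write("A-00", "CUST5")
disturbed_order = read_next_from("CUST5")
```

In this model the parent now follows both children, even though it was originally written first.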

What COBOL vendor is being used for your application?

Tom Morrison

AustinOne · Mar 25, 2002

Thanks, everyone for the responses! Being brand new to this forum, I'm overwhelmed with this idea / concept of peer / mentoring help being available!

After reading your responses and doing further research, it does appear that the alt key retrieval sequence is based on the chronological order or sequence in which the records were originally written. I tested this conclusion as follows: I retrieved the records in primary key order, and on each record, instead of just REWRITE-ing it (as I was doing before, thinking this alone would fix it), I did a DELETE immediately followed by a WRITE. This seemed to put them back in the desired alt key order when I then processed the file by alt key.
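The experiment can be modeled roughly as follows. This is a Python sketch, not the actual mbp COBOL behavior: the `IndexedFile` class is hypothetical, and it assumes that a REWRITE which leaves the alt key unchanged keeps the record's position among its duplicates, while DELETE followed by WRITE makes the record the newest arrival.

```python
class IndexedFile:
    """Toy indexed file: one primary key, one alternate key with duplicates."""

    def __init__(self):
        self.records = {}   # pri_key -> alt_key
        self.arrival = []   # pri_keys in order of arrival into the alt index

    def write(self, pri_key, alt_key):
        if pri_key in self.records:
            raise KeyError(f"duplicate primary key {pri_key!r}")
        self.records[pri_key] = alt_key
        self.arrival.append(pri_key)

    def rewrite(self, pri_key, alt_key):
        # Assumption: position only changes if the alt key value changes.
        if self.records[pri_key] != alt_key:
            self.arrival.remove(pri_key)
            self.arrival.append(pri_key)
        self.records[pri_key] = alt_key

    def delete(self, pri_key):
        del self.records[pri_key]
        self.arrival.remove(pri_key)

    def read_by_alt(self, alt_key):
        return [p for p in self.arrival if self.records[p] == alt_key]

f = IndexedFile()
for pri in ["A-01", "A-02", "A-00"]:      # parent A-00 arrived last (the bad case)
    f.write(pri, "CUST5")

for pri in sorted(f.records):             # first attempt: REWRITE in pri key order
    f.rewrite(pri, f.records[pri])
after_rewrite = f.read_by_alt("CUST5")

for pri in sorted(f.records):             # second attempt: DELETE then WRITE
    alt = f.records[pri]
    f.delete(pri)
    f.write(pri, alt)
after_delete_write = f.read_by_alt("CUST5")
```

Under these assumptions the REWRITE pass changes nothing, while the DELETE/WRITE pass leaves the duplicates in primary key order.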

I'm still wondering how they got out of order in the first place, as theoretically at least, they cannot be originally written in any order other than parent, child, child, etc. However, later applications may have compromised this scheme by messing with the keys, doing deletes and later writes, etc. (i.e., famous last words).

As far as the implications and long-term impact on the application itself, I'm going to do some more research to determine the best approach to fixing this:

1) I could create an extract key-sequence file (with a sort or index key of AltKey-PriKey) as suggested, and then use this as a "driver" whenever I need to process the main file in alt key sequence (this tends to moot the original plan for the alt key).
2) I could create a spot-fix approach as I did above when proving the theory.
3) I could convert and reorganize the main file into a new version where the alt key includes the pri key as its low-order field.

One of the things I want to research is what happens when I use the vendor-supplied ISAM utility to rebuild the file (i.e., is the alt key sequence guaranteed to be the same as it was before the rebuild?). Incidentally, it's an old version of mbp COBOL.
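Option 1 (the driver extract) and option 3 (alt key with the pri key as its low-order field) both come down to ordering by the composite AltKey-PriKey. A rough Python sketch of the driver idea follows; the record layout and key values are invented for illustration, and it assumes the parent's primary key sorts ahead of its children's.

```python
# Hypothetical records as they might come off the main file, in no useful order.
records = [
    {"pri": "A-01", "alt": "CUST5"},
    {"pri": "B-00", "alt": "CUST2"},
    {"pri": "A-00", "alt": "CUST5"},
    {"pri": "A-02", "alt": "CUST5"},
    {"pri": "B-01", "alt": "CUST2"},
]

# Driver extract: one (alt key, pri key) pair per record, sorted on the composite key.
driver = sorted((r["alt"], r["pri"]) for r in records)

# Process the main file by reading each record directly on its primary key.
by_pri = {r["pri"]: r for r in records}
processed = [by_pri[pri]["pri"] for _alt, pri in driver]
```

Because the composite key is unique, the result no longer depends on arrival order at all.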

Anyway, thanks again to everyone for being a sounding-board and confirmation source, at the very least. I want you to know that you helped me reach a direction and a solution for a real-world problem.

k5tm · Mar 25, 2002

AustinOne,

(Does that mean you're in Austin, TX?)

I would be interested in knowing if the rebuild utility does indeed retain the temporal information. If so, I congratulate the implementor for paying such close attention to this particular detail.

You see, at the implementation level there really is no such thing as a duplicate key, since the key order must be maintained according to the presentation order. So, the timestamp of the presentation is logically appended at the low-order end of the key, thereby eliminating duplicates. Too bad COBOL provides no way (other than the implicit mechanism previously described in this thread) to get back that information.
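A sketch of that idea in Python, assuming a simple arrival counter stands in for the timestamp (the names `insert` and `scan` and the list-based index are hypothetical, not how any particular ISAM package stores its keys):

```python
import bisect
import itertools

_arrival = itertools.count()   # stand-in for the presentation timestamp
index = []                     # sorted list of (alt_key, arrival_seq, pri_key)

def insert(alt_key, pri_key):
    """Store the key with the arrival sequence appended at the low-order end."""
    bisect.insort(index, (alt_key, next(_arrival), pri_key))

def scan(alt_key):
    """Return primary keys for one alt key value in stored (arrival) order."""
    start = bisect.bisect_left(index, (alt_key,))
    found = []
    for key, _seq, pri_key in index[start:]:
        if key != alt_key:
            break
        found.append(pri_key)
    return found

insert("CUST5", "Z-00")
insert("CUST2", "B-00")
insert("CUST5", "A-01")
order_for_cust5 = scan("CUST5")
```

Every stored key is unique because of the appended sequence number, so "duplicates" fall out in arrival order ("Z-00" before "A-01" here) rather than in primary key order.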

By the way, mbp COBOL was purchased many years ago by Micro Focus.

Tom Morrison

AustinOne · Mar 25, 2002

1) No, I'm not in/from Austin, TX, although I hear it's a good place to live/work.
2) The dinosaur version of mbp I have predates MicroFocus's (now Merant?) takeover of mbp.
3) I suspect that the rebuild util will honor the embedded timestamp, but I won't know for sure till I test it.
4) Lessons Learned: If you are absolutely dependent on having them in the desired order, then design the record layout so that the Alt-Key includes the Pri-Key (the version I have does not allow split (non-contiguous) keys). It would probably be most portable that way, too. BTW, my "fav" is AcuCobol, but I do try to keep an "open" mind (pun intended). (I realize you're a "vendor" of some type).

slade · Mar 25, 2002

You're right, Stephen. I keep thinking I'm in the mvshelp ng.

Jack
