Why isn't my job falling over?

Status
Not open for further replies.

SuzieW

Programmer
Dec 17, 2002
41
GB
Hi all

I have a long running job (only stopped and restarted every few months) which calls a number of RPGs, including some ILE programs which have an RPG module bound to some IBM service programs (QZRDSAPI) for transmitting stuff via TCP/IP.

We had a problem where the ILE programs were created with ACTGRP *NEW instead of *CALLER and after a couple of weeks, the system ran out of activation marks or something. This manifested itself as the following messages in the job log:

Program activation not found.
Pointer not set for location referenced.
Unexpected user error occurred in QLEDAGE.
Application error. CEE9902 unmonitored by QLEAWI at statement 0000001463, instruction X'0000'.
Pointer not set for location referenced.
Unexpected user error occurred in QLEDAGE.
Application error. CEE9902 unmonitored by QLEAWI at statement 0000000038, instruction X'0000'.

So, we've recreated the programs with ACTGRP *CALLER and this seems to have solved the problem.

My question is, why didn't the job go on MSGW?

The root RPG of the job issues a CALL(E) to another RPG, which issues a straight CALL to the ILE programs, which issue CALLBs to the service programs (e.g. CALLB 'TCPOPEN'). The top level RPG traps for errors and reports them, but no errors were reported, only the stuff written to the job log.

Shouldn't the service programs return with an error condition so that the RPG can handle the error? Or is it all happening at too low a level in the call stack?

Any help would be gratefully received. As you may be able to tell, I'm not very techie when it comes to ILE or IBM APIs...

Cheers
Suzie
 
If you use CALL(E), CALLP(E), or CALLB(E), an error on the call is not signalled as an exception; it only sets %ERROR, so it goes unnoticed unless you test for it.

Code:
callp(e) whatever();   // free-form has no CALL/CALLB, so use a prototyped CALLP
if %error;
....
endif;
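
If it helps, here is a minimal fixed-form sketch of the same test that also records why the call failed, using %STATUS and the program status data structure. WHATEVER is a placeholder program name, the 80-byte length of AuditMsg is a guess, and the names PgmSts and ExcMsgId are made up for illustration. The PSDS positions and status codes are as I remember them from the ILE RPG Reference, so check them against your release.

```
      * Program status data structure (names are illustrative only)
     D PgmSts         SDS
      * Positions 40-46: message ID of the exception, e.g. MCH3601
     D  ExcMsgId              40     46
      * Assumed length; use whatever your logging routine expects
     D AuditMsg        S             80A
      * WHATEVER is a placeholder for the program being called
     C                   CALL(E)   'WHATEVER'
     C                   IF        %ERROR
      * %STATUS 00202 = called program failed, 00211 = error on the call
     C                   EVAL      AuditMsg = 'Call failed, status ' +
     C                             %EDITC(%STATUS : 'X') + ' ' + ExcMsgId
     C                   ENDIF
     C                   EVAL      *INLR = *ON
```

Logging the status code and message ID instead of a fixed string should show whether the failure was on the call itself or inside the called program.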

Also, if you are not specifying the activation group in the H-specs, you should. That minimizes the chance of this problem recurring. Here are the H-specs I use for every RPG program:

Code:
      /TITLE {program title}
     H Copyright('© Copyright {your company}, Inc., 2005')
     H Option(*SRCSTMT : *NODEBUGIO)
     H Dftactgrp(*NO) Actgrp('AROPT')
     H Bnddir('ARBNDDIR')
     H Datfmt(*USA) Timfmt(*HMS) Alwnull(*USRCTL)

I use a named activation group; you can use *CALLER if you prefer.
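
For the *CALLER variant, only the activation group line should need to change. A sketch, assuming the other keywords stay as above:

```
     H Dftactgrp(*NO) Actgrp(*CALLER)
```

With this, the program runs in whatever activation group its caller is already using, so no new activation group is created on each call.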

Me transmitte sursum, Caledoni!

 
Hi flapeyre

The CALL(E) *is* followed immediately by:

IF %error
...
perform error logging
....
ENDIF

but that's what I'm saying, the error wasn't logged so I'm guessing that the error was "handled" further down the stack.

None of the other calls in the called programs have E extenders, so it should have fallen over anyway before it got back to the top level RPG, shouldn't it?

Am I making any sense?
 
How are you getting the information to log (that is, what are you doing inside the IF %ERROR block)?

Me transmitte sursum, Caledoni!

 
Suzie:

If you are trying to reclaim an activation group, that could be your problem. Once you do that, you lose everything that was there. Just make everything ACTGRP(*CALLER).

In short, I think the error condition also broke your error-trapping routine.

Me transmitte sursum, Caledoni!

 
Here's what we do:

      * - error on call?
     C                   if        %error
     C                   eval      AuditMsg = 'Error calling inkjet routine -
     C                             UT019'
     C                   exsr      CallUT017
      * - update MFAHSP scrolling msgs with error
     C                   eval      UT026Msg = AuditMsg
     C                   exsr      CallUT026
     C                   eval      UT019Error = *on
     C                   leavesr
     C                   endif

UT017 is an RPG (OPM) which simply writes out the error message to a file. There was nothing written to the file when the problem was occurring. Do you think it's because I'm trying to call another program? It also writes to a self-refreshing screen but nothing was spotted on there either, but there may not have been anyone nearby.

I can't see any RCLACTGRPs anywhere.

I'd already looked at some of those articles too, MdnghtPgmr. It's quite possible that we need a PTF or two - will have to have a chat with the powers that be. Our DEV box is still on V4R5...

Thanks for all your suggestions, guys. Hopefully, changing them to *CALLER (which I've done) will stop that from happening again, but I'm still not sure I really understand why I didn't get an escape message.
 
I would make UT017 an ILE program with ACTGRP(*CALLER) and see what happens. OPM programs run in the default activation group. When you have an ILE program trying to call an OPM program, you can run into problems.

Read The Seven Deadly Sins of ILE, by Susan Gantner and Jon Paris.


Me transmitte sursum, Caledoni!

 
Useful link, thanks flapeyre
Looks like a combination of OPM/ILE is often problematic. Our system seems to break a few of those rules!

I'm not sure why the system was written like that - the programs were all written at the same time a couple of years ago, so they could have done it all in ILE. I think they just used it where they *had* to because they weren't experts in ILE...

 
You can get away with calling an OPM from an ILE if you are not doing any file, data area, or data queue stuff. We use some bound ILE programs that are called by OPM programs (in RPG III) just to do date calculations (it was part of our Y2K solution).

Now, I have those date programs in an ILE service program, and whenever I can, I'm converting the OPM programs to ILE and using the service program's functions in place of the old called bound programs.

But if you are doing any sort of I/O to a file, you should not be crossing activation group boundaries.

Me transmitte sursum, Caledoni!

 