Serious Problems with Cobol in AIX 5.2

Status
Not open for further replies.

drinagd

Technical User
Nov 20, 2003
10
US
Hi,

We just migrated from AIX 4.3.3 to AIX 5.2 (Maintenance Level 04). We have been migrating all our servers and had no problems except with one, on which the application runs in Cobol. Everything worked fine for several hours, but then all of a sudden the server got very busy. We couldn't identify any process that was taking up all the processor resources, nor did we see any unusual paging (as a matter of fact, paging space and memory use in general were minimal).

It reached a point where we couldn't kill users' processes and nothing responded. We issued a "shutdown" command that didn't work, so we ended up switching the server off. Once we switched it back on, it booted normally. We then checked everything and saw that nothing had been logged from the moment we first noticed the server wasn't responding, so we have no clues as to what might have caused the problem.

Does anyone know of any issues regarding Cobol working on AIX 5.2?

How can we make sure that no particular user or process takes up all the processor?

Thanks in advance for your help.

Best regards,

Drina
 
Did you update your Cobol compiler as well? You'll need to do that too.
 
My suggestion would be to examine /var/adm/wtmp and look for rapidly respawning process logins. It might be that there's some process running out of inittab that has been "broken" by the upgrade, and so is failing to start properly but respawning so quickly that your system simply cannot cope. I've come across this a couple of times, and the symptoms usually match exactly what you describe. Sometimes you're lucky enough to get an entry in errorlog, something like INIT_PROCESS_RAPID I think.

HTH

Kind Regards,
Matthew Bourne
"Find a job you love and never do a day's work in your life."
 
Hello Matthew,


Thanks for your response. Your suggestion seems logical; we will check that as soon as we manage to boot the system again, since it has stopped responding again. What we saw is that this seems to happen as soon as we get close to 255 users logged in. Is there a parameter that we can check or change to manage the number of users? Also, shortly before the system slowed down, we checked the processes with "ps -fea" and saw an entry "0551-011 Standard input is not a tty." Does this suggest anything?

Thanks,

Drina
 
How can I extract information from /var/adm/wtmp? The file seems unreadable. Is there a utility or command that would help me?

thanks,

Drina
 
There's a utility to convert the binary contents to human-readable format:

/usr/sbin/acct/fwtmp </var/adm/wtmp >/tmp/wtmp.txt

or

/usr/sbin/acct/fwtmp </var/adm/wtmp | more
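
Once the file is in text form, a rough way to spot a rapidly respawning entry is to count how often each value of one column repeats. The sketch below is only an illustration: count_field is a made-up helper name, and which column holds the inittab id or the line name is an assumption you would have to confirm by looking at /tmp/wtmp.txt first, since the column layout is not shown in this thread.

```shell
# Count how often each value of one whitespace-separated field occurs,
# most frequent first.
# Usage: count_field N < file    (N = field number, chosen by the reader)
count_field() {
    awk -v n="$1" 'NF >= n { c[$n]++ } END { for (k in c) print c[k], k }' | sort -rn
}
```

For example, `count_field 3 < /tmp/wtmp.txt | head` would show the ten most frequent values of the third column (the 3 here is just a placeholder). An entry that appears hundreds of times within a short period would be a candidate for the respawn problem described earlier.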


HTH,

p5wizard
 
Thanks for your response. We are checking the wtmp file, but nothing seems odd. What we were able to determine is that last night we had very few users, yet resources were still being consumed to the point that the system was very slow to respond. We found a few processes like these:

mut01001 275826 275566 40 17:03:08 - 34:43 runcobol SCP001 -k -a tty: 0551-011 Standard input is not a tty.
mut01001 278146 277882 40 17:03:11 - 31:10 runcobol SCP001 -k -a tty: 0551-011 Standard input is not a tty.
bcr01001 278662 32016 38 17:04:35 - 31:34 runcobol SCP001 -k -a tty: 0551-011 Standard input is not a tty.
csg01001 279180 278920 32 17:04:42 - 31:10 runcobol SCP001 -k -a tty: 0551-011 Standard input is not a tty.

The processor time they have used is far too high (over 30 minutes), and these processes belong to users who were no longer logged in. The users were probably logged out for inactivity, and somehow these processes ended up as orphans eating up the CPU. Once we killed them, the resources became available again. I can't find anything on the Internet about what causes this problem, and it is strange that it doesn't happen on AIX 4.3.3. We went back to AIX 4.3.3 on our production server, but kept one of our servers on AIX 5.2 until we can determine how to control this.
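
A small filter along these lines could help find such processes before they pile up. It is only a sketch: orphan_runcobol is a hypothetical helper name, and it assumes the column layout seen in the listing above (UID PID PPID C STIME TTY TIME CMD) with a single-token start time. A start time printed as a date such as "Nov 20" would shift the columns and the filter would miss that line.

```shell
# Print the PIDs of runcobol processes that have no terminal ("-" in the
# TTY column). Reads "ps -ef" style output on standard input.
# Assumption: fields are UID PID PPID C STIME TTY TIME CMD.
orphan_runcobol() {
    awk '$6 == "-" && $8 == "runcobol" { print $2 }'
}
```

It would be used as `ps -ef | orphan_runcobol`. Reviewing the list by hand before killing anything seems safer than piping it straight into kill.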

Has anyone come across this sort of problem?

Best regards,

Drina
 
Upgrade your compiler to one compatible with the level of OS. I know that sounds like something IBM support would tell you, but it's really the first step.

It's easy to run into problems with old compilers running on newer levels of the OS. Until you get the two in line, you could be chasing ghosts around. I've seen it a few times myself.

Apologies if you've already done this.
 
Seems to me the login script for the users fires up a runcobol process that needs a parameter -a <tty_name>.

This tty name is generated by the tty command, but in a background process that command does not produce a /dev/ttyX or /dev/pts/Y string. It prints the error message "tty: 0551-011 Standard input is not a tty." instead.

Please show the .profile script contents in the user's login directory and let's have a look.


HTH,

p5wizard
 

Your hint was right, Breslau: our version of Cobol is too old for the newer OS, so we will get an upgrade.

P5wizard, these are the contents of the .profile and of the executable script that invokes Cobol:

Contents of .profile
trap "" 2 3 5 15
PATH=/usr/bin:/etc:/usr/sbin:/usr/ucb:$HOME/bin:/usr/bin/X11:/sbin:.
export PATH
exec /bin/R


Contents of file /bin/R
TERM=AT386-M
export TERM
cd /home2/CRE/RUNTIME
runcobol SCP001 -k -a "`who am i| cut -c 1-3; tty`"
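
One possible way to guard against the "not a tty" case discussed above is to check standard input before building the -a argument. This is only a sketch of a variant of /bin/R, not something confirmed in this thread. The function names session_arg and start_runtime are made up for illustration, and it assumes that runcobol should simply not be started when the session has no terminal.

```shell
#!/bin/sh
# Sketch of a guarded variant of /bin/R (assumption: no terminal means
# the runtime should not start at all).

# Print the value for "-a" (user prefix plus tty name), or fail when
# standard input is not a terminal.
session_arg() {
    # [ -t 0 ] is true only when file descriptor 0 is a terminal.
    [ -t 0 ] || return 1
    # Same commands as the original script, run only after the check.
    who am i | cut -c 1-3
    tty
}

# Start the Cobol runtime only when a terminal is present.
start_runtime() {
    arg=$(session_arg) || {
        echo "no terminal on standard input, not starting runcobol" >&2
        return 1
    }
    TERM=AT386-M
    export TERM
    cd /home2/CRE/RUNTIME || return 1
    exec runcobol SCP001 -k -a "$arg"
}
```

The script would end with a call to start_runtime. With this check, a session without a terminal exits with an error message instead of launching runcobol with the error text as its argument.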

 