RS/6000 Model F80, unknown state with LED 0c20, no ping

Status
Not open for further replies.

aixquest

Technical User
Jan 15, 2001
53
US
I have two RS/6000 F80 servers running AIX 5.2 and 5.3. At least once I have seen both servers shut down abruptly and show 0c20 on the LED display. The hosts appeared to be hung: I couldn't ping them or do anything else with them, and I had to hard-reset the servers to get the OS to boot again. Does anybody know what the code means, or what steps I can take to debug the issue further?

Thanks in advance,
???
 
I already did that. I didn't see anyone post anything concrete that isolates the issue or explains what the code means; most comments just said to ask IBM to look into it. I also looked at the service guide and didn't find anything, though perhaps I'm not looking in the right place. Also, is there a specific system log that keeps track of exceptional errors? I looked at the errpt and errpt -Ac output and didn't see anything that stands out.
 
Have you looked at root's mail? You might find a clue there, or possibly in /var/log if you have syslogging enabled.

Norm
 
Nope, /var/log doesn't show anything useful. For the system logs, I think I need to look under /var/adm/. I was mostly inspecting the logs in the /var/adm/ras/ directory. If the host were reporting anything odd, I'd expect the messages to go in the ras directory.
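
For reference, a minimal sketch of where AIX usually records crash evidence besides the files under /var/adm/ras/: the detailed error log, the boot log, and the dump device status. It assumes a POSIX shell; run_if_present is a helper made up for this sketch, and it skips any command that is not installed.

```shell
#!/bin/sh
# Sketch: gather the usual crash evidence on AIX. Each command is skipped
# when it is not installed, so the script is harmless on other systems.

run_if_present() {
    # $1 = command name, remaining arguments = its options
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@"
    else
        echo "== skipped (not found): $1"
    fi
}

run_if_present errpt -a          # detailed error log (plain errpt is a summary)
run_if_present alog -o -t boot   # contents of the boot log
run_if_present sysdumpdev -l     # current dump device settings
run_if_present sysdumpdev -L     # information about the most recent dump
```

If sysdumpdev -L reports a recent dump, that is the file IBM support would normally ask for.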
 
Above post is correct.
If you are not familiar with kdb, I suggest you reconfigure them to dump instead and then have IBM look at the dumps for you if it happens again.
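
A rough sketch of what that reconfiguration could look like. It assumes KDB was enabled by building the boot image with bosboot's -D or -I flag, and that /dev/ipldevice is the boot device; neither is confirmed in this thread. step and DRY_RUN are names made up for this sketch, and it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: rebuild the boot image without the kernel debugger so that a
# crash produces a system dump instead of stopping in kdb.
# Assumption: KDB was enabled earlier with "bosboot ... -D" or "-I".

DRY_RUN=${DRY_RUN:-1}   # hypothetical switch: 1 = print only, 0 = execute

step() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

step sysdumpdev -l                   # show the current dump device settings
step sysdumpdev -e                   # estimate the size of a dump taken now
step bosboot -a -d /dev/ipldevice    # rebuild boot image; no -D/-I means no KDB
```

The new boot image only takes effect after a reboot, so check the dump device settings and size estimate before scheduling one.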
 
Hi,
both servers show the same error:

- it is very unlikely that both machines have broken hardware
- likewise, it is unlikely that both operating systems are corrupted
(the probability that a system is corrupted by a power blackout
is less than 10%)

On a RISC machine, the three-digit display first shows hardware codes
and then starts showing AIX codes. For the former, refer to the hardware Machine Service Manual; for the latter, to the
AIX codes.

If the three-digit display shows 581, it means (in AIX 4) that TCP/IP
has a DNS server configured and it is unreachable (90% of the time a detached cable): the hardware is fine, and if you wait half an hour
the machine starts.

If it shows F2C, you probably have a processor problem.

0c20 is a SW error: AIX has been booted, but ...

c20: The kernel debugger exited without a request for a system dump. Enter the quit dump subcommand. Read the new three-digit value from the LED display.

You were probably not in front of the console when the machine tried to boot, and you could not ping it:
it will not boot until you tell it that the dump (a big file)
is not important to you and can be discarded.
(It seems to me that you have 3 choices: if you choose 99, the machine discards the file and finally starts.)

The reason the machine does not start and waits for your response is that this dump may contain information important for investigating the cause of the crash.
The machine saves the dump in paging space, and this is the
only moment to save it to media: if the machine starts,
you lose the dump.

You should send this file to service, and by debugging
the kernel they will give you an answer. But this is not your
situation:

someone has detached the power cable!

ciao
vittorio

 