IBM S80 system reboots by itself

Status
Not open for further replies.

panicky

Technical User
May 2, 2004
2
US
Dear All,
Our S80 system running AIX 4.3.3 rebooted by itself while all users were on it. This is the third time we have encountered this problem.
After the reboot, everything went back to normal without human intervention. The messages below may hold a clue; please interpret them for me. Thanks.
> PIDS/5765C3403 LVLS/430 PCSS/SPI1 MS/700 FLDS/coalesce VALU/7c8e7008 FLDS/net_free VALU/340
----- logsymptom 1750 Fri Feb 15 12:22:31 WAUST 2002
------ logsymptom Fri Feb 15 12:24:17 WAUST 2002
Status:0
Dump copy filename: /var/adm/ras/vmcore.1
> PIDS/5765C3403 LVLS/430 PCSS/SPI1 MS/300 FLDS/remque VALU/90040000 FLDS/m_free m VALU/c0
>
------ logsymptom 1880 Mon May 20 16:47:11 WAUDT 2002
------ logsymptom Mon May 20 16:48:58 WAUDT 2002
Status:0
Dump copy filename: /var/adm/ras/vmcore.2
> PIDS/5765C3403 LVLS/430 PCSS/SPI1 MS/700 FLDS/coalesce VALU/7c8e7008 FLDS/net_free VALU/340
 
Try setting more network memory (it has something to do with coalesce and m_free), and check whether swap is sufficient.

The command is:
no -o thewall=(ramkilobytes/10)
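
thewall expects a plain number of kilobytes, so the expression above has to be evaluated first. A minimal sketch of the arithmetic, assuming real memory is read from the realmem attribute of sys0 (reported in KB); the helper name thewall_kb is made up for illustration:

```shell
# Hypothetical helper: one tenth of real memory, in KB (integer division).
thewall_kb() {
    echo $(( $1 / 10 ))
}

# On AIX, real memory in KB can be read from sys0 (assumed attribute: realmem):
#   ram_kb=$(lsattr -El sys0 -a realmem | awk '{print $2}')
#   no -o thewall=$(thewall_kb "$ram_kb")
#
# Example: thewall_kb 4194304    # 4 GB of RAM, prints 419430
```

Check the current value with no -a | grep thewall before changing it.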

 
The "vmcore" bit looks suspiciously like a lack of virtual memory (swap space). I suggest monitoring this regularly in case it is a lack of swap.

lsps -s should do it for you.

IBM Certified Specialist - MQSeries
IBM Certified Specialist - AIX 5 pSeries System Administration
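
For regular monitoring, something along these lines could be run from cron. It is only a sketch: it assumes lsps -s prints a header line followed by one line holding the total size and the percentage used, and the function names and the default threshold of 70 are made up for illustration.

```shell
# Hypothetical helper: read `lsps -s` output on stdin and print the
# "Percent Used" figure from the second line, without the % sign.
paging_pct_used() {
    awk 'NR == 2 { sub(/%/, "", $2); print $2 }'
}

# Hypothetical check: warn when usage reaches a threshold
# (first argument, default 70, an arbitrary choice).
check_paging() {
    pct=$(paging_pct_used)
    limit=${1:-70}
    if [ "$pct" -ge "$limit" ]; then
        echo "WARNING: paging space ${pct}% used"
    else
        echo "OK: paging space ${pct}% used"
    fi
}

# Usage on AIX:  lsps -s | check_paging 70
```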
 
Dear all,
Thanks for the wonderful and fast responses. My replies to
all your good suggestions are as follows.

1. As usual, the vendor will collect the dump file, but we get no actual answer as to the cause of the reboot.
2. Command: # no -o thewall={ramkilobytes/10}
0821-059 no: The ioctl SIOCSNETOPT system call failed.
Dear Gheist, please help with the error message.
3. Dear Murderer, I hope you'll murder this ghost for me.
Any comments on the stats displayed after issuing the command?
# lsps -s
Total Paging Space Percent Used
8080MB 2%
 
Run sysdumpdev -L to see if you have a valid dump.
What level of AIX are you at (ML 9)?
If so, then please set
udp_pmtu_discover = 0
tcp_pmtu_discover = 0
with the no command:

no -o udp_pmtu_discover=0
no -o tcp_pmtu_discover=0
 
Star for you, but also add:

no -o tcp_sendspace=32768
no -o tcp_recvspace=32768
no -o udp_sendspace=32768
no -o udp_recvspace=32768
no -o clean_partial_conns=1

The correct place to set these at boot is at the end of /etc/rc.net (near the sendspace settings).
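
A minimal sketch of such a block for the end of /etc/rc.net, using the values suggested in this thread. The function name apply_no_opts and the NO_CMD override are made up for illustration, and /usr/sbin/no is assumed to be where the no command lives.

```shell
# Hypothetical helper: apply the network options suggested in this thread.
# NO_CMD is an illustrative override so the loop can be dry-run with "echo".
apply_no_opts() {
    cmd=${NO_CMD:-/usr/sbin/no}
    for opt in \
        udp_pmtu_discover=0 \
        tcp_pmtu_discover=0 \
        tcp_sendspace=32768 \
        tcp_recvspace=32768 \
        udp_sendspace=32768 \
        udp_recvspace=32768 \
        clean_partial_conns=1
    do
        # Report a failure but keep going with the remaining options.
        "$cmd" -o "$opt" || echo "failed to set $opt" >&2
    done
}

# Dry run:     NO_CMD=echo; apply_no_opts
# In rc.net:   [ -x /usr/sbin/no ] && apply_no_opts
```

The dry run prints one "-o option=value" line per setting, which makes it easy to review the list before it goes into the boot script.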
 
Thanks gheist,
but what is clean_partial_conns? Is it for dead sockets?
If so, then that's great!
 
Some keepalive settings can help drop dead sockets more often.
 
Yep, I know, but there are drawbacks (for some DB connections, for example).
 
But there is a balance, something like 5 minutes (600, as the value is counted in half-seconds) or so for keepintvl.
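
The 600 above works out if the keepalive tunables count in half-seconds, which is my understanding of the AIX no options. A small sketch of the conversion; the helper name is made up, and the full option name tcp_keepintvl is an assumption to verify on your level:

```shell
# Hypothetical helper: convert minutes to half-second units.
minutes_to_halfsecs() {
    echo $(( $1 * 60 * 2 ))
}

# Example on AIX (assumed option name):
#   no -o tcp_keepintvl=$(minutes_to_halfsecs 5)    # 5 minutes gives 600
```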
 
lsattr -El sys0 -a autorestart

If this is "true", the system will reboot after a crash. If that happens, you won't get to see the error codes on the LED display, which could lead you to the error. If it is true, do a "chdev -l sys0 -a autorestart=false".

Check the error log. Look for a "CPU Failure Predicted" entry. Our S80 gave us this error one time.

Bill.
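
The two checks above could be scripted roughly like this. It is a sketch only: it assumes the lsattr output is a single line starting with the attribute name followed by its value, and autorestart_value is a made-up helper name.

```shell
# Hypothetical helper: read `lsattr -El sys0 -a autorestart` output on stdin
# and print the value column (second field).
autorestart_value() {
    awk '$1 == "autorestart" { print $2 }'
}

# Usage on AIX:
#   if [ "$(lsattr -El sys0 -a autorestart | autorestart_value)" = "true" ]; then
#       echo "autorestart is on: a crash reboots before the LED codes can be read"
#   fi
#   errpt | grep -i cpu    # quick scan of the error log for CPU-related entries
```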
 
Did you literally type this string?

no -o thewall={ramkilobytes/10}

Or did you sub a number into it?

o14777@box:/utc/home/o14777>no -a|grep wall
thewall = 1048552

I'm not someone who has tinkered with this setting, but I think it takes a numerical arg:

no -o thewall=1048552

Also, we have had hardware trouble with our S80s, far too many of them actually. Most required replacement CPU boards. If you have a hardware engineer, have him look for deconfigured processors.
 