Server Crashed Today Need Help

Status
Not open for further replies.

100mbs

MIS
Feb 14, 2002
142
US
I have a Sun E-4500 server that crashed today.

If anybody can help me interpret this message log, I would appreciate it. I am not sure what it is telling me about what happened.

Oct 13 16:36:03 myserver SUNW,UltraSPARC-II: [ID 632919 kern.warning] WARNING: [AFT1] WP event on CPU8, errID 0x0045d2a9.9c0cc094
Oct 13 16:36:03 myserver AFSR 0x00000000.00800008<WP> AFAR 0x0000007f.5dd9def0
Oct 13 16:36:03 myserver AFSR.PSYND 0x0008(Score 95) AFSR.ETS 0x00 Fault_PC 0xfb50e9cc
Oct 13 16:36:03 myserver UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 815649 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU5 Data access at TL=0, errID 0x0045d2ab.fb9dfacf
Oct 13 16:36:13 myserver AFSR 0x00000000.80200000<PRIV,UE> AFAR 0x00000000.fe8a9d58
Oct 13 16:36:13 myserver AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10025328
Oct 13 16:36:13 myserver UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203<UE> UDBL.ESYND 0x03
Oct 13 16:36:13 myserver UDBL Syndrome 0x3 Memory Module Board 2 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 529255 kern.warning] WARNING: [AFT1] errID 0x0045d2ab.fb9dfacf Syndrome 0x3 indicates that this may not be a memory module problem
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 790240 kern.info] [AFT2] errID 0x0045d2ab.fb9dfacf PA=0x00000000.fe8a9d58
Oct 13 16:36:13 myserver E$tag 0x00000000.1ac01fd1 E$State: Exclusive E$parity 0x0d
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000000.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000000.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x18): 0x00000000.10000000 *Bad* PSYND=0x00ff
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000001.f7000670
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000001.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000001.f7027ff8
Oct 13 16:36:13 myserver unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000000.fe8a8000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 265390 kern.info] [AFT3] errID 0x0045d2ab.fb9dfacf Above Error detected by protected Kernel code
Oct 13 16:36:13 myserver that will try to clear error from system
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 916683 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU5 Data access at TL=0, errID 0x0045d2ab.fddc38ed
Oct 13 16:36:13 myserver AFSR 0x00000000.80200000<PRIV,UE> AFAR 0x00000000.fe8a9d58
Oct 13 16:36:13 myserver AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10025328
Oct 13 16:36:13 myserver UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203<UE> UDBL.ESYND 0x03
Oct 13 16:36:13 myserver UDBL Syndrome 0x3 Memory Module Board 2 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 374472 kern.warning] WARNING: [AFT1] errID 0x0045d2ab.fddc38ed Syndrome 0x3 indicates that this may not be a memory module problem
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 732702 kern.info] [AFT2] errID 0x0045d2ab.fddc38ed PA=0x00000000.fe8a9d58
Oct 13 16:36:13 myserver E$tag 0x00000000.1ac01fd1 E$State: Exclusive E$parity 0x0d
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000000.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000000.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x18): 0x00000000.10000000 *Bad* PSYND=0x00ff
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000001.f7000670
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000001.00000000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000001.f7027ff8
Oct 13 16:36:13 myserver unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000000.fe8a8000
Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 341280 kern.info] [AFT3] errID 0x0045d2ab.fddc38ed Above Error detected by protected Kernel code
Oct 13 16:36:13 myserver that will try to clear error from system
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 621182 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU4 Data access at TL=0, errID 0x0045d2b1.ba4942ce
Oct 13 16:36:38 myserver AFSR 0x00000000.00200000<UE> AFAR 0x00000000.fe8a9d58
Oct 13 16:36:38 myserver AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0xfb800004
Oct 13 16:36:38 myserver UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203<UE> UDBL.ESYND 0x03
Oct 13 16:36:38 myserver UDBL Syndrome 0x3 Memory Module Board 2 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 366302 kern.warning] WARNING: [AFT1] errID 0x0045d2b1.ba4942ce Syndrome 0x3 indicates that this may not be a memory module problem
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 356728 kern.info] [AFT2] errID 0x0045d2b1.ba4942ce PA=0x00000000.fe8a9d58
Oct 13 16:36:38 myserver E$tag 0x00000000.0bc01fd1 E$State: Modified E$parity 0x05
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000000.00000000
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000000.00000000
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x18): 0x00000000.10000000 *Bad* PSYND=0x00ff
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000001.f7000670
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000001.00000000
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000001.f7027ff8
Oct 13 16:36:38 myserver unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000000.fe8a8000
Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 507105 kern.info] [AFT3] errID 0x0045d2b1.ba4942ce Above Error is in User Mode
Oct 13 16:36:38 myserver and is fatal: will reboot
Oct 13 16:36:38 myserver unix: [ID 855177 kern.warning] WARNING: [AFT1] initiating reboot due to above error in pid 28250 (java)
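
For anyone reading the AFSR values in the log above, here is a minimal Python sketch of how the hex word lines up with the flag names printed in angle brackets. The bit positions are inferred from the three AFSR values in this log (and assumed, not checked against the UltraSPARC-II manual), so the table is deliberately incomplete. `decode_afsr` and `AFSR_FLAGS` are illustrative names, not part of any Sun tool.

```python
# Flag bits inferred from this log only (assumption, not a full AFSR map):
#   0x00800008 -> <WP>, 0x00200000 -> <UE>, 0x80200000 -> <PRIV,UE>
AFSR_FLAGS = {
    0x80000000: "PRIV",
    0x00800000: "WP",
    0x00200000: "UE",
}
# The log prints AFSR.PSYND 0x0008 for AFSR ...00800008, so the parity
# syndrome is assumed to occupy the low 16 bits.
PSYND_MASK = 0xFFFF


def decode_afsr(text):
    """Decode an AFSR written as '0xHHHHHHHH.LLLLLLLL' into (flags, psynd)."""
    high, low = text.split(".")
    value = (int(high, 16) << 32) | int(low, 16)
    flags = [name for bit, name in AFSR_FLAGS.items() if value & bit]
    return flags, value & PSYND_MASK


if __name__ == "__main__":
    for afsr in ("0x00000000.00800008",
                 "0x00000000.80200000",
                 "0x00000000.00200000"):
        flags, psynd = decode_afsr(afsr)
        print(afsr, flags, hex(psynd))
```

The decoded flags match what the kernel already prints, so this is only useful as a cross-check or when the bracketed names are missing from a log.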

 
Off the top of my head, I would say it is a memory problem.

That said, I did have a couple of servers that kept crashing, and a patch cluster fixed the problem.
 
100MBS;

I agree with KHZ. To expand on this, multiple CPUs are being called out:

CPU8, which is on the board in slot 4, proc 0

CPU4, which is on the board in slot 2, proc 0
CPU5, which is on the board in slot 2, proc 1
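
The mapping above follows a simple pattern: the CPU number divided by two gives the board slot, and the remainder gives the processor on that board. A minimal Python sketch, assuming two CPUs per board as the three examples imply (`cpu_location` is an illustrative name):

```python
def cpu_location(cpu_id, cpus_per_board=2):
    """Return (board_slot, proc) for a CPU ID.

    Assumes CPU IDs are numbered board by board, two per board,
    which is consistent with CPU8, CPU4 and CPU5 in this thread.
    """
    if cpu_id < 0:
        raise ValueError("cpu_id must be non-negative")
    return divmod(cpu_id, cpus_per_board)


if __name__ == "__main__":
    for cpu in (8, 4, 5):
        slot, proc = cpu_location(cpu)
        print(f"CPU{cpu}: board in slot {slot}, proc {proc}")
```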

The errors consistently call out memory on the board in slot 2:

Memory Module Board 2 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800
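
To check a longer messages file for the same pattern, a rough Python sketch like the one below could tally which CPUs and which memory boards are named. It assumes wrapped lines have been rejoined and that the messages keep the wording seen in this log; the regular expressions and the `summarize` helper are illustrative, not a Sun utility.

```python
import re
from collections import Counter

# Patterns based on the wording of the log lines in this thread (assumption:
# other Solaris releases may word these messages differently).
CPU_RE = re.compile(r"\[AFT1\] (?P<event>.+?) on CPU(?P<cpu>\d+)")
BOARD_RE = re.compile(r"Memory Module Board (?P<board>\d+)")


def summarize(lines):
    """Count (cpu, event) pairs and memory board callouts in syslog lines."""
    events = Counter()
    boards = Counter()
    for line in lines:
        m = CPU_RE.search(line)
        if m:
            events[(int(m.group("cpu")), m.group("event"))] += 1
        m = BOARD_RE.search(line)
        if m:
            boards[int(m.group("board"))] += 1
    return events, boards


# A few lines taken from the log in this thread, with wraps rejoined.
SAMPLE = [
    "Oct 13 16:36:03 myserver SUNW,UltraSPARC-II: [ID 632919 kern.warning] WARNING: [AFT1] WP event on CPU8, errID 0x0045d2a9.9c0cc094",
    "Oct 13 16:36:13 myserver SUNW,UltraSPARC-II: [ID 815649 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU5 Data access at TL=0, errID 0x0045d2ab.fb9dfacf",
    "Oct 13 16:36:13 myserver UDBL Syndrome 0x3 Memory Module Board 2 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800",
    "Oct 13 16:36:38 myserver SUNW,UltraSPARC-II: [ID 621182 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU4 Data access at TL=0, errID 0x0045d2b1.ba4942ce",
    "Oct 13 16:36:38 myserver UDBL Syndrome 0x3 Memory Module Board 2 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800",
]

if __name__ == "__main__":
    events, boards = summarize(SAMPLE)
    for (cpu, event), count in sorted(events.items()):
        print(f"CPU{cpu}: {event} x{count}")
    for board, count in sorted(boards.items()):
        print(f"Memory Module Board {board}: {count} callout(s)")
```

Keep in mind that the same log also prints "Syndrome 0x3 indicates that this may not be a memory module problem" for each uncorrectable error, so a tally like this narrows down where to look rather than proving the modules are bad.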

I would suggest replacing this bank of memory.

Depending on your OS, you could look into running cediag (a tool to weed out bad memory), which is downloadable from Sun's download page. I have never run it on an E4500, so I am not sure whether it will work.

Thanks

cadams
 