E250 crashes and loses ip address, Help?

Status
Not open for further replies.

news2me

Technical User
May 14, 2001
US
I have an E250 running Solaris 7 with dual 400MHz CPUs and 1GB RAM. The box has crashed twice (unprovoked) in the last month. When this happens, the NIC loses its IP address, netmask and gateway. Are there any error files or logs that I can look into? This is a production machine running HP OpenView and other monitoring software.

Thank you for any help provided
 
First check the /var/adm/messages file for any panic, watchdog or reboot messages. These could give you some info.

As for the interface, look at the messages file around the time the system rebooted for any info concerning your network port.
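
A quick way to pull those lines out, as a rough sketch only. The function name scan_messages is made up for this post, and the keyword list is just the three terms mentioned above. It assumes a grep that accepts -E; the stock /usr/bin/grep on older Solaris releases may not, in which case egrep or /usr/xpg4/bin/grep should do the same job.

```shell
#!/bin/sh
# scan_messages: hypothetical helper, not a Solaris tool.
# Prints every line that mentions panic, watchdog or reboot
# (case-insensitive) from the log files given as arguments,
# prefixed with the file name and line number.
scan_messages() {
    # The extra /dev/null makes grep print the file name even
    # when only one log file is passed in.
    grep -i -n -E 'panic|watchdog|reboot' "$@" /dev/null
}

# Example: scan_messages /var/adm/messages /var/adm/messages.0
```

Passing the rotated copies (messages.0 and so on) as extra arguments covers crashes that happened before the last log rotation.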

Do you have to manually plumb the interface? You should have an /etc/hosts file and an /etc/hostname.xxx file containing the hostname, for example:

more /etc/hostname.hme0
thishostname
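
If it helps, here is a small sketch that checks both files at once. Everything in it is an assumption for illustration: the name check_if_config is made up, the second argument is only there so the check can be tried against a scratch directory, and on Solaris the awk call likely needs to be nawk, since the old /usr/bin/awk does not take -v.

```shell
#!/bin/sh
# check_if_config: hypothetical helper. Verifies that /etc/hostname.<if>
# exists and that the name inside it has an entry in /etc/hosts.
# $1 = interface name (e.g. hme0)
# $2 = optional root prefix, only useful for trying this out safely
check_if_config() {
    ifname=$1
    root=${2:-}
    hnfile="$root/etc/hostname.$ifname"
    if [ ! -s "$hnfile" ]; then
        echo "missing or empty: $hnfile"
        return 1
    fi
    # The first word of the first line is the name to look up
    name=$(awk 'NR == 1 { print $1 }' "$hnfile")
    # Search the name fields (2nd onward) of non-comment hosts lines
    if awk -v n="$name" '!/^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                         END { exit !found }' "$root/etc/hosts"; then
        echo "ok: $ifname -> $name"
    else
        echo "no /etc/hosts entry for $name"
        return 1
    fi
}

# Example: check_if_config hme0
```

A missing hosts entry would fit with the interface coming up at boot without an address.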
 
I just looked into messages.0 and found a lot of info, but I don't understand the errors. One thing I do see is that the problem starts with a cpu0 error, and then I never see cpu0 re-initialize. Also, see what it states at the very end of the log. I have pasted the messages below:


Dec 8 16:38:08 eliot unix: WARNING: [AFT1] EDP event on CPU0 Instruction access at TL=0, errID 0x000f726e.daa3dec6
Dec 8 16:38:08 eliot AFSR 0x00000000.00400001<EDP> AFAR 0x00000000.67ffedf8
Dec 8 16:38:08 eliot AFSR.PSYND 0x0001(Score 95) AFSR.ETS 0x00 Fault_PC 0xff01ede0
Dec 8 16:38:08 eliot UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Dec 8 16:38:08 eliot unix: [AFT2] errID 0x000f726e.daa3dec6 PA=0x00000000.67ffedf8
Dec 8 16:38:08 eliot E$tag 0x00000000.0e400cff E$State: Shared E$parity 0x07
Dec 8 16:38:08 eliot unix: [AFT2] E$Data (0x38): 0xd0066014.81c7e088 *Bad* PSYND=0x0001
Dec 8 16:38:08 eliot unix: [AFT2] errID 0x000f726e.daa3dec6 AFAR was derived from E$Tag
Dec 8 16:38:08 eliot unix: NOTICE: Scheduling clearing of error on page 0x00000000.67ffe000
Dec 8 16:38:11 eliot unix: NOTICE: Previously reported error on page 0x00000000.67ffe000 cleared
Dec 8 16:38:11 eliot unix: [AFT3] errID 0x000f726e.daa3dec6 Above Error is in User Mode
Dec 8 16:38:11 eliot and is fatal: will reboot
Dec 8 16:38:11 eliot unix: WARNING: [AFT1] initiating reboot due to above error in pid 20915 (sdtperfmeter)
Dec 8 16:38:18 eliot syslogd: going down on signal 15
Dec 8 16:38:21 eliot /usr/dt/bin/ttsession[20906]: child (24101) exited due to signal 1
Dec 8 16:39:36 eliot unix: syncing file systems...
Dec 8 16:39:37 eliot unix: done
Dec 8 16:41:16 eliot unix: ^MSunOS Release 5.7 Version Generic_106541-16 64-bit [UNIX(R) System V Release 4.0]
Dec 8 16:41:16 eliot unix: Copyright (c) 1983-1999, Sun Microsystems, Inc.
Dec 8 16:41:16 eliot unix: Ethernet address = 8:0:20:d8:ed:15
Dec 8 16:41:16 eliot unix: mem = 1048576K (0x40000000)
Dec 8 16:41:16 eliot unix: avail mem = 1029095424
Dec 8 16:41:16 eliot unix: root nexus = Sun (TM) Enterprise 250 (2 X UltraSPARC-II 400MHz)

______I TOOK OUT SOME UNINTERESTING INFO HERE__________

Dec 8 16:41:25 eliot unix: PCI-device: TSI,gfxp@2, gfxp0
Dec 8 16:41:25 eliot unix: gfxp0 is /pci@1f,4000/TSI,gfxp@2
Dec 8 16:41:25 eliot unix: TSI: gfxp0 is GFX8P @ 1152x900
Dec 8 16:41:25 eliot unix: stdout is </pci@1f,4000/TSI,gfxp@2> major <153> minor <0>
Dec 8 16:41:25 eliot unix: cpu0: SUNW,UltraSPARC-II (upaid 0 impl 0x11 ver 0xa0 clock 400 MHz)
Dec 8 16:41:25 eliot unix: cpu1: SUNW,UltraSPARC-II (upaid 1 impl 0x11 ver 0xa0 clock 400 MHz)
Dec 8 16:41:25 eliot unix: cpu 1 initialization complete - online
Dec 8 16:41:25 eliot unix: se0 at ebus0: offset 14,400000
Dec 8 16:41:25 eliot unix: se0 is /pci@1f,4000/ebus@1/se@14,400000
Dec 8 16:41:25 eliot unix: se1 at ebus0: offset 14,200000
Dec 8 16:41:25 eliot unix: se1 is /pci@1f,4000/ebus@1/se@14,200000
Dec 8 16:41:26 eliot unix: SUNW,hme0: CheerIO 2.0 (Rev Id = c1) Found
Dec 8 16:41:26 eliot unix: PCI-device: network@1,1, hme0
Dec 8 16:41:26 eliot unix: hme0 is /pci@1f,4000/network@1,1
Dec 8 16:41:29 eliot unix: SUNW,hme0: Using Internal Transceiver
Dec 8 16:41:29 eliot unix: SUNW,hme0: 100 Mbps full-duplex Link Up
Dec 8 16:41:52 eliot unix: dump on /dev/dsk/c0t0d0s4 size 1026 MB
Dec 8 16:42:01 eliot named[141]: starting. in.named BIND 8.1.2 Thu Mar 22 20:49:48 PST 2001
Dec 8 16:42:01 eliot Generic 107018-03-5.7-September 1998

Dec 8 16:42:03 eliot unix: pseudo-device: pm0
Dec 8 16:42:03 eliot unix: pm0 is /pseudo/pm@0
Dec 8 16:42:04 eliot unix: pseudo-device: devinfo0
Dec 8 16:42:04 eliot unix: devinfo0 is /pseudo/devinfo@0
Dec 8 16:42:05 eliot unix: pseudo-device: vol0
Dec 8 16:42:05 eliot unix: vol0 is /pseudo/vol@0
Dec 8 16:43:29 eliot snmpdx: unable to get my IP address: gethostbyname(eliot) failed [h_errno: host not found(1)]
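
Since the box has crashed twice, it may be worth checking whether both crashes point at the same CPU. A rough sketch, assuming the "[AFT1] <type> event on CPU<n>" wording seen in the log above; aft_events is a made-up name, not a Solaris command.

```shell
#!/bin/sh
# aft_events: hypothetical helper. Scans the given log files for
# "[AFT1] <type> event on CPU<n>" warnings and prints a count per
# CPU and event type, e.g. "1 CPU0 EDP" (count is left-padded by uniq).
aft_events() {
    cat "$@" |
        sed -n 's/.*\[AFT1\] \(.*\) event on \(CPU[0-9][0-9]*\).*/\2 \1/p' |
        sort |
        uniq -c
}

# Example: aft_events /var/adm/messages /var/adm/messages.0
```

If every event names the same CPU, that supports looking at that module first.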
 
Looks like you have a problem with CPU0. I have had problems with some 400 MHz CPUs.
You can remove it to see if the problem still occurs, or call Sun to run diagnostics on it. Too bad I.T. is not a cash business.

Luc Foata
Unix sysadmin, Oracle DBA
 
If you don't have a service contract, it is possible that the system is still under warranty.
CPU module 0 is bad.
For warranty info:
1. Get your system S/N
2. Call 1 800 usa-4sun
3. Select the warranty option
Sun will let you know whether the system is under warranty.

 
Hey tuka (or anyone else)! Stupid newbie question: where do you get the system serial number? Is there a command for it, or where is it found?
Thanks!
 
Hi,
The system serial number is printed on the back of the system, where the bar code label is.
 
Is there a way to tell which processes are running on each CPU? If cpu0 is faulty, then I want to see if it is actually working. Is there a way to shut off cpu0 without bringing down my entire box? I added a CPU about 3 months ago but am unsure whether it went in as cpu0 or cpu1. Which position is used for a single-processor box?

Thank you again for all your responses
 
psrinfo -v (will show the status of all the processors)

psradm -f 1 (will take processor 1 offline)

psradm -n 1 (will bring processor 1 back online)

Do a man on psradm and psrinfo for more info.
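
To double-check the result after a psradm -f, something like the sketch below can list whichever processors are not on-line. The name offline_cpus is made up, and it assumes the plain psrinfo output format of one line per processor, "<id> <state> since <date>"; check your own output before relying on it.

```shell
#!/bin/sh
# offline_cpus: hypothetical helper. Reads plain `psrinfo` output on
# stdin and prints the IDs of processors whose state (second field)
# is anything other than on-line.
offline_cpus() {
    awk '$2 != "on-line" { print $1 }'
}

# Example on the Solaris box: psrinfo | offline_cpus
```

No output means every processor is reported as on-line.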
 
For viewing which PIDs are using which processor, I recommend downloading and installing proctool. Go to Google and search for proctool; you will find a site for Solaris freeware.

If you have a single processor, it can be installed in either slot.
With two processors you can psradm -f one of the processors.
 