Crash on Netra T1 with no indication why?

Status
Not open for further replies.

philpe

IS-IT--Management
Apr 24, 2002
CA
Hello,

I've been having a problem with my Sun Netra T1 server for the past few months now. Every once in a while the server just goes down for no known reason. It's completely dead to the world, with no LOM or OK prompt, and the only way to get it back up is to power it down and back up. After we do that, the server comes up just fine. I searched for any indication of what happened, but there is nothing in the log files anywhere. Has anyone run across this before, or know where I can get more info on it? Thanks for your help.
Phil

System info: Sun Netra T1,
2 Disks using Sun disksuite to mirror
System is a DNS, Mail, Web server
Disk is only about 11% full
 
Have you enabled savecore? It might be able to give you a little insight. I don’t believe panic strings are logged unless savecore is enabled (I could be wrong). But having a core file could help you troubleshoot this. Even the fact that a core file is not generated would be a helpful troubleshooting fact. See man savecore for more info.
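A rough sketch of the check described above, in case it helps. It assumes savecore writes its usual unix.N / vmcore.N pairs into the directory reported by dumpadm (commonly /var/crash/<hostname>); the function name find_dumps is made up for illustration, and the directory on your box may differ.

```shell
#!/bin/sh
# Sketch: report whether a crash dump directory holds any savecore output.
# Assumption: savecore writes unix.N / vmcore.N files into the directory
# shown by dumpadm (commonly /var/crash/<hostname>).
# find_dumps is a hypothetical helper name, not a Solaris command.

find_dumps() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "no dump directory: $dir"
        return 1
    fi
    found=0
    for f in "$dir"/unix.* "$dir"/vmcore.*; do
        # An unmatched glob stays literal, so confirm the file exists.
        if [ -f "$f" ]; then
            ls -l "$f"
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no dump files in $dir"
        return 2
    fi
    return 0
}
```

Run dumpadm with no arguments first to see where dumps are supposed to go and whether savecore is enabled, then point find_dumps at that directory. A return of 2 after a crash would suggest the machine hung or lost power rather than panicked, which fits the heat or power theories.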

My gut feeling says a heat problem. Maybe a fan went out? The Netras are small and airflow is poor; having a fan go out would cause it to shut down, and it may not boot again until the temperature falls, which might take only a few minutes.

Some things to think about.

Joe
---------------------------------------
Joe Keegan - Joe@jjk3.com
SANS GSEC & GCFW
CCSE, CCNA, CCSA & Sun Certified
 
Just a shot in the dark . . . Is there any power management enabled on this server? Check /etc/rc2.d/S85power and "man pmconfig".
 
One gotcha to remember: if the drive that fails is the one with the logs on it, you won't get any failure messages.

I tried to remain child-like, all I achieved was childish.
 
I've confirmed that savecore is started at boot, and no power management is running. Again, nothing in the logs, and no core files anywhere. If a fan fails, does it go through a shutdown, and if so, wouldn't it log something?

The drives are configured as a mirror with DiskSuite, so I would assume that it would still be able to log something if one went offline.

Are there any other log files I should look at? I checked messages and syslog; is there another one I should check out?

Thanks again for your help.

Phil
 
If that Sun Alert doesn't address it... consider the memory modules. We had previously upgraded quite a few machines, and didn't follow the correct procedures for installing them. The machine would be OK at boot-up, but apparently the heat from the machine would cause the components to shift. So consider reseating them, or possibly having a Sun CSE reseat your components.
 
If a fan fails you should get error messages in your /var/adm/messages files, or you could run:
/usr/platform/sun4u/sbin/prtdiag -v

This will show you any failures in your system as well as temperatures.
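Since the box dies without logging anything, one option is to record the prtdiag readings on a schedule so the last snapshot before a hang is still on disk afterwards. A minimal sketch follows; the snapshot function name and the log path in the usage note are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: append a timestamped snapshot of a diagnostic command to a log,
# so the last readings taken before a hang survive on disk.
# snapshot is a hypothetical helper name; the log path is up to you.

snapshot() {
    log=$1
    shift
    {
        # Header line marks when this snapshot was taken.
        echo "=== `date` ==="
        # Run the remaining arguments as the command, capturing stderr too.
        "$@" 2>&1
    } >> "$log"
}
```

For example, a small script that calls snapshot /var/tmp/prtdiag.log /usr/platform/sun4u/sbin/prtdiag -v could be run from cron every few minutes. After the next crash, the tail of that log would show whether temperatures were climbing or a fan had already been flagged.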

If the machine is crashing with no error messages, it may be worth attaching a console or laptop via ttya; when the machine crashes, it should show the crash output.

Also check sunsolve.sun.com/patchpro to make sure you are on the latest patches. I would also check that the OBP of the machine is on the latest version as well.
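To see what is installed before comparing against the patch site, showrev -p lists the applied patches and prtconf -V prints the OBP version. The sketch below filters showrev -p style output for one patch; patch_revs is a made-up helper name, and it assumes each line starts with "Patch: <id>-<rev>" followed by a space, which is how showrev normally prints it.

```shell
#!/bin/sh
# Sketch: print the installed revision(s) of one patch from `showrev -p`
# style output on stdin.
# Assumption: each line begins "Patch: <id>-<rev> Obsoletes: ...".
# patch_revs is a hypothetical helper name, not a Solaris command.

patch_revs() {
    id=$1
    # Match only the leading "Patch:" field, not IDs listed under Obsoletes.
    sed -n "s/^Patch: \($id-[0-9][0-9]*\) .*/\1/p"
}
```

Usage would be along the lines of showrev -p | patch_revs NNNNNN, where NNNNNN is the base ID of whichever patch the Sun Alert or patch report names. No output means that patch is not installed.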
 
I have experienced the same problem; however, many times the system will need fsck run on the file system. Ctrl-D won't work, and I have had to remove the drive, place it in my Blade 100, and run fsck on it there. About 2 weeks ago I ran the patch updates and the problem has since vanished. The odd part was it always crashed on a Saturday :-\. But the Sun Alert ID as posted by Reaper31 ("See Sun Alert ID: 41153....should fix your problem.") does work. I simply ran the cluster patch found here on my version of Solaris. I hope this is of some help.
 
I have to agree with worldchanger: the patch did seem to correct the problem, and we've been running without a problem for about 2 months now. Thanks for all your help.

 
If possible, check the power supply on the Netra. We've had several problems with the PS and got the same indications you are referring to. The way we determined it was a power issue was to swap them out. Put a new one in and it worked fine. Just a thought.

Gary.
 