
4948s reboot themselves

Status
Not open for further replies.

Staticfactory

IS-IT--Management
Mar 1, 2005
79
CA
We're encountering a very strange problem with Catalyst 4948-10GE switches that are rebooting themselves without cause (that we can find). Nothing appears in syslog or any other error logs but Solarwinds does see them reboot and the uptime reflects the reboot as well.

Has anyone ever had this or a similar issue? We're starting to lose our hair.
 
I would begin by checking your power source to make sure it isn't going out.
 
Thanks for the reply brianinms -- definitely a good place to start. We have a group of about 16 switches that are attached to the same UPS and (so far) it's only the 4948-10GEs that reboot themselves at seemingly random intervals. Solarwinds would also report the node as down/up in the case of power loss, but instead reports that the switches are in fact rebooting.

We have also updated the IOS to the latest version on a select few to see if they behave differently and they don't, so it does not appear to be a configuration/IOS issue (that we can deduce).
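The distinction above (a node going down/up versus actually rebooting) comes down to the uptime counter resetting. Here is a rough sketch of that check, assuming you already have polled uptime values from some source; the `detect_reboots` name and the sample data are made up for illustration, and this is not a description of how Solarwinds does it internally.

```python
def detect_reboots(samples):
    """Return estimated boot times for every reboot seen in the samples.

    samples: list of (polled_at, uptime) pairs, both in seconds, in the
    order they were polled. A reboot is assumed whenever the reported
    uptime is lower than in the previous sample. Counter wrap-around is
    not handled here.
    """
    reboots = []
    previous_uptime = None
    for polled_at, uptime in samples:
        if previous_uptime is not None and uptime < previous_uptime:
            # The device came up roughly `uptime` seconds before this poll.
            reboots.append(polled_at - uptime)
        previous_uptime = uptime
    return reboots


# Hypothetical polls every 300 seconds; the third poll shows a reset counter.
samples = [(0, 86400), (300, 86700), (600, 120), (900, 420)]
reboots = detect_reboots(samples)
print(reboots)
```

A plain power or link outage that does not restart the switch would leave the uptime climbing, so it would not show up in this list.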
 
I would imagine that if there was something wrong with the switch causing it to reboot it would generate a trap. When one 4948 reboots do all of them reboot? Are they plugged into the same section in the UPS?
 
No, they reboot at different intervals, usually a few weeks to a few months apart. They are also plugged into different rack power strips on one massive UPS (along with other switch models that have no odd behavior).

I've been doing some investigating, and the switch appears to create a dump when it reboots, but I'm not well versed in reading register/memory data in hex.

The SHOW VERSION command gives us this information:

Engineering-SW-5 uptime is 4 hours, 30 minutes
System returned to ROM by reload
System restarted at 03:53:06 EDT Fri Mar 13 2009

cisco WS-C4948-10GE (MPC8540) processor (revision 5) with 262144K bytes of memory.
Processor board ID FOX11030PG0
MPC8540 CPU at 667Mhz, Fixed Module
Last reset from Reload
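
With 16 switches to watch, it may help to pull the reset-related lines out of the SHOW VERSION output automatically and compare them across devices. The following is only a sketch: it assumes the output has already been captured as text (how you collect it is up to you), and the `parse_reset_info` helper and its field names are invented for this example. As far as I know, IOS normally reports "power-on" in these fields after a power loss, so "reload" here would point away from the UPS, but treat that as something to verify.

```python
import re

# Output pasted from the post above.
SHOW_VERSION = """\
Engineering-SW-5 uptime is 4 hours, 30 minutes
System returned to ROM by reload
System restarted at 03:53:06 EDT Fri Mar 13 2009

cisco WS-C4948-10GE (MPC8540) processor (revision 5) with 262144K bytes of memory.
Processor board ID FOX11030PG0
MPC8540 CPU at 667Mhz, Fixed Module
Last reset from Reload
"""


def parse_reset_info(text):
    """Extract the hostname, uptime and reset reason lines from the text.

    Fields that are not found are returned as None.
    """
    patterns = {
        "hostname": r"^(\S+) uptime is ",
        "uptime": r"^\S+ uptime is (.+)$",
        "returned_to_rom_by": r"^System returned to ROM by (.+)$",
        "restarted_at": r"^System restarted at (.+)$",
        "last_reset_from": r"^Last reset from (.+)$",
    }
    info = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text, re.MULTILINE)
        info[key] = match.group(1).strip() if match else None
    return info


info = parse_reset_info(SHOW_VERSION)
for key, value in info.items():
    print(f"{key}: {value}")
```

Collecting these fields after each unexpected restart would at least show whether every affected 4948 reports the same reset reason.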
 
I would open a TAC case and submit the dump to the engineer and they can decipher it.
 
Wish I could, but we don't have TAC support on these switches (or I wouldn't be bugging you guys) ;) Wasn't my idea to not have the coverage either...
 
Well, without a valid SMARTnet contract you aren't entitled to software updates. :) If it's a memory dump, I would have to say that it is something either software or network related. Perhaps something in the network is overflowing the buffers and causing the switch to reboot.
 
We wondered if maybe it was an overflow issue but can't find anything that would lead us to believe it, or indicate the source of the problem. The only thing that these switches have in common besides being 4948s is that they are all server switches, and it's mostly HP servers behind them (but not all).

As for the support contract, we purchased our devices and support through a reseller (Bell) so we have access to download all of our updates but have only select devices (by S/N) on file for TAC support. I don't know exactly how it works as I was not here when the contract was established but am pushing to get ALL of our core gear under the service contract, and directly with Cisco.
 