
Passport CPU at 100%

Status
Not open for further replies.
Feb 1, 2001
US
We have several Passport 1150s and 8600s. Monitoring their CPU usage with Solarwinds, we see the CPU jump to 100% repeatedly throughout the day, yet these devices have barely any traffic going through them at all, let alone during these times.

Is this just bad data from the MIB, or do you think it's really happening? I've stood in front of our 8600 and seen the CPU lights actually go up to the 100% mark at times.

Thoughts?
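
One way to separate "bad MIB data" from a real control-plane load is to line up the polled CPU readings with the traffic readings for the same intervals. The following is a minimal sketch, assuming the poller can export samples as (timestamp, CPU %, link utilisation %) tuples. The tuple layout, the thresholds, and the sample values are illustrative assumptions, not Solarwinds output.

```python
def spikes_without_traffic(samples, cpu_threshold=90.0, traffic_threshold=5.0):
    """Return timestamps where CPU is high while link utilisation is low.

    samples: iterable of (timestamp, cpu_percent, traffic_percent) tuples.
    The layout is hypothetical; adapt it to whatever the poller exports.
    """
    return [
        ts
        for ts, cpu, traffic in samples
        if cpu >= cpu_threshold and traffic <= traffic_threshold
    ]


# Made-up sample data for illustration only.
samples = [
    ("09:00", 12.0, 1.5),
    ("09:05", 100.0, 1.2),   # high CPU, almost no traffic
    ("09:10", 15.0, 2.0),
    ("09:15", 100.0, 40.0),  # high CPU, but traffic is also up
]

print(spikes_without_traffic(samples))
```

Spikes that show up with almost no traffic point toward something on the switch itself (housekeeping, routing protocol processing, management commands) rather than forwarded load.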
 
I've noticed the same thing with MRTG and our front lights. I've always assumed it was some housekeeping function that runs periodically. In any case, as I understand it, the distributed forwarding architecture of the 8600 ASICs means that the main CPU doesn’t have anything to do with individual packet forwarding.
 
MULTICAST and BROADCAST storms wreak havoc on the 8600's CPU. Ours is generally caused by OSPF which uses multicast addressing to propagate information. Something in the OSPF process on these boxes isn't quite right and once it goes totally ballistic on you a re-boot is the only cure. I have worked the 8600 for 5 years now and never been able to get rid of this nagging issue.

If you're using OSPF...
 
We are. How did you come to the conclusion that this was caused by OSPF?
 
I use OSPF extensively for both WAN backup paths and LAN backbone links, about 150 subnets on 20 8600s and a dozen Cisco WAN routers. We've never had issues. Is there anything unusual about your network or its design?
 
It seems that disabling a static route is the most common cause. Sometimes, within a few moments of setting a static to inactive, the CPU will go to 100% and all connectivity is hosed. Can't reach the Passport in any manner, even via the console port. The OSPF neighbor states on connected routers disappear, and a reboot is the only way to clear it up.

Anyway, just something for you to think about because, despite the flawless 20 8600's above, this issue does exist and will cripple your network should it happen to you.

Also, as I said above, multicast will kill the 8600 quicker than you might think. We had a PC person pushing an image to 15 PCs using multicast and it put the 8600 in the dirt: CPU 100%, no connectivity, done, gonzo, kaput. When we found him and stopped the transfer, everything returned to normal.
 
Have not seen that kind of CPU usage with Solarwinds.
If you check the individual ports, what do they indicate?

Rick Harris
SC Dept of Motor Vehicles
Network Operations
 
Nettekkie, have you talked to your Nortel reps about the issues you've seen? I bet they'd be interested in what factors are contributing to the problems you're seeing; we've had good luck getting their developers' attention and support.

I'd be interested in more details on your multicast issues as well. We've tested Ghost's multicast features without issues, but the PC guys never implemented it... if they ever do I'd like to understand what issues you had so I know what to watch for.

 
Are you using IP RIP and/or IPX RIP? If so you will see spikes when the RIP updates occur, especially if you have a large number of VLANs.

We had similar spikes but after moving to 3.7 code, the spikes have been reduced.
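
If the spikes come from periodic updates (RIP, for example, sends updates every 30 seconds by default), the gaps between spikes should be close to constant. Below is a rough sketch for checking that, assuming you have spike times as seconds since some starting point. The function names, the 10% tolerance, and the sample timestamps are made up for illustration.

```python
from statistics import mean, pstdev


def spike_interval_stats(spike_times):
    """Return (mean_gap, stdev_gap) in seconds between consecutive spikes."""
    gaps = [b - a for a, b in zip(spike_times, spike_times[1:])]
    if not gaps:
        raise ValueError("need at least two spike timestamps")
    return mean(gaps), pstdev(gaps)


def looks_periodic(spike_times, tolerance=0.1):
    """True if the gaps vary by no more than `tolerance` of the mean gap."""
    mean_gap, stdev_gap = spike_interval_stats(spike_times)
    return stdev_gap <= tolerance * mean_gap


# Hypothetical spike times in seconds.
regular = [0, 30, 60, 91, 120]
irregular = [0, 5, 100, 130, 400]

print(looks_periodic(regular), looks_periodic(irregular))
```

A regular interval suggests a timer-driven process such as routing updates or housekeeping; irregular spikes are more consistent with storms, worm traffic, or someone running commands on the box. Note that the polling interval has to be shorter than the period you are looking for.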
 
Another thing that can bring the CPU load to 100% is, of course, a spanning tree update.

-Cheers, Niklas
 
Nachi, Blaster, etc. crippled ours a while back; just 3 PCs were flooding the network with ICMP to non-existent addresses. Check to see how much of this traffic you are handling.

Stu..

Only the truly stupid believe they know everything.
Stu.. 2004
 

There is a known bug in Passport code that exists at least up to 3.2.2.2 code... (the Accelars have had that problem in their code as well).

Something as simple as "show config"...or "show log file tail" can wreak havoc. I've had the box hit 100% util at 3:00 AM when there is no traffic just by showing the config.

Upgrade to 3.5.5.0 code, but bear in mind...the PP8600 requires 256 Meg of RAM to run anything beyond 3.3.x.x code. There is newer code, 3.7.x.x...but I haven't done much testing on that and there are a bunch of new features, hardware supported, etc. Best to stay with 3.5.x.x unless someone else tells you otherwise.
 
Our pair of 8600s, when installed new 1.5 years ago, ran at 50% CPU right away. The Nortel guys said that based on the network load it should not. They flew some expert dude in and in 15 minutes he found it. It was a network attached storage device that was attached redundantly to both 8600s, and the 8600s did not have MLT enabled. Pulled the plug on 1 uplink and the CPU went to 1%. Configured MLT and it worked fine. I always catch our server admins plugging both NIC interfaces from their new Dell servers into 2 ports on the 470s in the racks and not telling me...
 