
excessive router load

Status
Not open for further replies.

CCNEH

Technical User
Sep 9, 2003
47
0
0
GB
A few routers are running out of memory and hanging/running slow since a virus outbreak which is now cured.

How can I see what is causing the hangs? Would a packet sniffer help?

 
What are the CPUs running at now? If they are still high, you may not have gotten all the bad guys.
You can use NetFlow on the interfaces to see who is doing what, for example one address going to many different addresses. This was our main troubleshooting tool during the Nachi outbreak over the summer.
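
For anyone who exports their flows to a file, the "one address going to many different addresses" check can be sketched in a few lines of Python. This is only a rough sketch: the (source, destination) pairs, the sample addresses, the `find_fanout_sources` name and the threshold of 100 are all made up for illustration, and real NetFlow records carry many more fields.

```python
from collections import defaultdict

def find_fanout_sources(flows, min_destinations=100):
    """Return {source: distinct destination count} for sources that
    talk to at least min_destinations different addresses."""
    destinations = defaultdict(set)
    for src, dst in flows:
        destinations[src].add(dst)
    return {
        src: len(dsts)
        for src, dsts in destinations.items()
        if len(dsts) >= min_destinations
    }

# Hypothetical flow data: one host sweeping 500 addresses,
# and one host talking to a single server 300 times.
sample_flows = [
    ("10.1.1.50", f"10.2.{i // 250}.{i % 250 + 1}") for i in range(500)
]
sample_flows += [("10.1.1.7", "10.9.9.9")] * 300

suspects = find_fanout_sources(sample_flows, min_destinations=100)
print(suspects)
```

Counting distinct destinations rather than total flows is the point here: a busy but healthy client produces many flows to few addresses, while a scanning host produces flows to many addresses.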
 
I used the logging function on the access point, in this case a firewall, to help pinpoint the baddies on the inside LAN that still had the Nachi virus. This may not be an ideal solution, as the routers are already under pressure, but check the CPU again and it may provide a way to mop up the stragglers.
 
I wouldn't think the router should run out of memory from throughput alone. Are these your edge routers having the problem? Maybe you're taking on a lot more routes than before.
 
Baddos, you might think not, but various things can kill a router under very heavy load. Several versions of Cisco IOS have a bug where heavy loads of small packets, such as traffic from a Citrix/MetaFrame farm, cause the router to die with buffer errors. You will see a very high discard rate on the small buffers, and after a period of time the router stops routing.

NAT is a second error-prone piece of code. Having a large number of EIGRP routes in memory can also kill a router.

These are all examples that I have run into over the last few years. It can take a lot to kill a router, but it's doable.

MikeS

Find me at
"Take advantage of the enemy's unreadiness, make your way by unexpected routes, and attack unguarded spots."
Sun Tzu
 
I had the same problem. I cleaned everything up and still had router hangs and high CPU.

I installed Snort and found about 20 machines that had not been cleaned. I still find infected machines once in a while.

 
The easiest way to find the Nachi virus is to look at your NAT tables. I had one computer infected with it, and in a day or so my router was bogged down with about 40,000 ICMP NAT entries from a single address.

show ip nat stat
show ip nat trans

Anyway, I assume you have already run the "show proc mem" command? That should tell you where the memory on the router is being allocated. A "show proc cpu" will also give you hints. That is all assuming you are not using some sort of management software to monitor the routers.

If it is your NAT, then simply trace the infected computers by their IP addresses. If you ever want to flush the dynamic NAT table, use the "clear ip nat trans *" command.
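
If the table is too long to eyeball, the output can be pasted into a short script that counts ICMP entries per inside address. This is a rough sketch only: it assumes the usual five-column layout (Pro, Inside global, Inside local, Outside local, Outside global), the sample lines and addresses below are invented, and `count_icmp_by_inside_local` is just a name picked for the example. Check the column order against your own IOS version.

```python
from collections import Counter

def count_icmp_by_inside_local(nat_output):
    """Count ICMP translation entries per inside local address.

    Assumes each entry line looks like:
    icmp <inside global> <inside local> <outside local> <outside global>
    """
    counts = Counter()
    for line in nat_output.splitlines():
        fields = line.split()
        # Skip the header, blank lines and non-ICMP entries.
        if len(fields) < 3 or fields[0].lower() != "icmp":
            continue
        inside_local = fields[2].split(":")[0]  # drop the ICMP identifier
        counts[inside_local] += 1
    return counts

# Invented sample output, not taken from a real router.
SAMPLE = """\
Pro Inside global      Inside local       Outside local      Outside global
icmp 203.0.113.5:512   192.168.13.25:512  198.51.100.1:512   198.51.100.1:512
icmp 203.0.113.5:513   192.168.13.25:513  198.51.100.2:513   198.51.100.2:513
icmp 203.0.113.5:514   192.168.13.25:514  198.51.100.3:514   198.51.100.3:514
tcp 203.0.113.5:1067   192.168.13.40:1067 198.51.100.9:80    198.51.100.9:80
icmp 203.0.113.5:600   192.168.13.40:600  198.51.100.9:600   198.51.100.9:600
"""

counts = count_icmp_by_inside_local(SAMPLE)
for address, entries in counts.most_common():
    print(address, entries)
```

An inside address with thousands of ICMP entries sits at the top of the list and is the first machine to go and look at.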
 
Right,

I've had our service provider check out the link. It's a 128k serial link, and the users at that site keep losing the connection and having to reconnect.

If I ping the remote router...

Reply from 192.168.13.1: bytes=32 time=16ms TTL=254
Reply from 192.168.13.1: bytes=32 time=15ms TTL=254
Reply from 192.168.13.1: bytes=32 time=31ms TTL=254
Reply from 192.168.13.1: bytes=32 time=16ms TTL=254
Reply from 192.168.13.1: bytes=32 time=16ms TTL=254
Reply from 192.168.13.1: bytes=32 time=15ms TTL=254
Reply from 192.168.13.1: bytes=32 time=16ms TTL=254
Reply from 192.168.13.1: bytes=32 time=15ms TTL=254
Reply from 192.168.13.1: bytes=32 time=31ms TTL=254
Reply from 192.168.13.1: bytes=32 time=16ms TTL=254
Reply from 192.168.13.1: bytes=32 time=16ms TTL=254
Reply from 192.168.13.1: bytes=32 time=31ms TTL=254
Reply from 192.168.13.1: bytes=32 time=203ms TTL=254
Reply from 192.168.13.1: bytes=32 time=281ms TTL=254
Reply from 192.168.13.1: bytes=32 time=343ms TTL=254
Reply from 192.168.13.1: bytes=32 time=406ms TTL=254
Reply from 192.168.13.1: bytes=32 time=265ms TTL=254
Reply from 192.168.13.1: bytes=32 time=94ms TTL=254
Reply from 192.168.13.1: bytes=32 time=16ms TTL=254
Reply from 192.168.13.1: bytes=32 time=16ms TTL=254
Reply from 192.168.13.1: bytes=32 time=15ms TTL=254
Reply from 192.168.13.1: bytes=32 time=16ms TTL=254

The longer ms times concern me, but I don't know how to isolate the problem. Also, if more than one user connects, they get kicked out.
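
To put a number on those longer times, the ping output can be run through a small script that compares each reply against the median. A rough sketch, assuming Windows-style "time=16ms" reply lines like the ones quoted above; the `find_latency_spikes` name and the factor of 5 are arbitrary choices for illustration, not any kind of standard.

```python
import re
from statistics import median

# Matches "time=16ms" and also "time<1ms" in Windows-style ping replies.
REPLY_TIME = re.compile(r"time[=<](\d+)ms")

def find_latency_spikes(ping_output, factor=5):
    """Return (spikes, baseline), where baseline is the median reply time
    and spikes lists (reply number, ms) for replies above factor * baseline."""
    times = [int(m.group(1)) for m in REPLY_TIME.finditer(ping_output)]
    if not times:
        return [], None
    baseline = median(times)
    spikes = [
        (number, ms)
        for number, ms in enumerate(times, start=1)
        if ms > factor * baseline
    ]
    return spikes, baseline

# Reply times from the ping run in this post, rebuilt into reply lines.
times_ms = [16, 15, 31, 16, 16, 15, 16, 15, 31, 16, 16,
            31, 203, 281, 343, 406, 265, 94, 16, 16, 15, 16]
ping_output = "\n".join(
    f"Reply from 192.168.13.1: bytes=32 time={ms}ms TTL=254" for ms in times_ms
)

spikes, baseline = find_latency_spikes(ping_output)
print("baseline ms:", baseline)
print("spikes:", spikes)
```

On the 22 replies above, the median is 16 ms and replies 13 through 18 are flagged, so the slow replies come as one burst of six rather than being scattered through the run.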

 
Use traceroute. It works much better, as you get a time for each hop, which will help you isolate the problem.

MikeS


 
