Cisco 3640 grinding to a halt 2

medtek · Mar 17, 2005

I work for a medical company, and we are using a Cisco 3640 with IOS 12.2 as our main VPN / internal NAT router.
About 4-5 hours after a reboot, the router slows to a screeching halt: no ping, but I can still telnet in. The delay when entering commands over telnet is 1-3 seconds. I have tried the sh ip cache command to see if we are being flooded internally, but can't find anything abnormal. Is there a way to log internal traffic? We have 2 subnets: 172.20.1.x and 192.9.200.x.
Any help is appreciated.
Thanks,
Kent

JOAMON · Mar 17, 2005

Check to make sure your default route to the Internet points to the next-hop router's IP address and not out an interface. If it points out an Ethernet interface, the router has to ARP for every destination, and the ARP table can grow so large that it becomes a problem.

Do not use: ip route 0.0.0.0 0.0.0.0 ethernet0/1
Use instead: ip route 0.0.0.0 0.0.0.0 65.200.3.4 (next hop)

Do a show arp. You should see only internal addresses, plus the external addresses that belong to the routable IP subnet on the outside interface.
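
For anyone who would rather check a saved config than eyeball it, here is a minimal Python sketch of that test. It assumes the running-config has been saved to text; the function name and the sample config (including the 10.1.0.0 route) are made up for illustration, and only the two default-route forms quoted above come from this thread.

```python
import re

# Matches "ip route <prefix> <mask> <everything else>" in a saved running-config.
ROUTE_RE = re.compile(r"^\s*ip route\s+(\S+)\s+(\S+)\s+(.+)$")
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def interface_only_default_routes(config_text):
    """Return default-route lines that name no next-hop IP address."""
    flagged = []
    for line in config_text.splitlines():
        m = ROUTE_RE.match(line)
        if not m:
            continue
        prefix, mask, rest = m.groups()
        if prefix != "0.0.0.0" or mask != "0.0.0.0":
            continue  # only the default route is of interest here
        # No dotted-quad token after the mask means the route points out an interface only.
        if not any(IPV4_RE.match(tok) for tok in rest.split()):
            flagged.append(line.strip())
    return flagged


# Hypothetical config excerpt, not taken from the router in this thread.
sample_config = """\
ip route 0.0.0.0 0.0.0.0 ethernet0/1
ip route 10.1.0.0 255.255.0.0 172.20.1.254
"""
print(interface_only_default_routes(sample_config))
```

A default route that carries a next-hop address, such as the 65.200.3.4 form above, is not flagged.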

JOAMON · Mar 17, 2005

Also take a look at your CPU usage

Do a show process cpu history and see if your CPU is hitting 100%.

Depending on the number of users and VPN connections, you may need to consider upgrading to a 3700 series.

I have also seen cases where syslog was set to the router itself instead of an external syslog device, and once the log reached a certain size it caused performance problems as well.

medtek · Mar 17, 2005

i did that... thanks. this is what it came up with..

       111111111         11111     1111                   1111
    9990000000009999999990000099999000099999999999999999990000
    9990000000009999999990000099999000099999999999999999990000
100 **********************************************************
 90 **********************************************************
 80 **********************************************************
 70 **********************************************************
 60 **********************************************************
 50 **********************************************************
 40 **********************************************************
 30 **********************************************************
 20 **********************************************************
 10 **********************************************************
   0....5....1....1....2....2....3....3....4....4....5....5....
             0    5    0    5    0    5    0    5    0    5
               CPU% per second (last 60 seconds)

    1111111111111111111111111 11111
    000000000000000000000000090000099999888777766558
    000000000000000000000000090000098932832774261968
100 #########################*********
 90 ################################*#***          *
 80 ###################################******      *
 70 #######################################*****   *
 60 ###########################################*****
 50 ##############################################**
 40 ###############################################*
 30 ################################################
 20 ################################################
 10 ################################################
   0....5....1....1....2....2....3....3....4....4....5....5....
             0    5    0    5    0    5    0    5    0    5
               CPU% per minute (last 60 minutes)
              * = maximum CPU%   # = average CPU%

100
 90
 80
 70
 60
 50
 40
 30
 20
 10
   0....5....1....1....2....2....3....3....4....4....5....5....6....6....7.
             0    5    0    5    0    5    0    5    0    5    0    5    0
                   CPU% per hour (last 72 hours)
              * = maximum CPU%   # = average CPU%

JOAMON · Mar 17, 2005

Wow.....

Last 60 minutes shows huge CPU usage.
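
A note on reading that graph: the digit rows above the bars are read top to bottom, one column per minute, so a column of 9 over 8 means a peak of 98%. Below is a small Python sketch that decodes such rows. The function name is made up for illustration; the two strings are the rightmost 17 columns of the tens and units rows from the per-minute graph posted above, where the hundreds row is blank.

```python
def decode_history(hundreds, tens, units):
    """Combine vertically stacked digit rows into one integer per column."""
    width = max(len(hundreds), len(tens), len(units))
    values = []
    for column in zip(hundreds.ljust(width), tens.ljust(width), units.ljust(width)):
        digits = "".join(column).strip()  # a blank hundreds digit just drops out
        if digits:
            values.append(int(digits))
    return values


# Rightmost 17 columns of the per-minute graph in this thread.
tens_tail = "99999888777766558"
units_tail = "98932832774261968"

print(decode_history("", tens_tail, units_tail))
# -> [99, 98, 99, 93, 92, 88, 83, 82, 77, 77, 74, 72, 66, 61, 59, 56, 88]
```

As far as I know the left edge of the graph is the most recent sample, so these right-hand columns are the oldest minutes and the load climbs toward 100% as you move left.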

What did the show arp report?

What does your default route to the Internet look like?

medtek · Mar 17, 2005

IrvingVPN#sh arp
Protocol  Address          Age (min)  Hardware Addr   Type  Interface
Internet  192.9.200.38             -  00b0.642f.1b91  ARPA  Ethernet1/0
Internet  172.20.1.254             1  0002.b3a4.7940  ARPA  Ethernet1/1
Internet  172.20.1.63              1  0004.0092.7c04  ARPA  Ethernet1/1
Internet  172.20.1.38              -  00b0.642f.1b92  ARPA  Ethernet1/1
Internet  192.9.200.220            1  0002.a553.3cb9  ARPA  Ethernet1/0
Internet  172.20.1.89              1  0001.2934.3115  ARPA  Ethernet1/1
Internet  192.9.200.250            0  0006.5b0e.4ed3  ARPA  Ethernet1/0

medtek · Mar 18, 2005

If I reboot it, it runs great for about an hour or so... then it bogs down. I don't think it has to do with routing, since it even lags from the inside of our network. I don't have a console cable to hook up to it.

medtek · Mar 18, 2005

I have monitored this router over the course of the evening, and can find nothing wrong with it other than the increased cpu usage. Any ideas on what could be doing this?

tecnikall · Mar 18, 2005

Hello there, can you capture the output of sh proc cpu and show mem sum? I will have a look at it. High CPU can be caused by interrupts, a process, or running out of main memory. A DoS attack can cause high CPU too. Cheers.
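
To tell interrupt load from process load, the first line of sh proc cpu is the place to look: it reads "five seconds: X%/Y%", where X is total CPU and Y is the share spent at interrupt level. The Python sketch below splits that line. The function name is made up, and the sample header is hypothetical, not output from the router in this thread.

```python
import re

# Header line printed at the top of "show processes cpu".
HEADER_RE = re.compile(
    r"CPU utilization for five seconds:\s*(\d+)%/(\d+)%;"
    r"\s*one minute:\s*(\d+)%;\s*five minutes:\s*(\d+)%"
)


def split_cpu_load(show_proc_cpu_text):
    """Split the five-second figure into interrupt-level and process-level load."""
    m = HEADER_RE.search(show_proc_cpu_text)
    if m is None:
        raise ValueError("no CPU utilization header found")
    total, interrupt, one_min, five_min = map(int, m.groups())
    return {
        "total_5s": total,
        "interrupt_5s": interrupt,
        "process_5s": total - interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


# Hypothetical header line, for illustration only.
sample = "CPU utilization for five seconds: 99%/87%; one minute: 98%; five minutes: 97%"
load = split_cpu_load(sample)
print(load)
if load["interrupt_5s"] > load["process_5s"]:
    print("mostly interrupt-level load (packet switching)")
else:
    print("mostly process-level load; check the per-process rows")
```

A high interrupt share usually points at traffic volume rather than a single runaway process, which is worth knowing before digging through the per-process rows.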

JOAMON · Mar 18, 2005

Has anything recently changed on your network or with services on your router? Anything new like voice on the network, firewalling, QoS, a new server, things like that? Also, you mentioned that you have two internal subnets and that the router supports VPN connections. About how many internal users are there, how many VPNs, and how many users on each VPN?
Post the info that tecnikall asked about. The sh proc cpu output will break CPU usage down by process.
It is quite possible that you have reached the point where you have outgrown your router.

vipergg · Mar 18, 2005

You may have someone who is infected, and if that machine is trying to send to unknown addresses, every single packet is going to hit the CPU. A 3600 should have no problem handling 2 subnets under normal circumstances. I would tell you to use NetFlow to see who is causing all the traffic, but seeing that your CPU is already so high, this might drive it over the edge. If you have manageable switches below the router, you may be able to track it that way: clear the counters and look for anyone who is transmitting large amounts of unicast or broadcast traffic. If you don't have a network analyzer you can use, you may have to start pulling users one at a time.

rtfmdude · Mar 18, 2005

I remember when I worked on the TAC I had a customer who had a few Slammer (or something equally bad) hosts on the inside. He was doing NAT on the box, and after an hour or so the 2500 (or whatever, it was a low-end box) would run out of memory/NAT translations or something. A clear ip nat trans * would take care of it temporarily. You might want to clear out the NAT translation table, then keep doing a 'show ip nat trans' and monitor how many you've got. I think one clue was that the outside addresses were all consecutive addresses, and Slammer pings consecutive IP addresses until it gets a response.

BuckWeet · Mar 18, 2005

i think rtfmdude probably hit the nail on the head!

medtek · Mar 18, 2005

sh ip nat trans yields a certain IP on our network, 172.20.1.89 (which is one of our ISP's routers), trying to hit a ton of different addresses. Here is just a little snippet; there are pages and pages of this.

tcp 216.201.222.115:2353 172.20.1.89:2353 203.61.52.156:135 203.61.52.156:135
tcp 216.201.222.115:11225 172.20.1.89:1974 203.172.81.133:135 203.172.81.133:135
tcp 216.201.222.115:3559 172.20.1.89:3559 203.93.175.198:135 203.93.175.198:135
tcp 216.201.222.115:2530 172.20.1.89:2530 203.152.27.144:135 203.152.27.144:135
tcp 216.201.222.115:1226 172.20.1.89:1226 203.61.129.4:135 203.61.129.4:135
tcp 216.201.222.115:4419 172.20.1.89:4419 203.69.92.131:135 203.69.92.131:135
tcp 216.201.222.115:2984 172.20.1.89:2984 203.61.106.38:135 203.61.106.38:135
tcp 216.201.222.115:2594 172.20.1.89:2594 203.93.147.140:135 203.93.147.140:135
tcp 216.201.222.115:8971 172.20.1.89:1431 203.189.226.67:135 203.189.226.67:135
tcp 216.201.222.115:3217 172.20.1.89:3217 203.129.192.249:135 203.129.192.249:135
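
With pages of output like this, counting by hand gets old fast. The Python sketch below tallies, for each inside host and destination port, how many distinct outside addresses it is talking to. It assumes the five-column layout shown in the snippet (protocol, inside global, inside local, outside local, outside global); the function name and the min_targets threshold are made up for illustration, and the sample is the first five lines of the listing above.

```python
from collections import defaultdict


def scan_suspects(nat_output, min_targets=5):
    """Map (inside host, destination port) to its count of distinct outside hosts.

    Only pairs that reach at least min_targets outside hosts are returned.
    """
    targets = defaultdict(set)
    for line in nat_output.splitlines():
        fields = line.split()
        # Expect: proto, inside global, inside local, outside local, outside global
        if len(fields) != 5 or fields[0] not in ("tcp", "udp"):
            continue
        inside_ip, _, _ = fields[2].rpartition(":")
        outside_ip, _, outside_port = fields[4].rpartition(":")
        if not inside_ip or not outside_port.isdigit():
            continue  # skips static entries shown as "---"
        targets[(inside_ip, int(outside_port))].add(outside_ip)
    return {key: len(hosts) for key, hosts in targets.items() if len(hosts) >= min_targets}


# First five lines of the snippet posted above.
sample = """\
tcp 216.201.222.115:2353 172.20.1.89:2353 203.61.52.156:135 203.61.52.156:135
tcp 216.201.222.115:11225 172.20.1.89:1974 203.172.81.133:135 203.172.81.133:135
tcp 216.201.222.115:3559 172.20.1.89:3559 203.93.175.198:135 203.93.175.198:135
tcp 216.201.222.115:2530 172.20.1.89:2530 203.152.27.144:135 203.152.27.144:135
tcp 216.201.222.115:1226 172.20.1.89:1226 203.61.129.4:135 203.61.129.4:135
"""
print(scan_suspects(sample))
# -> {('172.20.1.89', 135): 5}
```

One inside address hitting many unrelated outside hosts on the same port is the scanning pattern to look for; a sensible threshold depends on how busy the network normally is.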

rtfmdude · Mar 18, 2005

Well, you've got an awful lot of tcp port 135 translations in there... and as we all know, tcp port 135 = bad.

I think this might be symptomatic of Nachi or something. Can you pull that box (172.20.1.89) off the network and virus scan it?

medtek · Mar 18, 2005

I did an nbtstat -a 172.20.1.89
and it is pointing to a local machine, not a router. That is strange. I'm going to go set that box on fire.
Thanks for all of your help, you have been great!

JOAMON · Mar 19, 2005

What did you find on the firebox???????

medtek · Mar 20, 2005

I found several boxes infected with the linkbot.h worm variant, so I cleaned and patched them. Now my network is running like a dream.
Again, thanks a ton for all your help.
I've been using
sh ip nat trans
and
nbtstat -a (infected ip)
to locate the infected machines and clean them.

rtfmdude · Mar 21, 2005

cool - so nbtstat has a use, after all! (j/k)

JOAMON · Mar 21, 2005

Do you have A/V on all your network PCs? Our company has been using Symantec Corporate Edition for 2 years now and has had excellent results. It is easy to deploy on network machines, and all machines can be managed and their settings locked from the A/V server.

Glad you found the problem.
