
Entire Wireless network died...TWICE!!!

Status
Not open for further replies.

makemorebeer

Technical User
Jun 6, 2007
96
US
OK, so on Tuesday the entire wireless network crashed. This includes every wireless device we have (2 Cisco 1400 bridges, 30 Cisco 1200 series IOS APs, and a WLSE). By the time we found out, it was too late to catch much information, but I can tell you that I was unable to get into the consoles on the few that I could reach. They all came back up after rebooting, but syslog and device logging showed nothing. Then in the middle of the night on the 6th -> 7th it happened again. I got a better look at it this time. It looked like 3 APs attached to a single switch went down, followed shortly after by four APs attached to another switch. Within an hour the entire wireless network was down and I was on a warehouse roof at 3am rebooting bridges. The odd thing is that nothing else was having issues; it was only the wireless devices. I can't think of a single thing that might do that. Anyone got any ideas what the heck is going on?
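
For anyone trying to reconstruct a spread like this (APs on one switch first, then APs on another), here is a minimal Python sketch that orders "AP went down" events by upstream switch. The event tuples, the switch names, and the function name `outage_timeline` are all hypothetical; nothing like this was run during the outage, and the timestamps would have to come from whatever monitoring is actually available.

```python
from collections import defaultdict
from datetime import datetime


def outage_timeline(events):
    """Group AP-down events by upstream switch, ordered by first failure.

    events: iterable of (iso_timestamp, ap_name, switch_name) tuples.
    Returns a list of (switch_name, [ap_name, ...]) pairs. Switches are
    ordered by their earliest failure, APs by their own failure time.
    """
    by_switch = defaultdict(list)
    for stamp, ap, switch in events:
        by_switch[switch].append((datetime.fromisoformat(stamp), ap))

    # Sort switches by the time their first AP dropped.
    ordered = sorted(by_switch.items(), key=lambda item: min(item[1]))
    return [(switch, [ap for _, ap in sorted(downs)]) for switch, downs in ordered]


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        ("2000-01-02T01:20:00", "AP014", "switch-b"),
        ("2000-01-02T01:05:00", "AP010", "switch-a"),
        ("2000-01-02T01:07:00", "AP011", "switch-a"),
    ]
    for switch, aps in outage_timeline(sample):
        print(switch, aps)
```

If the order consistently follows the wired topology rather than radio coverage, that would point away from RF and toward something reaching the APs over the LAN.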
 
All that comes to mind is some sort of powerful RF interference in or around the 2.4GHz frequency band...

ROGER - G0AOZ.
 
I was worried that's what someone would say, but it took out both our main office and our remote site, which is 11 miles away. Is interference like that even possible? Not to mention, interference doesn't explain why I couldn't get a console on the devices, does it?
 
Taking out two sites 11 miles apart simultaneously doesn't sound like RF interference, unless someone is deliberately targeting your organisation. That is probably very unlikely...

Did you have any bad electrical storms on those occasions?

I'm afraid I don't have any other ideas...

ROGER - G0AOZ.
 
Well, it's the middle of winter here in Wisconsin, so electrical storms are out for at least another two months. I didn't think RF interference could be the issue, what with the distance between the two sites. I too doubt anyone is targeting us for anything. Thanks for taking a look though.
 
makemorebeer

You might want to consult your user manuals and check whether your wireless systems use upstream or downstream communication between devices.

In other words, they are in constant communication with each other, and if one or more drop, the others back off the network. It's kind of a security (pain) issue.

I have APs like that, and if one goes, the others shut down within a certain time period. A reboot of the devices gets them back to communicating with each other.

Good luck tracking it down.

 
That was an idea we came up with on Friday: that maybe there is some security feature we are unaware of. What kind of APs are you using that do this? Do you know if Cisco 1200s do this?
 
makemorebeer

Sweet handle,

It's not so much a security issue as a communication issue. They send beacons to each other asking "Hey, you there?" and then receive a reply.
Imagine you have a main AP and another one downstream; they are bridged and configured to communicate with each other.
The main AP sends out its beacon, "Hey, you there?", but receives no reply. The main AP now realizes that there is a problem and tries to find another way to send traffic (hopefully it has been programmed to do so). If there is no second path, the systems start dropping all communications and a chain reaction starts. If there is a second pathway, the main AP will switch and send all traffic to the other AP. But since you lost all communication to 3 APs, there was no path.

Yes, Cisco 1200s are intelligent and have this feature. Most APs do when set up in bridge mode; it's the way they communicate with each other.
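
To make the "no second path" idea concrete, here is a toy Python model of the behaviour described in this post: a device is treated as cut off once it has no remaining path back to the main AP. This is only a sketch of the explanation as given, not of how Cisco 1200s actually behave, and the device names and the function `reachable_from_root` are hypothetical.

```python
from collections import deque


def reachable_from_root(links, root, failed):
    """Return the set of devices that still have a path to root.

    links:  iterable of (device_a, device_b) pairs, treated as two-way.
    root:   name of the main AP.
    failed: iterable of devices that have stopped responding.
    """
    failed = set(failed)
    if root in failed:
        return set()

    # Build an adjacency map from the link list.
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Breadth-first search from the root, skipping failed devices.
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen and nxt not in failed:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

With a single chain main -> ap1 -> ap2, losing ap1 leaves only the main AP reachable; adding a second link to ap2 through another AP keeps ap2 alive, which is the redundancy argument made above.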

Your best bet is to check the switch, as it might have failed, and see if you can separate the 3 APs so they are not hanging off a single point of failure.

bob
 
Thanks, it's because I work for a brewery =) All that golden goodness.

I like your explanation, but it sounds almost like it may depend a little on the radios being configured in repeater mode. Each radio is hardwired to the switch it is connected to. The switch couldn't have failed, because the computers in the office were still working normally. It's literally only the wireless that dies. I don't think isolating the 3 APs would matter, because it happened across the entire network, jumping from segment to segment, and those segments should have no bearing on each other, especially when you consider that the APs at our remote site dropped off too. Most of my APs are running the same configuration (except the WDS APs). Here's one of my configs, in case it sheds a bit of light.

I should also note this has not recurred in the last week, and the only change we've made so far was to take our network monitor offline, as it was a suspect. The network monitor is The Dude server from MikroTik. One of our concerns was that its discovery method might be knocking the APs off, but we have no way of proving whether this is happening or not.
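
One low-tech way to gather evidence either way would be to probe an AP's management port from a separate machine while the monitor runs its discovery in the lab, and note when the port stops answering. The following is a minimal Python sketch; the function name `management_port_open` and the default of TCP port 23 (telnet) are assumptions for illustration, and it only tells you whether a TCP connection can be opened, not why it fails.

```python
import socket


def management_port_open(host, port=23, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling this in a loop against a test AP and logging the timestamp of each True-to-False change would show whether loss of management access lines up with the monitor's discovery runs.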


version 12.3
no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
service password-encryption
!
hostname AP010
!
enable secret 5 ::
!
username :: privilege 15 password 7 ::
clock timezone Central -6
clock summer-time summer-central recurring 2 Sun Mar 2:00 1 Sun Nov 2:00
ip subnet-zero
ip domain name citybrewery.com
ip name-server 10.1.x.x
ip name-server 10.1.x.x
ip name-server 10.1.x.x
!
aaa new-model
!
!
aaa group server radius rad_mac
!
aaa group server radius rad_acct
!
aaa group server radius rad_admin
!
aaa group server tacacs+ tac_admin
!
aaa group server radius rad_pmip
!
aaa group server radius rad_eap
!
aaa group server radius dummy
!
aaa authentication login default local
aaa authentication login eap_methods group rad_eap
aaa authentication login mac_methods local
aaa authentication login method_admin group admin
aaa authentication login method_radius group radius
aaa authorization exec default local
aaa accounting network acct_methods start-stop group rad_acct
aaa session-id common
dot11 network-map
!
!
bridge irb
!
!
interface Dot11Radio0
no ip address
no ip route-cache
!
encryption key 1 size 128bit 7 :: transmit-key
encryption key 2 size 128bit 7 ::
encryption key 3 size 128bit 7 ::
encryption key 4 size 128bit 7 ::
encryption mode wep mandatory
!
ssid **
authentication network-eap eap_methods
!
ssid **
authentication open
infrastructure-ssid optional
!
short-slot-time
speed basic-2.0 5.5 6.0 9.0 11.0 12.0 18.0 24.0 36.0 48.0 54.0
power local cck 30
power local ofdm 30
power client 30
channel 2412
station-role root
antenna receive right
antenna transmit right
beacon period 96
dot11 extension power native
no cdp enable
bridge-group 1
bridge-group 1 subscriber-loop-control
bridge-group 1 block-unknown-source
no bridge-group 1 source-learning
no bridge-group 1 unicast-flooding
bridge-group 1 spanning-disabled
!
interface FastEthernet0
no ip address
no ip route-cache
duplex auto
speed auto
bridge-group 1
no bridge-group 1 source-learning
bridge-group 1 spanning-disabled
!
interface BVI1
ip address 10.1.210.x 255.255.0.0
no ip route-cache
ntp broadcast client
!
ip default-gateway 10.1.x.x
ip http server
ip http authentication aaa
no ip http secure-server
ip http help-path
ip radius source-interface BVI1
logging trap debugging
logging snmp-trap emergencies
logging snmp-trap alerts
logging snmp-trap critical
logging snmp-trap errors
logging snmp-trap warnings
logging 10.1.x.x
logging 10.1.x.x
snmp-server community * RW
snmp-server community * RO
snmp-server enable traps tty
bridge 1 route ip
!
!
wlccp ap username :: password 7 **
!
line con 0
transport preferred all
transport output all
line vty 0 4
transport preferred all
transport input all
transport output all
line vty 5 15
transport preferred all
transport input all
transport output all
!
ntp clock-period 2860625
ntp server 10.1.x.x prefer
ntp server 10.1.x.x prefer
end
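
Since most of the APs share this configuration, a quick scripted pass over saved configs can list settings that expose the management plane to anything on the LAN, a network monitor included. The sketch below is a rough Python lint under stated assumptions: the patterns and messages are illustrative choices, not a Cisco tool or an official checklist, and a match does not mean the setting caused the outage.

```python
import re

# Illustrative patterns only; extend or trim them to match local policy.
CHECKS = [
    (re.compile(r"^\s*encryption mode wep\b"), "WEP encryption in use"),
    (re.compile(r"^\s*transport input all\b"), "vty accepts every protocol, including telnet"),
    (re.compile(r"^\s*snmp-server community \S+ RW\b"), "read-write SNMP community configured"),
    (re.compile(r"^\s*ip http server\b"), "plain-HTTP management enabled"),
]


def lint_config(text):
    """Return a list of (line_number, message) for lines matching a check."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for pattern, message in CHECKS:
            if pattern.search(line):
                findings.append((number, message))
    return findings
```

Running `lint_config` over the text of each saved config and printing the findings gives a per-AP list to review by hand.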
 
Did you get anywhere with this?

I had some Cisco 1200 APs a couple of years back that would not work well, and an IOS upgrade did the trick.
The APs were "WDS masters"... my IOS wireless terminology is rusty, so this may not be the right term, but the APs were going offline and would not recover without a hard reboot. You may want to think about some kind of PoE so you don't have to go all the way to the roof to reboot those suckers.
 
Good call on the rooftop thing. We've actually drawn up a plan to remedy that. The specific one on the roof is a 1400 series bridge, and it has a power injector that is not PoE capable. We are running PoE on all the 1200s though. In point of fact, I have not been able to get anywhere on this issue. I've got a guy on the MikroTik forum stating that The Dude network monitor might create an issue when monitoring telnet. I've been running it in a lab environment for nearly two weeks now and I can't recreate the problem. I wonder what the latest firmware is for those things. Come to think of it, I'm not even entirely sure what I'm running on them right now. I can tell you that this happened on more than just the WDS APs. I can definitely look into a firmware upgrade on these things though. I like to keep my equipment up to date.
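
For taking stock of what each AP is running before an upgrade, a small Python helper can pull the release out of saved `show version` output. This is a sketch that assumes the usual "Version 12.3(8)JA2" style string; the sample line in the usage note is illustrative, not output captured from these APs, and `parse_ios_version` is a hypothetical name.

```python
import re

# Matches strings such as "Version 12.3(8)JA2" and captures the parts.
VERSION_RE = re.compile(r"Version\s+(\d+)\.(\d+)\((\d+)\)(\w*)")


def parse_ios_version(show_version_text):
    """Return (major, minor, build, train) or None if no version is found."""
    match = VERSION_RE.search(show_version_text)
    if match is None:
        return None
    major, minor, build, train = match.groups()
    return int(major), int(minor), int(build), train
```

For example, feeding it a line like "Cisco IOS Software, C1200 Software (C1200-K9W7-M), Version 12.3(8)JA2, RELEASE SOFTWARE (fc1)" would give a tuple that can be compared across all 30 APs to spot the ones on a different release.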
 