Random and bizarre connectivity issues

Status
Not open for further replies.

Boxer77

Technical User
Jul 10, 2012
40
US
I've been banging my head against this for a while. I work for a school corporation with multiple locations. Every school is working fine except for one.

The basic setup is this:
All switches are new Procurve 2910s which connect to a Procurve 3500 (IP address 192.168.6.250). From there it goes through an AT&T circuit back to the high school where it connects through another Procurve 3500 (IP: 192.168.6.251). The default gateway on the workstations is 192.168.6.250.

The problem we are having is that random workstations cannot get out to the internet. They can ping 6.250 but not 6.251. They can also ping around the local network at the school, just not outside of the school. A computer will be unable to connect for a while, then all of a sudden it can get back on the internet and browse with no issues. It happens in labs where one computer works fine but the other one does not. I can swap cables and ports, but the computer that can't connect still can't connect and vice versa. But the problem is too widespread to be something with the workstations, because it happens everywhere in the building and to some network printers. Sometimes it will start working after a few minutes. Other times it takes a reboot. Sometimes it's down for hours. I can run ping -t 192.168.6.251 and get a constant "Request timed out" for a while before it starts getting replies. Then it has no problems pinging it and internet speeds are great.

So far we've swapped out the core switch (192.168.6.250), rebooted everything repeatedly on both sides of the network (school and high school), checked everything on the server, compared switch configurations with other schools that are working, and reconfigured the way the switches are connected. Nothing has worked.

Any ideas? I've been working with a network engineer and we are both at a complete loss. I also posted this on the HP forum.
 
How does it work?

The LAN has a DHCP server handing out IP addresses from a scope within 192.168.6.250?
(Check network to see if there is a 2nd DHCP server on the network; compare different PCs with an "ipconfig /all" to see if they are acquiring the same details; configure DHCP snooping on the network)
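One way to do the "ipconfig /all" comparison suggested above is to save the output from a few PCs and compare only the fields that matter. This is a minimal sketch, assuming standard English-language Windows output. The host names and the DHCP server addresses (192.168.6.10 and 192.168.6.99) in the sample text are made up for illustration and are not taken from this network.

```python
import re

# Fields worth comparing when hunting for a second DHCP server.
FIELDS = ("DHCP Server", "Default Gateway", "Subnet Mask")

# Matches lines such as "   DHCP Server . . . . . . : 192.168.6.10"
LINE = re.compile(r"\s*([A-Za-z0-9 ]+?)[ .]*:\s*(\S+)")


def parse_ipconfig(text):
    """Return the first value seen for each field in FIELDS."""
    found = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match and match.group(1) in FIELDS and match.group(1) not in found:
            found[match.group(1)] = match.group(2)
    return found


def differing_fields(reports):
    """Return {field: {value: [hosts]}} for fields where the hosts disagree."""
    diffs = {}
    for field in FIELDS:
        by_value = {}
        for host, text in reports.items():
            value = parse_ipconfig(text).get(field, "<missing>")
            by_value.setdefault(value, []).append(host)
        if len(by_value) > 1:
            diffs[field] = by_value
    return diffs


# Hypothetical sample output from two lab PCs (illustrative values only).
SAMPLE_OK = """
Ethernet adapter Local Area Connection:

   DHCP Enabled. . . . . . . . . . . : Yes
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . : 192.168.6.250
   DHCP Server . . . . . . . . . . . : 192.168.6.10
"""
SAMPLE_ROGUE = SAMPLE_OK.replace("192.168.6.10", "192.168.6.99")

if __name__ == "__main__":
    print(differing_fields({"lab-pc-1": SAMPLE_OK, "lab-pc-2": SAMPLE_ROGUE}))
```

If every PC reports the same DHCP server, gateway and mask, the result is empty and a rogue DHCP server becomes a less likely suspect.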

So the PCs have a default GW of 192.168.6.250, which is the 3500 "core" switch on their site, and which has IP routing enabled?

How do they get to the internet? Through a proxy server or directly? Either way, they need to go off-subnet, so they send their packets addressed to the hardware address of the 3500, which is their default GW.

How does the 3500 get to the proxy/internet? It presumably has a default route, pointing at the next hop in its path to the internet. What is its next hop to the internet? A layer 3 switch on a different site.
How does it route to that site? Via a remote interface configured with an IP address in the same subnet as the local network hosts? Surely not? Who designed that?
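The on-subnet versus off-subnet decision described above can be sketched in a few lines. This assumes a /24 mask and a workstation address of 192.168.6.57, neither of which was stated in the thread, and uses 203.0.113.10 (a documentation address) as a stand-in for an internet host.

```python
import ipaddress


def first_hop(host_ip, prefix_len, gateway, destination):
    """Return how a host delivers a packet: ARP for the target or use the gateway."""
    local_net = ipaddress.ip_interface(f"{host_ip}/{prefix_len}").network
    if ipaddress.ip_address(destination) in local_net:
        # Same subnet: the host ARPs for the destination itself.
        return ("arp-direct", destination)
    # Different subnet: the frame goes to the default gateway's MAC address.
    return ("via-gateway", gateway)


if __name__ == "__main__":
    # Assumed workstation 192.168.6.57/24 with default gateway 192.168.6.250.
    print(first_hop("192.168.6.57", 24, "192.168.6.250", "192.168.6.251"))
    print(first_hop("192.168.6.57", 24, "192.168.6.250", "203.0.113.10"))
```

Under those assumptions a ping to 192.168.6.251 is not routed by the local 3500 at all. The workstation ARPs for it directly, so its own MAC address has to cross the circuit at layer 2.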
 
Is the AT&T circuit a MetroE circuit? Are you doing layer 2 or layer 3 routing over this circuit or is AT&T doing anything for you on this circuit themselves?
 
I suppose it must be a layer2 service of some sort.

Either way, he should not have the remote site host subnet configured on the HQ site router. He will have asymmetric routing going on, which will probably usually work and sometimes cause very hard-to-solve issues.

It's a shame your network engineer didn't point this out to you.

How does the HQ router get to the remote site hosts? Not through the remote site router address.

Additionally, your description of intermittent connectivity sounds like what you get when you fail to add your remote site subnet to your AD Sites and Services - they have no idea where to go to authenticate (with their proxy for example) and can sometimes run through all sorts of weird connections waiting for timeouts before finding one that responds.
 
Depends, he never stated how his connections are defined with regard to L2 or L3. If the circuit is MetroE, then yes, the standard handoff from AT&T will be L2 and it will be up to the customer what to do with it. If it's some other form of circuit, then they could be doing some form of GRE or VPN tunnel across it, which could be failing. Would be nice to have some configs to see how things are set on the switches to rule them in or out.
 
Thanks for the replies, everyone. I talked to AT&T and they said the number of MAC addresses being broadcast between the high school and this school is exceeding the number we are licensed for. The network engineer is working with them to try to resolve the issue and figure out why this is a problem at just this one location. My guess is that the 3500 is functioning more as a switch than as a router.

For those that asked, we have a DHCP server that is also the DC for this location. All locations have a separate forest so there is no AD connection between locations. In January I am going to start bringing them into 1 domain instead of 12 separate ones.

(And it turns out after just talking to him that the other schools are close to hitting their limit of MAC addresses that can be connected.)

Thanks again.
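For anyone wanting to check how close a site is to a carrier's MAC address limit, here is a rough sketch that counts unique MAC addresses in text pasted from a switch's MAC address table. The sample text and the limit values are hypothetical, since the thread never gives the actual licensed number. The regex accepts the aabbcc-ddeeff style as well as colon or dash separated pairs.

```python
import re

MAC_PATTERN = re.compile(
    r"\b(?:[0-9A-Fa-f]{6}-[0-9A-Fa-f]{6}"            # aabbcc-ddeeff
    r"|(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2})\b"   # aa:bb:cc:dd:ee:ff
)


def unique_macs(text):
    """Return the set of MAC addresses in text, normalized to 12 hex digits."""
    return {re.sub(r"[^0-9a-f]", "", mac.lower()) for mac in MAC_PATTERN.findall(text)}


def over_limit(text, limit):
    """True if the number of distinct MAC addresses exceeds the given limit."""
    return len(unique_macs(text)) > limit


# Hypothetical pasted table; the first two rows are the same address.
SAMPLE_TABLE = """
0018fe-a1b2c3  5
00:18:FE:A1:B2:C3  7
00-1b-3f-00-00-01  12
"""

if __name__ == "__main__":
    print(len(unique_macs(SAMPLE_TABLE)), over_limit(SAMPLE_TABLE, 1))
```

Run against the table learned on the port facing the AT&T circuit, this gives a number to compare with whatever limit the carrier quotes.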
 
MAC addresses are exceeding the number licensed for? That's a new one on me.
 
Lol. Good one.

Basically, get the remote subnet off the HQ router. It's bad design. You've got asymmetric routing and unnecessary broadcasts filling up your WAN link.

Create a new subnet whose sole purpose is to join the sites.

Then AT&T will only see two MAC addresses, which is as it should be and probably explains why they've failed to predict it would break.
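A small sketch of the "new subnet to join the sites" idea, checking that the link subnet does not overlap any site LAN and picking one address for each router. The 10.255.0.0/30 link subnet is a hypothetical choice, and treating the school LAN as 192.168.6.0/24 is an assumption, since the mask was never stated.

```python
import ipaddress


def plan_transit(site_lans, transit):
    """Validate a point-to-point link subnet and assign one address per router."""
    link = ipaddress.ip_network(transit)
    for lan in site_lans:
        if link.overlaps(ipaddress.ip_network(lan)):
            raise ValueError(f"{transit} overlaps site LAN {lan}")
    hosts = list(link.hosts())
    if len(hosts) < 2:
        raise ValueError(f"{transit} has no room for two router addresses")
    return {
        "transit": str(link),
        "remote_router": str(hosts[0]),  # school-side 3500
        "hq_router": str(hosts[1]),      # high-school-side 3500
    }


if __name__ == "__main__":
    # Assumed school LAN and a hypothetical /30 used only to join the sites.
    print(plan_transit(["192.168.6.0/24"], "10.255.0.0/30"))
```

With the circuit carrying only that link subnet, the only source MAC addresses crossing it are those of the two routers, and each site reaches the other's LAN through a route rather than through ARP.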
 
Cajun - if he had a tunnel between sites (which he probably should, it being a school network and therefore subject to various government laws regarding confidentiality and data security) the 192.168.6.251 address on the wrong site would be even more of a nightmare. How would his head end routing look? Ugh.
 