Clients lose network connection at random times !!

Status
Not open for further replies.

jheyrman

IS-IT--Management
Apr 29, 2002
27
GB
We have a strange network problem on our LAN.

At random times our network clients seem to lose their network connection to a server. When this occurs they can't connect to that one server but are still able to connect to the other servers.
This produces a variety of strange errors:

"Disk or network error" (MS-Access)
"Cannot save document you must select another file name" (MS-Word and MS-Excel)
"The file is in use by another user… " (MS-Excel)

We have the impression that the problem occurs more often when network traffic is low.
Because of this we started pinging the servers every second from a client PC connected to the same switch as the servers. We discovered that when the error occurs, the server in question does not reply to the pings (sometimes for more than 15 seconds, and this happens about 20 times a day).
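
A minimal sketch of how such a once-per-second ping log could be summarised into outages. It assumes the ping results have already been collected as (timestamp, replied) pairs; the function name `find_outages` and the `min_missed` threshold are made up for illustration and are not part of any tool mentioned here.

```python
from datetime import datetime, timedelta


def find_outages(samples, min_missed=2):
    """Group consecutive missed pings into outages.

    samples: iterable of (timestamp, replied) pairs in time order,
             one entry per ping attempt (assumed format).
    Returns a list of (first_missed, last_missed, missed_count) tuples
    for every run of at least `min_missed` consecutive missed replies.
    """
    outages = []
    run = []  # timestamps of the current run of missed replies
    for timestamp, replied in samples:
        if replied:
            if len(run) >= min_missed:
                outages.append((run[0], run[-1], len(run)))
            run = []
        else:
            run.append(timestamp)
    # A run that is still open at the end of the log also counts.
    if len(run) >= min_missed:
        outages.append((run[0], run[-1], len(run)))
    return outages


if __name__ == "__main__":
    # Hypothetical data: one ping per second, no replies for 17 seconds.
    start = datetime(2002, 4, 29, 10, 0, 0)
    log = [(start + timedelta(seconds=i), not (10 <= i <= 26)) for i in range(60)]
    for first, last, count in find_outages(log):
        print(f"no reply from {first} to {last} ({count} pings missed)")
```

Comparing the outage start times across the three servers, and against the traffic level at those moments, might show whether the drop-outs really cluster in quiet periods.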

We have now discovered that the problem disappears when we open PcAnywhere sessions to the servers.
The server consoles need to be unlocked and the CPU monitor needs to be visible (to send more data over the PcAnywhere session).
When the sessions are open the servers can respond to the ping command at all times and the clients don't lose their network connections to the servers anymore. When we keep the servers very busy (CPU and LAN) then the clients also seem to keep their connections alive.

Does anyone know of this phenomenon or know a solution?


Here are some specs of the things we have.

Servers:
Poweredge 1300 WinNT 4.0 SP6a (with Intel pro 10/100 LAN card)
Poweredge 4200 WinNT 4.0 SP6a (with 3com 3c980 server LAN card)
Poweredge 2500 WinNT 4.0 SP6a (with Intel 8255xx-based 10/100 LAN card)

Network:
Cisco Ethernet switch model 3524
All servers are connected to this switch and they negotiate 100Mbit Full Duplex.

Things we already tested that didn't help:
We tested with other switches (BayStack, 3Com) and hubs (Intel).
We changed the cabling.
We forced the speed and duplex settings to 100/full, 100/half, 10/full and 10/half.
We tried other network cards (in clients and servers).
We checked the temp directories on the clients (MS KB: Q150943).
We disabled all power saving on the clients (NT4 Server has no power saving).


Best regards
Johan
 
The problem looks like it is located in the WinNT server itself.
Have you checked the Event Viewer? Is there anything strange in it? Is a screen saver running on that system?
Gia Betiu
m.betiu@chello.nl
Computer Eng. CNE 4, CNE 5
 
Difficult problem....

Are there multiple NICs on this problem server?
You said it seems to occur at slow times... Are multiple NICs possibly cycling through dormancy at different times? And is there more than one type of NIC in any given machine? The differences in drivers and IRQ levels could be causing interface difficulties. When the NICs wake up, are they possibly causing issues in the distribution of data from one to the other, or causing disruptions in the CPU processes that might affect data distribution to the interfaces?

I am not an NT expert, but I am interested to hear of your progress. Please post again! Email me! denodave@yahoo.com
Real men pray...especially techies!
 
No, there are no multiple NICs in the problem servers.
I think that PcAnywhere keeps the connection alive because it almost constantly sends packets over the network. My question is: is there some other setting or program that keeps my connection alive?
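
As a rough illustration of what "a program that keeps sending packets" could look like, here is a minimal sketch that sends small UDP datagrams at a fixed interval. Everything in it is an assumption for illustration: the function name `send_keepalives`, the server name in the example, the use of UDP port 9 (the discard port) and the one-second interval. Whether steady traffic like this would actually prevent the drop-outs is not established; it only mimics the constant traffic a PcAnywhere session produces.

```python
import socket
import time


def send_keepalives(host, port=9, count=5, interval=1.0, payload=b"keepalive"):
    """Send `count` small UDP datagrams to host:port, `interval` seconds apart.

    Returns the number of datagrams handed to the network stack.
    UDP is fire-and-forget, so nothing here confirms that the server
    actually received or processed them.
    """
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for i in range(count):
            sock.sendto(payload, (host, port))
            sent += 1
            if i < count - 1:
                time.sleep(interval)  # no need to wait after the last one
    return sent


if __name__ == "__main__":
    # "server1" is a hypothetical host name; replace it with a real server.
    print(send_keepalives("server1", count=10, interval=1.0))
```

A continuous ping from a spare client would generate similar background traffic without any code, so this is mainly useful if the interval or packet size needs to be varied while testing.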

best regards
johan
 
There has to be a simple solution to this problem.
1) Is it only one server?
2) Are you using only one hub/switch?
3) Replace the server entirely.
4) Try another power source for the server/switch maybe power them via a UPS.
5) Have the grounding of all PC/SERVER/SWITCH power connections checked. A bad earth connection will surely cause very weird problems.

C4J
 
Servers:
Poweredge 1300 WinNT 4.0 SP6a (with Intel pro 10/100 LAN card)
Poweredge 4200 WinNT 4.0 SP6a (with 3com 3c980 server LAN card)
Poweredge 2500 WinNT 4.0 SP6a (with Intel 8255xx-based 10/100 LAN card)

Network:
Cisco Ethernet switch model 3524

We replaced the servers (with new servers)!

We also replaced the UPSs (and we also tried removing them).

All equipment is grounded.


 
Hmm, sounds like a nightmare :(

You seem to have checked the obvious with no resolution. The only other thing I can think of is a rogue device on the network causing a broadcast storm or something similar, which overloads the switch and causes it to hang momentarily, but that seems pretty far-fetched to me. You could try running SNMP monitors on the servers, though, to check whether any odd traffic or volume is hitting the switch before it dies.
 
The switch itself doesn't stop running.
It's only the server in question (1 port) that is no longer available.

 
Ah I misread your post then.

As another long-shot (if it's just one device affected) a problem we had a while back was someone configured a workstation with the same IP address as a printer. This didn't throw up duplicate IP address warnings (I guess because the duplicate was a non-Windows device?) but whenever someone printed the user on the workstation would get temporarily disconnected from the network (network cable unplugged warning). It took us a lot of head scratching to track it down (I hate intermittent network problems)! I guess a similar thing might happen if a device was configured with the same IP as the server. Kinda clutching at straws though ;)
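
A minimal sketch of one way to look for such a conflict: collect (IP, MAC) pairs over time, for instance from the ARP tables of a few clients, and flag any IP address that shows up with more than one MAC address. The function name `find_ip_conflicts` and all addresses in the example are hypothetical.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return the IP addresses that were seen with more than one MAC address.

    observations: iterable of (ip, mac) string pairs, gathered however
                  is convenient (assumed to come from ARP tables).
    Returns a dict mapping each conflicting IP to a sorted list of MACs.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalise "00-10-4B-..." and "00:10:4b:..." to one spelling.
        macs_by_ip[ip].add(mac.lower().replace("-", ":"))
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


if __name__ == "__main__":
    # Hypothetical observations; the first and third are the same NIC.
    seen = [
        ("192.168.1.10", "00-10-4B-AA-BB-01"),
        ("192.168.1.20", "00:a0:c9:11:22:33"),
        ("192.168.1.10", "00:10:4b:aa:bb:01"),
        ("192.168.1.10", "00:60:b0:44:55:66"),
    ]
    print(find_ip_conflicts(seen))
```

An IP that maps to two MACs is only a hint, since a replaced network card would look the same, but it narrows down where to look.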
 
I double-checked every device at this site and nobody is using a duplicate IP address.

Best regards
Johan
 
By any chance is this particular server physically separated by a great distance from the rest of the network? As a long shot, you could have violated one of the basic rules of physical-layer connectivity. Are you working within the 5-4-3-2-1 principle? Is there too much cabling between server and workstations? Are there sources of RF interference anywhere near this server's cable runs, in particular any device that may start up intermittently? If so, the signals could be getting killed or interfered with at the physical layer.
Email me! denodave@yahoo.com
Real men pray...especially techies!
 
Since a PC Anywhere connection keeps the connection going, I think you may have a dormancy problem. Is your server going to sleep during slow network times? Check all of your server's powersaving settings. I've never seen a server by default set to turn off, but you never know.

Check the interface on the switch that the server is plugged into and see if keepalive is on or off. If it's off, try turning it on and see if it helps.

I don't think it's a duplicate address. That would cause messages to pop up and event log entries to appear as well.
 
Just buy a lot of pcAnywhere host sessions and your problem will be solved!!!

 
I have a Compaq 8000 that started behaving the same way. At first it started with a bad carriage. All faulty hardware has been replaced and it is still behaving the same way. My server is running W2K Server and it's the second DC on my subnet. Any ideas, or have you rectified your problem?
 
Sounds like your problem is keeping that server connection alive during periods of inactivity. The ping latency on the server may be high. Check the type of cabling: it should be Cat5e, under 100 meters and properly crimped (e.g. T568B, listed here from pin 8 down to pin 1: brown, brown-white, green, blue-white, blue, green-white, orange, orange-white).
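
The colour order quoted above is the T568B pinout read from pin 8 down to pin 1. A minimal sketch of checking a crimped plug against the two standard pinouts, and of spotting split pairs (wires on pins 1-2, 3-6, 4-5 or 7-8 that do not belong to the same twisted pair), could look like this. The function names and the colour spellings are made up for illustration.

```python
# Standard pinouts, listed from pin 1 to pin 8.
T568A = ["green-white", "green", "orange-white", "blue",
         "blue-white", "orange", "brown-white", "brown"]
T568B = ["orange-white", "orange", "green-white", "blue",
         "blue-white", "green", "brown-white", "brown"]

# Pins that must be served by the two wires of one twisted pair.
PAIRED_PINS = [(1, 2), (3, 6), (4, 5), (7, 8)]


def base_colour(wire):
    """Map 'green-white' and 'green' to the same pair colour, 'green'."""
    return wire.replace("-white", "")


def split_pairs(pinout):
    """Return the pin pairs whose two wires have different pair colours.

    pinout: list of 8 colour names, pin 1 first. This only compares
    pair colours; it does not check solid versus striped wires.
    """
    return [(a, b) for a, b in PAIRED_PINS
            if base_colour(pinout[a - 1]) != base_colour(pinout[b - 1])]


def identify(pinout):
    """Name the standard a pinout follows, or 'non-standard'."""
    if pinout == T568A:
        return "T568A"
    if pinout == T568B:
        return "T568B"
    return "non-standard"


if __name__ == "__main__":
    # A common home-made mistake: colours simply laid side by side.
    homemade = ["orange-white", "orange", "green-white", "green",
                "blue-white", "blue", "brown-white", "brown"]
    print(identify(homemade), split_pairs(homemade))
```

A cable with split pairs can pass a simple continuity test and still drop traffic at 100 Mbit, so it is worth checking the colours at both ends by eye.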
 
In our experience, having pcAnywhere installed can itself cause similar problems.
 
I have an NT server with SP6.
The server hosts a domain and 40 users (clients) connect to it.
The clients are connected to the server over the LAN.
Whenever I ping from a client to the server, it shows "Request timed out". Sometimes the connection to the hub shows as unplugged without anyone touching the RJ45 connector. This happens frequently.
Is this because of a loose connection, something related to server memory, or a problem in the hub?
The crimping of the cables is correct.
Please send a proper solution.
Thanks
kannanm@7hillsys.com

 
We have had a similar problem. Some clients lost their connection to servers at random times.
The problem was an uplink connection between 2 switches. The crimping of this cable was NOT OK! (It was self-made, and the wrong pins were twisted together.)
We changed the cable to a proper one and the problem was gone :)

best regards
Pete
 