Insolvable network problems

Status
Not open for further replies.

johanneke69

IS-IT--Management
Mar 1, 2001
US
We have a strange network problem on our LAN.

At random times our network clients seem to lose their network connection to a server. When this occurs they can't connect to that one server but can still connect to other servers.
This results in many strange errors:

"Disk or network error" (MS Access)
"Cannot save document you must select another file name" (MS Word and MS Excel)
"The file is in use by another user… " (MS Excel)

We have the impression that the problem exists more when network traffic is low.
As a result of this we started pinging the servers every second from a client PC connected to the same switch as the servers.

We discovered that when this error occurs, the server in question gives no replies to the ping command, sometimes for more than 15 seconds. This happens about 20 times a day.
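
The once-a-second ping test described above can be scripted so that each outage is logged with a start time and duration. The following is only a minimal sketch under stated assumptions: the helper names (`ping_once`, `monitor`, `find_outages`) are hypothetical, the `ping` flags assume a Windows or Linux client, and on Windows the exit code of `ping` may not reflect every kind of failed reply.

```python
import subprocess
import sys
import time


def ping_once(host, timeout_s=1):
    """Send one echo request; True if the ping command reports success.

    Assumption: Windows uses -n/-w (milliseconds), Linux uses -c/-W (seconds).
    """
    if sys.platform.startswith("win"):
        cmd = ["ping", "-n", "1", "-w", str(int(timeout_s * 1000)), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(int(timeout_s)), host]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def monitor(host, probes, interval_s=1.0):
    """Probe `host` `probes` times and return a list of (timestamp, ok)."""
    samples = []
    for _ in range(probes):
        samples.append((time.time(), ping_once(host)))
        time.sleep(interval_s)
    return samples


def find_outages(samples, min_seconds=1.0):
    """Group consecutive failed probes into (start, end, duration) tuples.

    An outage starts at the first failed probe and ends at the next
    successful one (or at the last failed probe if it never recovers).
    """
    outages = []
    start = None
    last_fail = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            if ts - start >= min_seconds:
                outages.append((start, ts, ts - start))
            start = None
    if start is not None and last_fail - start >= min_seconds:
        outages.append((start, last_fail, last_fail - start))
    return outages
```

For example, `find_outages(monitor("x.x.x.x", 3600), min_seconds=5)` would probe a server for roughly an hour and keep only the gaps of five seconds or longer, which makes it easier to line the outages up with the times users report errors.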

We have now discovered that when we open PcAnywhere sessions to the servers, the problem disappears.
The server consoles need to be unlocked and the CPU monitor needs to be visible (to send more data over the PcAnywhere session).
When the sessions are open, the servers respond to the ping command at all times and the clients no longer lose their network connections to the servers. When we keep the servers very busy (CPU and LAN), the clients also seem to keep their connections alive.

Does anyone know of this phenomenon or know a solution?


Here are some specs of the things we have.

Servers:
PowerEdge 1300, WinNT 4.0 SP6a (with Intel PRO 10/100 LAN card)
PowerEdge 2500, WinNT 4.0 SP6a (with Intel 8255xx-based 10/100 LAN card)


Network:
Cisco Ethernet switch model 3524
All servers are connected to this switch and they negotiate 100Mbit Full Duplex.



Things we already tested but didn't help:

We tested with other switches (BayStack, 3Com, Cisco) and hubs (Intel, 3Com).
We changed the cabling.
We forced the speed/duplex settings to 100/full, 100/half, 10/full and 10/half instead of auto-negotiation.
Other network cards (in clients and servers).
Other network card drivers (in clients and servers).
Checked the temp directories on the clients (MS KB: Q150943).
We disabled all power saving on the clients (NT4 Server has no power saving).

Best regards
Johan

 
I'm sorry Wirk, your solution is totally out of the question.
Anyway thanks for the response.

Best regards
Johan
 
Hi.

What is your Internet gateway device (firewall)?
How is it configured?
I'm asking because your issue might be related to an incorrect proxy ARP configuration on the firewall: the firewall then responds to ARP requests on behalf of the server, which can cause such problems.

Try the following command on a workstation, both when you can ping the server and when you cannot:
arp -a
OR:
arp -a | find "x.x.x.x"
Where x.x.x.x is the IP of the server.
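
The comparison suggested above can be done by eye, but a small script makes a changed MAC address harder to miss. This is only a sketch: the function names (`mac_for_ip`, `arp_changed`) are hypothetical, and it assumes the usual `arp -a` layout of one entry per line with the IP address and the hardware address on the same line.

```python
import re

# Matches a MAC written with dashes (Windows) or colons (Unix-like).
MAC_RE = re.compile(r"(?:[0-9A-Fa-f]{2}[-:]){5}[0-9A-Fa-f]{2}")


def mac_for_ip(arp_output, ip):
    """Return the MAC listed for `ip` in `arp -a` output, or None.

    The MAC is normalised to lower case with colon separators.
    """
    for line in arp_output.splitlines():
        fields = line.split()
        # Match the IP as a whole field (Windows) or as "(ip)" (Unix-like),
        # so that 10.0.0.1 does not match the line for 10.0.0.10.
        if ip in fields or f"({ip})" in fields:
            match = MAC_RE.search(line)
            if match:
                return match.group(0).lower().replace("-", ":")
    return None


def arp_changed(before, after, ip):
    """True if both captures list `ip` but with different MAC addresses."""
    mac_before = mac_for_ip(before, ip)
    mac_after = mac_for_ip(after, ip)
    return (
        mac_before is not None
        and mac_after is not None
        and mac_before != mac_after
    )
```

Save the `arp -a` output once while the server answers pings and once while it does not, then pass both captures and the server's IP to `arp_changed`. If the MAC differs between the two captures, some other device (for example a firewall doing proxy ARP) is answering for the server's address.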

How many workstations?
What OS on workstations?
What other network devices do you have?
Any WAN links?



Yizhar Hurwitz
 
We have many workstations, about 250.
The OS on the workstations is Windows 3.11/95/98/2000/XP.
We have many network printers and wireless vehicle-mounted computers.
There is a WAN link (Cisco 2600).

About the test with the arp command: I tried it when I could ping the server and again when I couldn't. Same result.

Best regards
Johan



 
Are you getting any replication errors? Also, how about DNS or DHCP errors?
 
Just a thought. Are there power-saving settings enabled on the NICs?

Jason

Rich Cook -- "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning."
 
No replication/DNS errors.
The power management on the NICs has been disabled.

best regards
Johan

 
Do not discount that it may be a virus (or virus scanning software). This bit me and NOTHING seemed to indicate that this is what it could be.

Also, check for spyware (use Ad-aware from Lavasoft etc.).
In some cases I had to use "HijackThis" to remove some.

Good Luck,

Michael42

Thanks,

Michael42
 
Honestly, this sounds like a DHCP error or a DNS/WINS issue.

Have you tried using your HOSTS/LMHOSTS files?
 
You're not running CPU-intensive 3D graphical screen savers on your servers, are you?


Remember: Backups save jobs!!!
;-)
 
search66: the problem also occurs on 2 machines with a manually configured IP address, even when I ping them by IP address (no WINS or DNS needed).

Techlad: No screen savers are running on the servers.

best regards
Johan
 
When you can't ping the server from the workstations, can you ping the workstations from the server? Can the workstations ping each other?

Also, in your switch, is there a connection (or more than one) that seems to be having excessive activity? (lights blinking so fast that they are almost solid) We had a problem like that once in which a workstation was mass-mailing/flooding randomly throughout the day, during which times the network acted REALLY strange. Unplugging that workstation stopped the problem until we could remove the program that was causing the problem.
 
Johan

Still no solutions, hey. Not for lack of trying from this very enthusiastic community, I must say.

Well, here's my 2 cents' worth.

It sounds very much like a software issue.
I would connect the problem server and a couple of clients to an isolated hub and observe what happens. The solution may lie, as you have probably guessed, in what PcAnywhere does to keep the server connection alive.

Just another thought: I would make sure that TCP/IP has priority over other protocols on the Bindings tab of the network configuration.

hth

Happyclem :)
 
Hi johan,

I would suggest you get hold of a demo version of one of Fluke Networks' net tools. Install it on a standalone server and let it run for a day or two. It is simple to install and set up, and it should definitely tell you exactly where your problem is. For us it diagnosed an aggregated link problem that created a loop in the network. It pointed to the link between a switch port and the Catalyst as the cause of the problem. Cisco replaced a card and updated the OSes.
Mysterious behaviour. Good luck.
 
Is it always the same server that loses the connection?

If so, I'd look into whether the hard drive is faulty.
I had similar problems with a Fujitsu-Siemens server when the drive was spinning down. As you said,
"We have the impression that the problem exists more when network traffic is low": the drive would take too long to spin up, so we got strange file access results.
 
Added more RAM to one of your servers lately? Then it's possible the BIOS settings of that comp have changed. Especially look for functions that let the comp go into "sleep" mode after a certain period of inactivity and disable them.
 