Lose comms to server on a random basis

Status
Not open for further replies.

koresnordic

IS-IT--Management
Nov 28, 2002
422
GB
Hi,

I hope someone can help.

We have a small network operating under a xxx.xxx.xxx.nnn IP range. It consists of several desktops (roughly 40, including laptops that come and go) and several servers (some virtualised). Just recently people have been reporting problems on the network. What happens is they stop being able to communicate with a server. For example, I am using our .51 server (checking email) and it just stops. I try to ping and get no response. However, after a couple of minutes the comms come back, without my having done anything to help. A short while later I will lose comms to the .5 server (finance). Again no ping, but in a few minutes it comes back. This happens to everyone on the network at some stage, though not always at the same time: I have been unable to ping .51 while my colleague could. Anyone have any ideas on where to look? I am a novice at network troubleshooting, so please go easy.

thanks

[pc]

Graham
 
Others will have a better solution, but if I were in your shoes I would look at how it is physically connected. If all your servers are on one bottleneck or, on an even smaller network, all on one managed switch, it would be a good idea to start there, especially since it happens to multiple servers. If that is the case, a quick and dirty way to test whether the switch is giving out would be to plug a laptop with TeamViewer or LogMeIn into it, then connect to the laptop and to one of your servers. If they both go out at the same time, it is likely a bad switch or something further up the bottleneck.
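To make the "do they both go out at the same time" check less dependent on watching two screens, the ping results could be compared afterwards. Below is a minimal Python sketch, assuming you have already collected sorted (timestamp, ok) samples for each device; the function names `outage_windows` and `overlaps` and the sample data in the note are made up for illustration.

```python
from datetime import datetime, timedelta


def outage_windows(samples):
    """Collapse sorted (timestamp, ok) ping samples into (start, end) outages.

    A window starts at the first failed ping and ends at the next successful
    one, or at the last sample if the outage is still ongoing.
    """
    windows = []
    start = None
    last = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                      # first failure opens a window
        elif ok and start is not None:
            windows.append((start, ts))     # first success closes it
            start = None
        last = ts
    if start is not None:
        windows.append((start, last))       # outage still open at end of log
    return windows


def overlaps(windows_a, windows_b):
    """Return pairs of windows (one per device) that share any point in time.

    The comparison is inclusive, so windows that merely touch also count.
    """
    return [
        (a, b)
        for a in windows_a
        for b in windows_b
        if a[0] <= b[1] and b[0] <= a[1]
    ]
```

If `overlaps(outage_windows(laptop_samples), outage_windows(server_samples))` comes back empty over a few hours, the two devices are not dropping together, which would point away from the shared switch. If most outages overlap, the switch or something upstream of it becomes the more likely suspect.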

A lot of this goes out the window if you are logging into the network via a VPN, though.

"You don't know what you got, till it's gone..
80's hair band Cinderella or ode to data backups???
 
Hi DrBob

We have 4 switches set up as a stack (by my predecessor). These are linked via a short fibre link to what is marked as a core switch. All the servers are connected via gigabit connections to this core switch. As far as I can tell there is no segmenting taking place. The web page for the switch stack doesn't show any massive overload on any one port, though, as expected, the fibre link carries by far the most traffic.

As regards checking whether both systems go down at the same time, I can see my colleague's screen scrolling away with successful pings whilst mine fail. So I am not sure the switch hardware is at fault (but I could be wrong and just a few ports having problems?)
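Rather than comparing two scrolling screens by eye, the same ping test could be logged with timestamps on each PC and the two logs compared later. This is only a rough Python sketch: the addresses in the note are placeholders, the function names are made up, and it assumes the stock `ping` command (`-n`/`-w` on Windows, `-c`/`-W` on Linux). The exit status is a rough signal only, so check the log against what you see on screen.

```python
import platform
import subprocess
import time
from datetime import datetime


def ping_command(host, system=None):
    """Build a single-probe ping command with roughly a 1 second timeout."""
    system = system or platform.system()
    if system == "Windows":
        return ["ping", "-n", "1", "-w", "1000", host]  # -w is milliseconds
    return ["ping", "-c", "1", "-W", "1", host]         # Linux: -W is seconds


def ping_once(host):
    """Return True if the ping command exits with status 0."""
    result = subprocess.run(
        ping_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def log_line(when, host, ok):
    """Format one result so logs from different PCs line up by time."""
    return f"{when.isoformat(timespec='seconds')} {host} {'OK' if ok else 'FAIL'}"


def monitor(hosts, rounds, interval=5.0):
    """Ping every host once per round and print a timestamped line for each."""
    for _ in range(rounds):
        for host in hosts:
            print(log_line(datetime.now(), host, ping_once(host)), flush=True)
        time.sleep(interval)
```

Running something like `monitor(["192.0.2.51", "192.0.2.5"], rounds=720)` on two PCs at once (with the real server addresses substituted) and redirecting the output to a file would show whether one PC loses a server while the other keeps it. The PC clocks need to be reasonably in sync for the comparison to mean anything.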

[pc]

Graham
 
Have you checked into whatever server is doing your DNS/DHCP and seen if any errors are present? Are these roles installed on both of the servers you are speaking of or do you have a dedicated one?

"You don't know what you got, till it's gone..
80's hair band Cinderella or ode to data backups???
 
"but I could be wrong and just a few ports having problems"

You should have few errors, if any, on any given switch port (over, say, a week's time). Older hardware (NICs) will often produce more than a few errors, but generally these do not cause major problems. Check ALL machines for malware/viruses.
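One way to apply the "few errors over a week" rule is to note down the per-port error counters from the switch status page twice, a week apart, and compare them. A small Python sketch of that comparison follows; the port names, counts, threshold, and the function name are all invented for illustration, and how you read the counters off your switch is left to you.

```python
def ports_with_new_errors(before, after, threshold=0):
    """Return {port: increase} for ports whose error count grew past threshold.

    `before` and `after` map port names to cumulative error counts. If a
    counter went down, it is assumed to have been reset in between, so the
    whole `after` value is treated as new errors.
    """
    flagged = {}
    for port, count in after.items():
        increase = count - before.get(port, 0)
        if increase < 0:
            increase = count            # counter was reset between readings
        if increase > threshold:
            flagged[port] = increase
    return flagged


# Hypothetical readings taken a week apart.
last_week = {"1/0/1": 0, "1/0/2": 3, "1/0/24": 10}
this_week = {"1/0/1": 0, "1/0/2": 4, "1/0/24": 950}
noisy_ports = ports_with_new_errors(last_week, this_week, threshold=5)
```

With these made-up numbers only port 1/0/24 is flagged. A port that stands out like that would be the place to check the cable and the attached NIC first.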

Are the ports having issues, or is it the connected machine/device with either a NIC issue or a software bug? I have seen program bugs take entire networks down, especially from laptops brought in from home.

Are the wires connected to those ports up to CAT specs? A damaged wire is possible.

Open Task Manager on the servers, sort by memory use, and check for memory leaks or very high usage.
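A leak is easier to spot from a few readings over time than from a single look at Task Manager. As a rough Python sketch, assuming you jot down the memory use (in MB) of the top processes every so often, something like this would pick out the ones that only ever grow. The process names, readings, the 50 MB cut-off, and the function name are all made up.

```python
def steadily_growing(samples, min_growth_mb=50):
    """Return sorted names of processes whose memory never drops between
    readings and rises by at least `min_growth_mb` from first to last.

    `samples` maps a process name to a list of readings in MB, oldest first.
    """
    suspects = []
    for name, readings in samples.items():
        if len(readings) < 2:
            continue                    # not enough data to judge a trend
        never_drops = all(b >= a for a, b in zip(readings, readings[1:]))
        if never_drops and readings[-1] - readings[0] >= min_growth_mb:
            suspects.append(name)
    return sorted(suspects)


# Hypothetical hourly readings in MB.
readings_mb = {
    "mailstore.exe": [400, 450, 520, 610],
    "backup.exe": [300, 280, 310, 305],
    "idle.exe": [50, 50, 51, 52],
}
suspects = steadily_growing(readings_mb)
```

This is a crude test: a process that is simply busy can also grow for a while, so anything it flags is only a candidate to watch for longer, not proof of a leak.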



........................................
Chernobyl disaster..a must see pictorial
 
Hi DrBob

The DNS/DHCP server doesn't appear to have any errors in the event viewer that are related to this. Security is fine; there is a problem with another server having an incorrect password, but that is about it.

Hi Technome

I haven't a clue if the ports are having problems, but it occurs on several machines, so it seems unlikely. There are no errors on the switch status table for the last week (when the stats were reset so we could check). The network cables all look fine. There don't appear to be memory leaks, as the top programmes (for memory) are not increasing their usage. But your mention of virus checking rings a bell. Not so much in that I think there is one, but these problems seemed to start around the time of our last anti-virus programme update (as opposed to a definitions update). Will look into that, I think.

[pc]

Graham
 
Looks like my post got deleted.

Have you checked that your AV is up to date and that the NICs have the latest firmware/drivers?

I have had issues in the past with HP servers that were only resolved after installing the latest PSP and firmware CD.

I also found that we had a memory leak that caused random servers to drop off the domain, after which they could only be accessed using the local admin account. That was pinned down to pcAnywhere in the end.

Also, are you running any form of IDS that scans the network? We had a pen test scan for PCI DSS recently that messed up the TCP stack on a lot of servers, and they also lost connectivity.



Biglebowskis Razor - with all things being equal if you still can't find the answer have a shave and go down the pub.
 