Server Freezes for 5 minutes but no logs

Status
Not open for further replies.

gmail2

Programmer
Jun 15, 2005
987
IE
We have had this happen on a few of our servers now. They run Windows 2003 SP2 and occasionally they "freeze": users are unable to access anything on them, and pressing CTRL + ALT + DELETE just shows the desktop background with no icons, taskbar, etc. During this whole time, however, the server replies if I ping it, but I can't RDP to it (although I can telnet to port 3389, so it's still listening).

After about 5 minutes, everything comes back to normal and users can access the server again, and it responds to my CTRL + ALT + DELETE from earlier. However, looking in the event logs reveals absolutely nothing. There may be an event about the WinHTTP Web Proxy Auto-Discovery Service starting/stopping but that's it. Could this be the root cause of the problem?

The servers are all ProLiant servers; some are DCs, some are file servers. The only non-MS software on them (apart from HP ACU etc.) is TrendMicro OfficeScan and CA ArcServe, both of which I've seen cause problems like this in the past. However, I can't stop these services long term just on the off chance that they are causing the problem.

I read somewhere that I should install the MS Debugging Tools, but will there be a file for me to debug after this considering it never actually crashed? If so, where would I find it?

Also, could I set up performance logs and alerts to monitor memory and CPU usage, so that we could check afterwards whether there was high CPU usage etc.? If so, does anyone have any recommendations on which logs I should set up, as I don't quite understand them all?
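
For anyone reading this later, here is a rough sketch of how a counter log along these lines could be scripted rather than clicked together in the Performance console. It only builds (and optionally runs) a "logman create counter" command. The counter paths are standard Windows counters, but which ones matter for this particular freeze is a guess, and the log name "FreezeWatch", the C:\PerfLogs output path, the 15 second interval and having Python on the server at all are assumptions for illustration.

```python
import subprocess
import sys

# Standard Windows counter paths. The selection is an assumption about what
# is useful for a freeze like this, not a recommendation from the thread.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\System\Processor Queue Length",
    r"\Memory\Available MBytes",
    r"\Memory\Pages/sec",
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
    r"\Process(*)\% Processor Time",
]


def build_logman_command(name, output_path, interval_seconds=15):
    """Return the argument list for 'logman create counter' writing a CSV log."""
    hours, rem = divmod(interval_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    interval = "%02d:%02d:%02d" % (hours, minutes, seconds)
    return (["logman", "create", "counter", name, "-f", "csv",
             "-si", interval, "-o", output_path, "-c"] + COUNTERS)


if __name__ == "__main__":
    # "FreezeWatch" and the output path are hypothetical names.
    cmd = build_logman_command("FreezeWatch", r"C:\PerfLogs\FreezeWatch")
    print(subprocess.list2cmdline(cmd))
    # Only touch the system when explicitly asked to, and only on Windows.
    if sys.platform == "win32" and "--create" in sys.argv:
        subprocess.call(cmd)
        subprocess.call(["logman", "start", "FreezeWatch"])
```

Run without arguments it just prints the command so it can be reviewed first. Logging to the local disk means the data survives even if the server is unreachable over the network while it is frozen.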

The drivers and firmware for the servers are up to date to about October/November (which is pretty recent), and Windows is patched to about the same time. Obviously there are updates that have been released since, but they haven't been installed because we have very few time windows in which we can restart the servers.

Any help on any of this would be greatly appreciated.

Thanks in advance

Irish Poetry - Karen O'Connor
Irish Poetry and Short Stories - Doghouse Books
Garten und Landschaftsbau
 
Does this always occur at the same time, and does it affect the same servers? You may want to check whether any tasks run on the affected servers during this time. Also, you may have found your possible culprit in the software you mentioned. See if they run any scans during this period.
 
Try disabling the antivirus. I think that's your problem, since it's happening on all of them.
 
The most probable cause is that something is hogging the CPU. When you say "occasionally", how do you define that? Daily? A couple of times per week? As the other poster asked, is there any pattern as to time?

Capturing performance data is a good idea, but if you are doing it remotely, you might get timeouts in the data if the server is that crippled. You could try leaving the server logged in, with no screensaver, and with Task Manager running and open to the Processes tab, sorted by CPU usage. This should allow you to see which process is spiking.

PSTools has a pslist command that can be run against a remote server. You could see if that will return data during one of these episodes.
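
If watching Task Manager or running pslist remotely is not practical, much the same information could be logged locally by a small script. The sketch below is only an illustration: it assumes Python is available on the server, that "tasklist /v /fo csv" produces the English column names "Image Name", "PID" and "CPU Time", and that CPU Time is formatted as H:MM:SS. It takes two snapshots and reports which processes gained the most CPU time in between. The log path and the 30 second interval are made-up values.

```python
import csv
import io
import subprocess
import sys
import time

LOG_PATH = r"C:\PerfLogs\cpu_hogs.log"  # hypothetical location


def parse_cpu_times(tasklist_csv):
    """Map (image name, PID) to cumulative CPU seconds from tasklist CSV text."""
    usage = {}
    for row in csv.DictReader(io.StringIO(tasklist_csv)):
        hours, minutes, seconds = (int(p) for p in row["CPU Time"].split(":"))
        usage[(row["Image Name"], row["PID"])] = hours * 3600 + minutes * 60 + seconds
    return usage


def busiest(before, after, top=5):
    """Return (gained CPU seconds, process) pairs, largest first."""
    deltas = [(after[key] - before.get(key, 0), key) for key in after]
    return sorted((d for d in deltas if d[0] > 0), reverse=True)[:top]


def snapshot():
    """Take one snapshot of per-process CPU time on the local machine."""
    text = subprocess.check_output(["tasklist", "/v", "/fo", "csv"],
                                   universal_newlines=True)
    return parse_cpu_times(text)


if __name__ == "__main__" and sys.platform == "win32":
    previous = snapshot()
    while True:
        time.sleep(30)
        current = snapshot()
        with open(LOG_PATH, "a") as log:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write("%s %r\n" % (stamp, busiest(previous, current)))
        previous = current
```

The script may well stall along with everything else during a freeze, but the entries just before and just after the gap would still show which processes were burning CPU around that time.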
 
Thanks for all the replies guys. Unfortunately there's no pattern as to when it happens. I can't recall any time when it happened twice in one day, but it has happened twice in one week once. It may go months without happening and then all of a sudden happen twice in one week.

I already tried using pslist, but it wasn't able to connect to the server, so unfortunately it wasn't of any help in this situation.

I do agree that it may be AV, but because it happens so infrequently, we don't want to disable it completely just yet as these servers are file servers.

Leaving Task Manager open is a good idea, but unfortunately some external vendors have access to our comms room, so we can't leave the server unlocked.

Can anybody give me some tips or pointers on how I should configure performance logs and alerts? Writing to the local disk is fine: although we can't check while the server is down, it's not usually down for more than 5 or 10 minutes, so we can check afterwards.
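
As a sketch of the "check afterwards" step: once a counter log has been written to local disk as CSV, something like the following could flag the interesting rows. It assumes the usual Performance Monitor CSV layout (first column is a timestamp such as 01/15/2008 10:00:15.000, one column per counter with the server name prefixed to the counter path) and a 15 second sample interval. The 90% threshold is an arbitrary choice. A gap in the timestamps is worth looking for as well as high CPU, because the logger itself may stop sampling while the server is frozen.

```python
import csv
import io
import sys
from datetime import datetime

TIMESTAMP_FORMAT = "%m/%d/%Y %H:%M:%S.%f"  # assumed counter-log CSV timestamp layout


def scan(csv_text, counter_suffix, interval_seconds=15, cpu_threshold=90.0):
    """Return (gaps, spikes) found in a counter log exported as CSV.

    gaps   : (previous sample time, next sample time) pairs more than two
             intervals apart, i.e. periods where sampling stalled.
    spikes : (sample time, value) pairs at or above cpu_threshold.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Column names carry a \\SERVER prefix, so match on the counter path only.
    column = next(i for i, name in enumerate(header) if name.endswith(counter_suffix))
    gaps, spikes, previous = [], [], None
    for row in reader:
        stamp = datetime.strptime(row[0], TIMESTAMP_FORMAT)
        if previous is not None and (stamp - previous).total_seconds() > 2 * interval_seconds:
            gaps.append((previous, stamp))
        previous = stamp
        try:
            value = float(row[column])
        except ValueError:
            continue  # blank value, common for the first sample of a rate counter
        if value >= cpu_threshold:
            spikes.append((stamp, value))
    return gaps, spikes


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], newline="") as handle:
        found_gaps, found_spikes = scan(handle.read(), r"\Processor(_Total)\% Processor Time")
    for start, end in found_gaps:
        print("no samples between %s and %s" % (start, end))
    for stamp, value in found_spikes:
        print("%s CPU at %.1f%%" % (stamp, value))
```

It would be run with the path to the CSV log as its only argument. The times it prints can then be compared against the event log and the AV and backup schedules.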

Thanks in advance

 