
NT 4.0 Server Crashing Once Per Day??? Blue Screen Crash Locks System


wncads

Once per day, at slightly different times, the dedicated NT 4.0 server is crashing with approximately 20 workstations connected to it. The server sits untouched in a secluded, key-operated room where cooling and environment are not an issue. This is an HP NET 60 PIII500 Dual Server with dual 9.1 SCSI hard drives and an HP DAT 20 GB tape backup.

The blue screen starts out with *** STOP: 0x0000000A, then IRQL_NOT_LESS_OR_EQUAL, then CPUID info and SYSVER info,
then a long list of items in order under the headings
DLLBASE, DATE STAMP, NAME (the names are basically everything running on the server, such as atapi.sys, ntfs.sys, tape.sys, cdrom.sys, beep.sys, vga.sys, etc.)...

There seems to be no specific rhyme or reason for the timing, but it is occurring approximately once every 24 hours with nothing logged in the EVENT VIEWER. Please advise with your suggestions ASAP. Thanks very much for your help and advice. Keith Buchanan
AplusNetworking.com
Commercial I.T. Networking & Wiring Specialist
keith@aplusnetworking.com
 
If nothing is being logged in Event Viewer, it sort of points to hardware, not the O/S.
To prove this (and eliminate the possibility of hub/wiring problems), try restarting the server sometime when no one will be using it (over the weekend). Disconnect the network cable and stop any programs that are running. Let it sit there and see if it crashes.
 
The last 10 or so items on the list should give you a pointer to the area in which problems are occurring.

E.g., if you see a lot of atapi.sys, aic7xxx and other disk-related drivers, then upgrade the SCSI driver.

In my experience, the drivers that cause the most problems are:

1. Printers
2. NIC drivers
3. SCSI drivers
4. Graphics drivers

Hope this helps
 
Check the memory in it; it may be wonky. I had the same problem on my system: the same "IRQL_NOT_LESS_OR_EQUAL" message with a lot of *.sys errors. It turned out the memory was bad (the system recognized the memory, but it would crash when certain memory addresses were accessed). I went through and updated all the drivers, changed out various components, etc., but in the end it was just a bad 64 MB module :) Good luck!
 
Try service packing it again, even if it's up to the latest service pack. There's a slim possibility that this could cure it. We have SP6a running on all our NT servers with no problems at all, so you don't need to be too wary of getting your server up to 6a if it isn't already.
 
Looking at UNIX but looks like UNIX will take quite a while to learn all the commands from SCRATCH.....

Also, FYI, I changed the RAM out today and will know in 24 hours whether that was the problem. I agree with TOBMELLO that hardware is the most likely culprit; that was my guess too. The error has been consistently popping up every 18-24 hours and crashing the machine, so if the server is still running at 2pm tomorrow (FRIDAY) we may have the problem licked. Thanks for the ideas. Keith Buchanan
AplusNetworking.com
Commercial I.T. Networking & Wiring Specialist
keith@aplusnetworking.com
 
NOT the RAM, server crashed again today and is also losing Dial Up Network settings about every third day.

Still looking for suggestions, trying what everyone has suggested over the weekend. Any additional comments would be appreciated.

It is running Small Business Server 4.0
build 1381 w/ SP3
WINNT Ver. 4.00.1381
IE Ver 5.5.50.4522.1800

There is plenty of SCSI hard drive space and the drives are operating fine. I have turned off fault tolerance, stopped all mirroring, turned on all error logging, and changed the RAM, which is 256 MB. It's a PIII 500 MHz with 9.1 SCSI, and only 1 GB is used in total on any drive. All of the cooling systems and fans are working properly, and the unit has not been moved at all.

?????????????????????????

Thanks for your help. Keith Buchanan
AplusNetworking.com
Commercial I.T. Networking & Wiring Specialist
keith@aplusnetworking.com
 
Next time it dumps, make a note of the very last file in the mid-screen left-hand column of the blue screen. The last file there (in the left column) is what crashed it. The most common BSOD culprits are SRV.SYS, CDFS.SYS, and PScript.dll.
You said you have SP3? DEFINITELY update to SP6a. SRV.SYS is a common problem with older service packs, and there is even a hotfix available on MS' site if you can't get SP6a.
BUT - you are in DIRE need of a higher service pack in this day and age. SP3 has got to be at least 3-4 years old by now... a lot has changed since then! :)

Also - is it plugged into a UPS? Make sure the UPS is not plugged into the serial port on the server and using the NT UPS feature in Control Panel. Try unplugging the serial cable from the UPS (if applicable), and see what that does. Most blue screens are I/O related.

Lastly, configure your computer to write debugging info after a "STOP" error. It's in your System Properties. This will log/record all future BSODs and their info. BE CAREFUL! - make sure you have at least enough free space on the disk to write a file that is the amount of RAM you have, plus 1 MB... in your case, 257 MB.
 
Actually, to record debug info you need RAM + 12 MB; in this case, 268 MB.

Apart from that, spot on - I'd recommend at least SP4 on NT4 (SP3 is known to be buggy - and unpredictable with IE5).
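
For anyone who wants to check the free-space arithmetic before turning the debug option on, here is a minimal Python sketch. It assumes the "RAM + 12 MB" rule of thumb from this thread; the function names and the 12 MB default are illustrative, not anything Microsoft ships, and the 1 MB figure from the earlier post can be passed in instead.

```python
import shutil


def dump_space_needed_mb(ram_mb: int, overhead_mb: int = 12) -> int:
    """Free disk space (in MB) to reserve for a full memory dump.

    Assumption: the rule of thumb quoted in this thread, RAM size plus a
    fixed overhead (12 MB by default; pass 1 for the other figure).
    """
    return ram_mb + overhead_mb


def has_room_for_dump(path: str, ram_mb: int, overhead_mb: int = 12) -> bool:
    """Return True if the drive holding `path` has enough free space."""
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= dump_space_needed_mb(ram_mb, overhead_mb)


if __name__ == "__main__":
    # The server in this thread has 256 MB of RAM.
    print(dump_space_needed_mb(256))      # RAM + 12 MB rule
    print(dump_space_needed_mb(256, 1))   # RAM + 1 MB rule
```

Calling has_room_for_dump with the drive that will hold the dump file and the installed RAM gives a quick yes/no before enabling the option.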

CE
 
The Knowledge Base at the Microsoft Website has some very good information for this type of problem.

Here is a quote from article Q165863. -- "This stop message indicates that an attempt was made in kernel mode to touch pageable memory at a process internal request level (IRQL) that is too high. This error is usually caused by drivers using improper addresses. "

So, like most bluescreens, this is probably a driver problem. Article Q130802 details specific steps to isolate which driver is causing the problem. (It looks a bit complicated though.)
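
If you are copying STOP lines off the screen by hand, a small helper like the Python sketch below can normalize the code and look up its symbolic name before you search the Knowledge Base. The table only lists a few well-known bug check codes, and the function and variable names are my own for illustration; it does not replace the driver isolation steps in the article.

```python
import re

# A few well-known NT bug check codes. This table is deliberately small
# and is not a complete list.
KNOWN_STOP_CODES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x0000001E: "KMODE_EXCEPTION_NOT_HANDLED",
    0x00000050: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x0000007B: "INACCESSIBLE_BOOT_DEVICE",
}

# Tolerates hand-typed variations such as "STOP:0X0000000A".
STOP_RE = re.compile(r"STOP\s*:\s*0x([0-9A-F]{1,8})", re.IGNORECASE)


def decode_stop_line(line: str):
    """Return (code, name) for a transcribed STOP line, or None if absent."""
    match = STOP_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, KNOWN_STOP_CODES.get(code, "UNKNOWN")


if __name__ == "__main__":
    result = decode_stop_line("XXXSTOP:0X0000000A")
    if result is not None:
        code, name = result
        print(f"0x{code:08X} {name}")
```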

If you can't figure out which driver it is, I would check each vendor's website for network, SCSI (and printer and graphics) drivers and download the latest version of each driver. (Based on experience, I would start with the NIC drivers.)

Running IE5.5 on an SP3 system? My own opinion is that this must leave you with a not-so-good mix of old and new DLLs on your system. I would upgrade to SP6a before even trying to troubleshoot this problem.

Good luck! I have experienced this type of problem before, and it was miserable. I had a Terminal Server which would crash every day or two, and I finally figured out that it was the NFS client software (of all things).

Roger, NT System Administrator
 
GOT IT FIXED GUYS.... It turned out the on-board NIC was failing and getting ready to go out. I disabled it in CMOS and installed a new separate NIC, and it hasn't crashed or reported ANY ERRORS whatsoever in the EVENT VIEWER for five days now. Thanks for all the great tips... Keith Buchanan
AplusNetworking.com
Commercial I.T. Networking & Wiring Specialist
keith@aplusnetworking.com
 
You have an on-board NIC in that Server???
That's sacrilege!!!!
Go with separates... an on-board failure can wreak havoc on other on-board components VERY easily.

just being a "Mom" on this one - NO SERVER SHOULD EVER HAVE AN ON-BOARD NIC!!!
 
I realize that. All HP NETSERVERS come with a built-in 3Com on-board NIC, which THEY RECOMMEND using.... go figure. Anyone else out there with HP NETSERVERS should consider doing the same as I did.... no regrets here... Keith Buchanan
AplusNetworking.com
Commercial I.T. Networking & Wiring Specialist
keith@aplusnetworking.com
 