CPU Hog Abend Running process: NWMKDE.NLM

Status
Not open for further replies.

snorkel

MIS
Mar 26, 2002
US
Here's my abend problem. Netware 6.5 SP1A on IBM Netfinity server (Pentium 3 866 with 1.0 GB RAM) and Arcserve 9.01.

No problems before upgrade from Netware 5.0 and Backup Exec. Now, every night (almost) during backup, I get the abend noted below. Hard crash. A memory test of the server shows that it is good. We have tried three different tape drives (2 LTO and 1 AIT) and it makes no difference.

The server is fully patched and so is Arcserve, so I don't know what to try next. Any suggestions welcome.

Server ES-NIAG halted Saturday, March 6, 2004 7:33:36.186 pm
Abend 1 on P00: Server-5.70-1937: CPU Hog Detected by Timer

Registers:
CS = 0008 DS = 0010 ES = 0010 FS = 0010 GS = 005B SS = 0010
EAX = FCE1767D EBX = 00000003 ECX = F8D3A520 EDX = B2444C9E
ESI = B25278C4 EDI = B2445E8C EBP = B03C4980 ESP = B2527494
EIP = 00000000 FLAGS = 00000097


Running process: NWMKDE.NLM 20 Process
Thread Owned by NLM: NWMKDE.NLM
Stack pointer: B25274CC
OS Stack limit: B2522B60
Scheduling priority: 67371008
Wait state: 5050030 Blocked on Semaphore
Stack: --FCE1767D ?
BFA1C2D6 (NWMKDE.NLM|(Code Start)+52D6)
--B25278C4 ?
--B25278C4 ?

Additional Information:
The NetWare OS detected a problem with the system while executing a process owned by SERVER.NLM. It may be the source of the problem or there may have been a memory corruption.


John
 
John, change your CPU HOG Setting to 0. By default, the server is set to abend after 1 minute if a process is hogging the CPU -- a very common thing with backup software. Sometimes it's just in the middle of an intense process and the server thinks it's hung, so it abends.

You do this in Monitor -- Server Parameters -- CPU HOG TIMEOUT AMOUNT. Set it to 0 to disable.

Once you have done that, if there are other problems going on, you will avoid the CPU HOG abend and be able to find the real problem. But I think you'll be okay.


Marvin Huffaker MCNE, CNE
Marvin Huffaker Consulting
 
I'll ask my consultant what he thinks of that suggestion before implementing. There was also a new Arcserve patch that I will apply. Thanks for the suggestion. By the way, that parameter is under "Miscellaneous" in Netware 6.5

I'll reply back if I accept that suggestion and if it works!!

John
 
John, you are correct, it's under Miscellaneous. Something was wrong with my keyboard the other day and it randomly omitted entire words as I was typing them in. That setting is also on NetWare 6 and 5. Might even be on 4. It's been around for a while.

You should also probably add some more RAM. 1GB on NetWare 6.5 is a little low. Nothing to do with your current problem, but NW6.5 is a mem hog compared to previous versions.

Marvin Huffaker MCNE, CNE
Marvin Huffaker Consulting
 
I had the same problem using ARCServe 9 and TapeWare 7.0. My backup went fine for 26 in ARCServe, and twice in TapeWare it quit at 17%, at the exact same file, "srdetail.dbf". I set CPU HOG TIMEOUT AMOUNT to 0, and the next time the server abended and then shut off. Again at 17%, same file. It can't get past that file.
What is the best way to find out more info on what actually caused the abend and shutdown during tape backup?

Thanks,
 

TID10090829

Setting CPU Hog Timeout Amount to 0 can be dangerous.

Regards.
 
Fedayn, you should elaborate on why you think it's dangerous. Otherwise I would consider it hearsay.
Preventing the server from abending on a CPU HOG allows the system to continue to operate. If there is another source of the problem, it will allow that problem to surface. If it was simply a processor-intensive operation, such as a backup, then the server should get past it and finish normally.

Captkirk, if the server still abended after you made this change, you should check the abend.log file and see if it gives you any more clues as to the problem.
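
For anyone who would rather pull the key facts out of the log than read it by eye, here is a minimal Python sketch. It assumes the entries look like the excerpt posted at the top of this thread. The function name `summarize_abends` and the field names are made up for illustration and are not part of any Novell tool.

```python
import re

# Patterns based only on the abend excerpt posted in this thread (an assumption
# about the general log layout, not a documented format).
HALT_RE = re.compile(r"^Server (\S+) halted (.+)$")
ABEND_RE = re.compile(r"^Abend (\d+) on (P\d+): (.+)$")
PROC_RE = re.compile(r"^Running process: (.+)$")
OWNER_RE = re.compile(r"^Thread Owned by NLM: (\S+)$")


def summarize_abends(text):
    """Return one dict per abend entry found in the log text."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = HALT_RE.match(line)
        if m:
            # A "Server ... halted ..." line starts a new entry.
            current = {"server": m.group(1), "halted": m.group(2)}
            entries.append(current)
            continue
        if current is None:
            continue
        m = ABEND_RE.match(line)
        if m:
            current["abend"] = m.group(3)
            continue
        m = PROC_RE.match(line)
        if m:
            current["process"] = m.group(1)
            continue
        m = OWNER_RE.match(line)
        if m:
            current["owner_nlm"] = m.group(1)
    return entries


# Lines copied from the first post in this thread.
SAMPLE = """\
Server ES-NIAG halted Saturday, March 6, 2004 7:33:36.186 pm
Abend 1 on P00: Server-5.70-1937: CPU Hog Detected by Timer
Running process: NWMKDE.NLM 20 Process
Thread Owned by NLM: NWMKDE.NLM
"""

if __name__ == "__main__":
    for entry in summarize_abends(SAMPLE):
        print(entry)
```

Feed it the log text copied off the server. If the same NLM shows up as the owner in entry after entry, that is a reasonable place to start looking.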



Marvin Huffaker MCNE, CNE
Marvin Huffaker Consulting
 
After talking with TapeWare support Friday, they suggested I install their SP3B, which I did. They said they support 6.5 with SP3B. No change: the server ABEND happened at the same point in the backup, 17%, and on the same file.
Then I tried removing the folder that contained the suspect files at 17% from the backup job. After that the backup completed successfully, twice in fact. I didn't know files could cause the server to ABEND during a backup.
My 6.5 server was abending and then restarting itself with the CPU HOG set to 0.
Anyway, my new 6.5 server is now running smoothly and finished the first day with usual traffic. I did have to upgrade the firmware on 3 HPLJ5M printers to get them to work, but they were at version 5, now 8.49.
One question - Tapeware support said I should not use NWTAPE.CDM but NWASPI.CDM instead. They told me to remove NWTAPE. When I removed it, the server did an ABEND as it was looking for some way to connect the tape drive. So I put it back. It seems NWTAPE connects the Seagate Scorpion 40 DAT drive to NetWare and NWASPI connects the Tapeware 7.0 software to the tape drive.
It is working fine with both in: NWTAPE before the HAMs in Startup.ncf and NWASPI in Autoexec.ncf just before TWAdmin, which starts Tapeware.

Is my thinking right?
 
In response to Marvin Huffaker's original suggestion, I set the CPU HOG TIMEOUT AMOUNT to 30 minutes and that fixed my problem. I figured that 0 would mask the problem while 30 would allow for taking all but a "real" CPU HOG problem out of the equation. If I was real smart (or brave), I would ratchet down the time back towards 1 minute to see what the threshold is.
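
The ratcheting idea can be done as a bisection instead of stepping down one minute at a time. A rough Python sketch, assuming the backup fails at every timeout below some threshold and completes at every timeout at or above it. `backup_completes` is a hypothetical stand-in for "set the timeout, run the nightly backup, see whether the server survived"; it is not a real API.

```python
def find_min_working_timeout(fails_at, works_at, backup_completes):
    """Bisect between a timeout known to fail and one known to work.

    fails_at / works_at are in whatever unit the timeout is set in
    (minutes, as described in the post above). Returns the smallest
    value found for which backup_completes(value) is true.
    """
    lo, hi = fails_at, works_at  # invariant: lo fails, hi works
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if backup_completes(mid):
            hi = mid  # mid works, so the threshold is at or below mid
        else:
            lo = mid  # mid fails, so the threshold is above mid
    return hi


if __name__ == "__main__":
    # Pretend threshold of 7, purely for demonstration.
    print(find_min_working_timeout(1, 30, lambda minutes: minutes >= 7))
```

Each probe costs one night's backup, so going from "1 fails, 30 works" to the threshold takes about five nights rather than up to twenty-nine.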
 
CAPTNKIRK, I don't know about Tapeware specifically, but in most backup software products, they use their own tape device driver and do not use NWTAPE. I usually delete it to prevent it from auto-loading. It should be located at C:\NWSERVER\DRIVERS. Also, NWASPI would usually go in your STARTUP.NCF.

Another thing about TapeWare: I've never supported it, only seen it running on a server. But it looks almost identical to NovaNet's backup product. In NovaNet's case, any time a Novell patch is applied, it breaks NovaNet. Usually a patch comes out a few days later to fix NovaNet. That little bit of knowledge can save a lot of time wasted on troubleshooting the product.

Marvin Huffaker MCNE, CNE
Marvin Huffaker Consulting
 