pool disappear in sp4

Status
Not open for further replies.

twomargaritas

IS-IT--Management
Feb 28, 2002
14
GB
We have NetWare 6.5 SP4. Suddenly, on Saturday, all our data pools were wiped out. SYS is still there, but all the rest have gone. It does not seem to be a hardware issue, as SYS itself has NSS partitions spread over two hard drives. The rest of the partitions are there but are not used by any of the pools I originally created. I cannot do any maintenance on the pools, as NetWare does not recognise them when I call them up. When I list the pools on the console I just get SYS and _ADMIN. This is pretty scary. I can still see the pools in ConsoleOne but cannot do anything there, as all the management is done through iManager. In iManager I just get the two active pools and the partitions as they were originally set up, but with no link to any pool or volume.

No backup was running when the pools disappeared. Has anyone got a clue what might have happened and how I can get the pools back?

Thanks


Eugenia

 
I would get on the phone to Netware support immediately. SP4 and 4a have NSS pool issues, fixed by 4b.
 
Are these pools on other hard drives, possibly attached to another controller? Maybe the controller driver isn't loaded? I dunno. I agree, get Novell involved. But understand that if it's gone, it's gone. They will tell you to restore from backup before they spend hours trying to fix it (if it's even possible to fix).

You should also look at the storage from NSSMU, the server-based utility. It will give you a more realistic picture of what is going on.

Marvin Huffaker, MCNE
 
Thanks for the posts and advice. Controller is OK.

The fact is that this is not the first time I have had pool issues with a service pack. It seems that Novell wants to keep us in a job, but try telling your users and employer that, oops, it was yet another service pack that caused this. After last time, when I had a complete disaster after installing one, I am even more cautious than normal: I put off the installation for as long as I could and obviously had a disaster recovery plan in place, praying that I could get on with the rest once it finished.

I am not saying that I would turn to Windows, as that would be silly, but I am actually tired of working nights because of bad service packs. NetWare is a good product and I would not leave GroupWise for anything else in the world.

The concern I have is that even if I restore from backup, what is to say that this won't happen tomorrow?
 
I work on hundreds of servers every year and I have never lost any data due to a service pack. It's not fair to make a statement that it's Novell's fault.

Something else is wrong here. Poor planning? Poor hardware setup? Trying to save a buck on hardware costs by using low end drives and spanning the volumes? What do you expect? Would Windows run any better on the same hardware? I doubt it.

NetWare is only as reliable as the hardware it runs on. A lot of NetWare servers run for years without problems, and then only fail when the hardware fails.

Not trying to be a rag here, but I see this all the time. Companies try to save money on IT by putting in workstation class systems as servers. Or putting in the cheapest server they can. And then when it has problems they wonder why and try to blame the OS.

Put some money into a decent server-class system from a reliable name-brand vendor with RAID (and I don't mean SATA RAID) as well as redundant parts. It will cost you more up front, but save you money in the long run. The cost of downtime is much higher than what you will spend on the server.





Marvin Huffaker, MCNE
 
twomargaritas said:
The concern I have is that even if I restore from backup, what is to say that this won't happen tomorrow?
You could either install SP4b or SP5 for a start. That is IF the problem is the SP as Marvin pointed out. You need Novell's help figuring that out.

Every (N)OS manufacturer runs into odd problems with patches, and Novell has certainly had their share in the last couple of years. Of course you should have a full Disaster Recovery plan in place before altering your system libraries, even with Netware.

Marvin said:
(and I don't mean SATA raid)
Let me second that. We have been badly burned by SATA on the desktop; I wouldn't put one in a server on a bet.
 
Just for clarity, did you upgrade the storage drivers during an SP install? Or was everything fine when you left, then bad when you returned, with no configuration changes? Did you check your sys$log.err and vol$log.err to see if there was a reboot, or some indication that another person was monkeying around at the server console?

A call to Novell tech support would be well worth it IMO. Best of luck to you.
 
I did not really mean to offend any of the NetWare supporters. In fact I am one of them. Some replies are due, though, as I think some of the comments made are not justified. We are actually the ones who do not save on hardware, nor do we buy cheap. It is not worth it. One of the drives was new but not bought cheaply, nor do we span volumes. A few years ago we also bought a top-of-the-range RAID system, again not on the cheap, whose controller failed so often that I was left with nothing every time. After a few repairs (still not on the cheap) I ditched the whole thing. I may well have been very unlucky.

My aim was just to understand the cause so that it would not happen again. If it is a patch I will repatch; if it is hardware I will replace it. It is just depressing sometimes to spend your nights dealing with issues which might be caused by poorly written software. I am sorry to say, though, that I am not the only one who has had disasters because of this. And it happens to NetWare too, I am afraid.

As to running Windows on the same hardware, we actually do. All we have are Dell servers (not cheap ones), both for NetWare and for Windows. We use Windows servers just for some applications, though, as those are not compatible with NetWare. Again, I would not turn my NetWare server into a Windows one under any circumstances, for the reasons we all know, although our servers have run rather happily for years (we obviously do not have Exchange).

As to the real pool issue, it was sudden, at 9 am. No-one tampered with the server. However, I might have an answer, as I managed to find some time to check the server properly after the immediate emergency of recovering the data to my backup server.

One of the drives is not active anymore. If that drive failed, and the partitions of the pools which disappeared were ONLY on the failing one, the system would have retained SYS (which did not have any partitions on the failing drive), but all the other pools, whose partitions were on that drive, would have gone. I logged in to report my findings, which is only fair. This is a supposition, but I think a pretty plausible one. Correct me if I am wrong.
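
The reasoning above can be checked with a small model. This is only a sketch, assuming a pool goes offline as soon as any drive holding one of its partitions fails (no mirroring or RAID underneath); the pool names, drive names and layout below are hypothetical, not taken from the actual server.

```python
def surviving_pools(pool_drives, failed_drives):
    """Return the names of pools that have no partition on a failed drive.

    pool_drives maps a pool name to the list of drives holding its partitions.
    Assumption: losing any one of those drives takes the whole pool offline.
    """
    failed = set(failed_drives)
    return sorted(
        pool
        for pool, drives in pool_drives.items()
        if not failed & set(drives)
    )


# Hypothetical layout: SYS avoids disk2, the data pools live only on disk2.
layout = {
    "SYS": ["disk0", "disk1"],
    "DATA": ["disk2"],
    "USERS": ["disk2"],
}

print(surviving_pools(layout, ["disk2"]))  # ['SYS']
```

With this layout, losing disk2 leaves only SYS, which matches what the console showed. A pool with partitions on both a healthy and a failed drive would also be lost under the same assumption.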

All our data has been recovered through backup. I have a good backup procedure and it has worked until now.

Sorry again if I have upset anyone; there was no intention, just tiredness. Understanding the real cause of an issue is important to me, if only for prevention (if that is possible). Having a family and being pregnant probably does not help with feeling miserable occasionally, especially on such occasions.

Thanks for all the help. I mean it.

 
twomargaritas said:
This is a supposition, but I think a pretty plausible one.
Agreed.

A RAID array would've saved you from this. I'm surprised that you've had trouble with RAID on the Dells. I've got a dozen PE2650s and PE2850s running NetWare, Windows, and Linux. All use the standard-issue PERC RAID controllers Dell sells, all without problems. They're not the fastest, but they've been very reliable for me.
 
Just a couple more notes, I don't want to sound like I'm harping on this.

I've seen RAID controllers fail and take out the system. Catastrophic. It happens. RAID isn't 100%, but it's better than not having anything.

A good practice with any service pack is to wait 30-60 days past any new release. Watch the Novell forums during that time and you will soon get a good idea of whether there are any serious problems. If there are, Novell usually responds with a post-SP patch, like they did with SP4. Or they will put out a post-SP NSS patch, if the issue is with NSS.

There was a specific documented problem with SP4 that could cause NSS pool failure IF you ran a Pool Rebuild. This was found shortly after SP4 was released, and Novell announced it and fixed it. They are very good about responding to problems found, and I give them credit for that. Some vendors won't even acknowledge problems.



Marvin Huffaker, MCNE
 