
Mixed 4.11/5.0 tree possible problem

Status
Not open for further replies.

philwhite

IS-IT--Management
Nov 22, 2001
GB
Can anybody please help with this problem?

I have spent hours looking on various web sites for a solution, but I have found very little that helps.

This is the scenario

Mixed NW4.11/5.0 tree with DS v6.11 +v7.55

19 NW4.11 servers & 9 NW 5.0 servers

ZEN 2.0 starter pack


This is the problem

About a week ago an NW5 server crashed because of a power supply failure. The server is at a remote site across a WAN link and serves about 20 users for file and print. While it was down, many users at our main site and at remote sites had problems. The problem was as follows: -

NT workstations running client v4.70 ran the login script OK, but when the NAL application launcher started it would run for anything up to 30 minutes, with the spinning "radar" icon visible in the system tray the whole time. Application icons did not appear on the desktop and the Start menu was not available until the NAL Application Explorer had finished.

The symptoms are very similar to those described in TID 10050790.

We got the server back up after about two hours. During this time many of our users were affected by the above problem.

Once the server was up I let the DS settle and ran a few SET DSTRACE=*H commands to force synchronisation. All was then fine, with users logging in OK.

Later that same day, the server crashed again and the same symptoms reappeared. We managed to get the server back up and working the next morning and logged a hardware call to replace the PSU.

After doing some investigation of my own this is what I have found: -


Those users affected were loading NAL applications from one common container that is partitioned off in our NDS – called Applications.users

The replica ring for this is as follows

1 x Master - NW4.11 - Derby main site
2 x R/W - NW4.11 - Derby main site
1 x R/W - NW4.11 - Ashford remote site - slow WAN link
1 x R/W - NW4.11 - Crewe remote site - slow WAN link
1 x R/W - NW4.11 - Plymouth remote site - slow WAN link
1 x R/W - NW5.0 - Ramsgate remote site - slow WAN link (the problem server)
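
To make the shape of this ring easier to reason about, here is a minimal sketch that writes it down as data and counts how many replicas sit behind slow links. The server labels (derby-1, ramsgate-1 and so on) are hypothetical placeholders, not the real server names, and this is only a bookkeeping model, not anything that talks to NDS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    server: str      # hypothetical label, not a real server name
    site: str
    role: str        # "Master" or "R/W"
    os: str          # NetWare version as given in the post
    slow_wan: bool   # True if the replica sits across a slow WAN link


# The replica ring for Applications.users as listed in the post.
RING = [
    Replica("derby-1", "Derby", "Master", "NW4.11", False),
    Replica("derby-2", "Derby", "R/W", "NW4.11", False),
    Replica("derby-3", "Derby", "R/W", "NW4.11", False),
    Replica("ashford-1", "Ashford", "R/W", "NW4.11", True),
    Replica("crewe-1", "Crewe", "R/W", "NW4.11", True),
    Replica("plymouth-1", "Plymouth", "R/W", "NW4.11", True),
    Replica("ramsgate-1", "Ramsgate", "R/W", "NW5.0", True),
]


def summarize(ring):
    """Return simple counts describing the ring."""
    return {
        "total": len(ring),
        "masters": sum(r.role == "Master" for r in ring),
        "across_slow_wan": sum(r.slow_wan for r in ring),
        "nw5": [r.server for r in ring if r.os == "NW5.0"],
    }


if __name__ == "__main__":
    print(summarize(RING))
```

Four of the seven replicas of the partition that holds the application objects are across slow WAN links, and the only NW5.0 replica in the ring is the server that failed.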

Can anybody suggest why this problem may be occurring when the NW5.0 server goes down? I am concerned that this server could fail again in future and cause more problems.


Any help would be much appreciated


Phil White – CNE
ITnet plc
 
I think it is the design of your NDS tree that's causing this issue.

First, in your login script, are you specifying where NAL runs from? If you have all WIN32 desktops, we've found it faster to run Z:\NALWIN32 than to run NAL and have *it* decide which platform to run...

When you go to Details in your NAL window, is the container the same for all of your sites ?

We have 22,600 users across the nation. We have local app servers (3 to load balance) and all of our apps are local. It sounds like your apps are pulled from over the WAN?

Also, on the router side, we are blocking SAP packets EXCEPT for the master replica amongst all of our Novell servers *that are not local*...this is different for each region of course.

By doing this on the router side...clients will log in local to any of the local servers with replicas... we of course have a master replica of our "regional container" and the read/writes are on the servers with the "fastest" WAN links.

So...we allow replica ring synchs with the master replicas, we block everything else, but we also have our NDS tree setup such that we have a regional container and LOCAL apps on local servers, etc...

BUT, with budgets, you may be doing what you describe above...

I'd block SAP packets EXCEPT for the fastest WAN links...this forces clients to look only at the fastest servers they can see when their local (remote-site) server is down...otherwise it appears they are going over the slow WAN links..
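
As a rough illustration of that idea, here is a toy model of which servers a client could end up choosing with and without the SAP filter. Everything in it is an assumption made for the sketch: the server names are hypothetical, each server is simply tagged "local", "fast" or "slow" relative to the client, and the filter is modelled as hiding the "slow" ones. It does not reflect how a real router filter is configured.

```python
# Hypothetical servers as seen from one client; "link" is relative to that client.
SERVERS = [
    {"name": "local-1", "link": "local", "up": False},  # the failed local server
    {"name": "hub-1", "link": "fast", "up": True},
    {"name": "hub-2", "link": "fast", "up": True},
    {"name": "branch-1", "link": "slow", "up": True},
    {"name": "branch-2", "link": "slow", "up": True},
]


def candidates(servers, sap_filtered):
    """Names of servers the client could pick in this toy model.

    With sap_filtered=True, servers behind slow links are assumed to be
    hidden from the client because their SAP advertisements are blocked.
    """
    allowed = {"local", "fast"} if sap_filtered else {"local", "fast", "slow"}
    return [s["name"] for s in servers if s["up"] and s["link"] in allowed]


if __name__ == "__main__":
    print("unfiltered:", candidates(SERVERS, sap_filtered=False))
    print("filtered:  ", candidates(SERVERS, sap_filtered=True))
```

With the local server down, the unfiltered client can still land on a server behind a slow link, while the filtered client only ever sees the fast-link servers.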

Hope that helps!
Dean!
 