Resource Failover problem

Status
Not open for further replies.

Wazzer, MIS, GB, Aug 6, 2001

We have two Compaq 8500R servers connected via Fibre Channel to a StorageWorks 4100 RAID array, running Windows NT4 SP6a. I am currently commissioning this cluster and have a reliability problem when failing over resources. When I fail over a group from cluster node 1 to cluster node 2, it usually fails over without a problem. Occasionally, however, a group fails at the disk resource level, and Disk Administrator on node 2 cannot see the disk.

Shutting down node 1 so that all resources fail over to node 2 does not help: the failed disk resource still will not come online, although all other resources fail over fine, and Disk Administrator still cannot see the drive. The only way to get it to fail over properly is to evict node 1 from the cluster, after which the failed resource automatically starts correctly on node 2. This suggests that the cluster service on node 1 isn't relinquishing control of the disk resource so that node 2 can take it over, and that only evicting the node forces it to release the resource.

This can happen with any of the disk resources and on either node, so the problem is not down to an individual resource or a particular node.

Has anyone come across this problem, or can anyone offer a solution?
 
Just a thought, but I had trouble with failover on my Windows 2000 Advanced Server cluster and found that the shared resource had write cache enabled on the RAID controller. This caused the controller to hang on to the resource under heavy load.
 
Thanks for that. The RA4100 controller was set to 50% read and 50% write cache by default. I've now set it to 100% read and 0% write cache and will test during the day.
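
Because the failure is intermittent, a day of testing is easier to judge if every failover attempt is logged and tallied. Below is a minimal Python sketch of such a test loop, not a tested tool for this cluster. The function names, the group and node names, and the `move_group` callable are all hypothetical: `move_group` stands for whatever you actually use to move a group and check that it came online, and it is assumed to return True on success and False on failure.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# (round number, group name, target node) for each failed move
Failure = Tuple[int, str, str]


def run_failover_trials(
    groups: Sequence[str],
    nodes: Sequence[str],
    move_group: Callable[[str, str], bool],
    rounds: int = 10,
) -> List[Failure]:
    """Move every group back and forth between the nodes and record failures.

    Assumes all groups start on nodes[0], so the first round moves them to
    nodes[1], the next round moves them back, and so on.
    """
    failures: List[Failure] = []
    for round_no in range(rounds):
        # Alternate the target node on each round.
        target = nodes[(round_no + 1) % len(nodes)]
        for group in groups:
            # move_group is supplied by the caller (hypothetical helper).
            if not move_group(group, target):
                failures.append((round_no, group, target))
    return failures


def summarize(failures: Sequence[Failure]) -> Tuple[Counter, Counter]:
    """Count failures per group and per target node."""
    by_group = Counter(group for _, group, _ in failures)
    by_target = Counter(target for _, _, target in failures)
    return by_group, by_target
```

Running the same number of rounds with the write cache at 50% and again at 0% gives two failure counts to compare. The per-group and per-node tallies also show whether the failures really are spread across all disk resources and both nodes, as described in the original post.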
 