NICs, SmartStart & MSCS, lengthy!

stonetown (MIS)
Sep 22, 2001
We have had problems recently. We are running Compaq ProLiant 3000 servers with Fibre Channel storage, NT 4.0 SP5 and MSCS. Each server has three NICs: two are teamed using the Compaq TLAN driver, and one is used for the cluster interconnect.

All was fine until we added an extra RAID cabinet (RA4100) and upgraded from SmartStart v4.6 to SmartStart v5.10. When the Compaq SSD is run from the SmartStart CD, the NIC and NetFlex drivers are upgraded along with the other drivers. After restarting the servers, the cluster will run OK on one node only (either node will run the cluster), but when the other node attempts to join, the whole cluster becomes unavailable. When the cluster is running on only one node, it reports that the other node's network is unavailable (I am not sure if this is normal).

Uninstalling and re-installing the cluster does not work, as the cluster is never properly formed on the first node. The only fix so far is a lengthy rebuild of the servers from scratch; SmartStart v5.10 works fine when the servers are rebuilt from scratch.

Does anyone have any experience of this, or know of a workaround? There are no TechNet articles or Compaq reports on this type of problem.
 
Just a guess:

Ensure the setting for the interconnect (which I assume is a 10BaseT crossover cable) is set to auto, or try half-duplex at 10 Mbit.
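
For what it's worth, a quick way to check that suggestion is to flood the crossover link with pings after forcing the speed/duplex and count the drops - a duplex mismatch usually shows up as intermittent loss. The sketch below is only illustrative: the 10.0.0.2 address is a placeholder for whatever the other node's heartbeat NIC actually uses, and it just shells out to the standard Windows ping command.

# Minimal sketch, assuming the other node's heartbeat NIC answers at a
# placeholder address of 10.0.0.2. Sends a burst of single pings via the
# standard Windows ping command and reports how many were dropped.
import subprocess

PEER = "10.0.0.2"   # placeholder interconnect address of the other node
COUNT = 50          # number of probes to send

def probe(host: str) -> bool:
    """Send one echo request; return True if a reply came back."""
    result = subprocess.run(
        ["ping", "-n", "1", "-w", "1000", host],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0

lost = sum(0 if probe(PEER) else 1 for _ in range(COUNT))
print(f"{lost}/{COUNT} probes lost on interconnect {PEER}")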
 
The interconnect is set up and pings OK; further investigation suggests the NIC drivers are probably not the cause. Another symptom is that clusdisk has problems starting from the command line. Much of the time one node will run the cluster OK, but when clusdisk is started on the second node, that server hangs and the cluster becomes inaccessible, requiring reboots of both nodes. So far I have rebuilt 3 clusters, with 4 to go. The next option is to remove the cluster, perform the upgrade, then re-install the cluster - easier said than done when it runs Exchange, TVD etc., and not guaranteed to work.
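
To narrow down which step actually wedges the node, it can help to script the manual start so a hang shows up as a timeout instead of a frozen console. This is only a rough sketch: it assumes the standard MSCS service names (ClusDisk for the cluster disk driver, ClusSvc for the cluster service) and an arbitrary two-minute timeout.

# Rough sketch of the manual start sequence: bring up the cluster disk
# driver first, then the cluster service, each via "net start" with a
# timeout so a hang is reported rather than wedging the console.
import subprocess

SERVICES = ["ClusDisk", "ClusSvc"]  # disk driver first, then the cluster service
TIMEOUT = 120                        # seconds to wait before declaring a hang

for name in SERVICES:
    print(f"Starting {name} ...")
    try:
        result = subprocess.run(
            ["net", "start", name],
            capture_output=True,
            text=True,
            timeout=TIMEOUT,
        )
    except subprocess.TimeoutExpired:
        print(f"{name} did not start within {TIMEOUT}s - this looks like the hang described above")
        break
    if result.returncode != 0:
        print(f"{name} failed to start:")
        print(result.stderr or result.stdout)
        break
    print(f"{name} started OK")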
 
The answer to this problem has been found:

Compaq Remote Insight Boards (RIBs) were also being installed. The RIB is a PCI bridging device and will affect the settings of any PCI cards that are in a higher-numbered slot.

The cluster servers are ProLiant 3000s running the HA/F200 cluster setup (Fibre Channel) with two FC HBAs (host bus adapters) each. If the RIB is in a lower-numbered slot than an HBA, the HBA loses its settings and is unable to connect to the shared storage when the other server already has control of it.

The answer is to install the RIB in a higher-numbered slot than the HBAs and run the F10 setup to reconfigure the hardware. Other PCI cards such as NICs still need to be reconfigured, but the HBAs are non-configurable, so this is the only way to do it.

Tim.
 