I've been having a bit of a nightmare with our primary SQL cluster recently.
Following some firmware upgrades to both servers, one node has decided it wants a divorce: it refuses to start the cluster service and generates an error each time. The message is nothing useful, simply 'Cluster service refused to start' or something along those lines.
Having researched this, I was able to check the registry and the cluster service files, and everything seemed to check out. There were also a very few temp files, which I removed as a test, but the clean-up had no effect either.
Eventually a colleague and I decided to evict the node and re-add it. I tried this over the past weekend with no success.
Every time I now try to add the node back into the cluster, I get the following errors in the cluster.log:
00000d18.00000e1c::2010/02/15-12:29:12.888 ERR [NM] Unable to synchronize node information, status 1726.
00000d18.00000e1c::2010/02/15-12:29:12.888 ERR [CS] ClusterInitialize failed 1726
00000d18.00000e1c::2010/02/15-12:29:12.888 ERR [CS] Service Stopped. exit code = 1726
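For what it's worth, status 1726 is the Win32 code RPC_S_CALL_FAILED ("The remote procedure call failed"). In case it helps anyone sift a longer cluster.log for the same thing, here is a rough Python sketch that pulls out the ERR lines and the trailing status code. The regular expression is only inferred from the three lines above (I am assuming the leading hex pair is process ID and thread ID), and the function name and the code lookup table are my own, not part of any cluster tooling.

```python
import re

# Line layout inferred from the excerpt above (an assumption, not a spec):
#   <pid-hex>.<tid-hex>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> [<COMPONENT>] <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<timestamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<component>[^\]]+)\]\s+(?P<message>.*)$"
)

# A number at the very end of the message, optionally followed by a full stop.
CODE_RE = re.compile(r"(\d+)\.?\s*$")

# Hypothetical lookup table; only the one code seen in this log is listed.
KNOWN_CODES = {1726: "RPC_S_CALL_FAILED (the remote procedure call failed)"}

SAMPLE = """\
00000d18.00000e1c::2010/02/15-12:29:12.888 ERR [NM] Unable to synchronize node information, status 1726.
00000d18.00000e1c::2010/02/15-12:29:12.888 ERR [CS] ClusterInitialize failed 1726
00000d18.00000e1c::2010/02/15-12:29:12.888 ERR [CS] Service Stopped. exit code = 1726
"""


def parse_cluster_log(text, level="ERR"):
    """Return one dict per log line whose level matches `level`."""
    entries = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match or match.group("level") != level:
            continue  # skip lines that do not fit the assumed layout
        entry = match.groupdict()
        code = CODE_RE.search(entry["message"])
        entry["code"] = int(code.group(1)) if code else None
        entries.append(entry)
    return entries


if __name__ == "__main__":
    for e in parse_cluster_log(SAMPLE):
        meaning = KNOWN_CODES.get(e["code"], "unknown")
        print(e["timestamp"], e["component"], e["code"], meaning)
```

To run it against a real log, read the file into a string and pass it to parse_cluster_log in place of SAMPLE.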
I've checked connectivity and the NIC binding order, and all seems to be OK. Does anyone have any suggestions on where to take this next?
