However, the way a cluster is built with IBM ServeRAID is quite different from the others.
Typically the host has just a plain SCSI card, not a RAID one: the RAID controller sits in the storage enclosure. It is the storage that knows the array layout and applies the RAID policy
(i.e. disks 0 and 1 in RAID-1, disks 2, 3, 4 in RAID-5).
With ServeRAID cards (which are not simple SCSI adapters but also have RAID capability), BOTH cards MUST know the same ARRAY LAYOUT.
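As a toy sketch (plain Python, not IBM code, with invented names), the point can be modeled like this: the array layout also lives on the drives themselves, so a partner or replacement card can read it off the disks, and failover is only safe once both cards hold the same layout.

```python
# Illustrative model only: why BOTH ServeRAID cards must know the array layout.
# The layout metadata is stored on the drives, so a card that knows nothing
# (a new or partner card) can import it from the disks.

# Array layout as in the example above: disks 0,1 mirrored, disks 2,3,4 in RAID-5.
layout_on_disks = {
    "array_A": {"level": "RAID-1", "members": [0, 1]},
    "array_B": {"level": "RAID-5", "members": [2, 3, 4]},
}

class Controller:
    def __init__(self, name):
        self.name = name
        self.layout = None  # a fresh card knows nothing about the arrays

    def import_from_disks(self, disks):
        # Read the layout metadata off the drives themselves.
        self.layout = dict(disks)

card_a = Controller("ServerA")
card_b = Controller("ServerB")
card_a.import_from_disks(layout_on_disks)
card_b.import_from_disks(layout_on_disks)

# Failover between the two cards is only safe if they agree on the layout.
assert card_a.layout == card_b.layout
print("both cards agree on the array layout")
```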
Well, when a failover happens (suppose you reboot host2),
a SCSI bus reset is sent and host1 takes control of the virtual disks. When host2 comes back up, its ServeRAID card
sees the disks as lost at BIOS level, because the OS is not yet loaded and the IBM Windows ServeRAID device driver has not started. Once the OS is up and the IBM part of the cluster solution starts, everything returns to normal.
I don't know what your situation is: are you in production
or in a test phase? Tell us also the BIOS level of the card,
the version of the Windows device driver, and the type/model of the servers.
Some years ago I wrote something about this in faq491-3942.
The IBM link inside it is no longer available.
You can download CD version 6.0 (version 7 sometimes does not work
with ServeRAID 6? cards on some servers).
Thank you very much for your comments. I am especially interested in the point you make about BOTH cards needing to know the configuration of the array sub-system.
Are you saying that when configuring, both ServeRAID cards should be 'logically defined' for physical disks, arrays, and logical disks? The documentation refers ONLY to Server A in this regard; it does not mention Server B as requiring this setup.
I am reading your material carefully... thanks,
John
Hi John,
while you read the docs and browse the sites,
please send us the missing info:
- storage type and disk layout (5 HDs: 0,1 = RAID-1; 2-4 = RAID-5)
- servers' model/type (IBM 3650 9999-4HG, HP DL380...)
- ServeRAID firmware level
- ServeRAID device driver version
- Is this a fresh setup, or are you in production with user data?
- Did you install everything yourself, or are you the customer?
New clean installation... it is to be a test environment for MS Exchange stuff.
- 2 IBM xSeries 336
- 2 LSI RAID integrated Controller 1 (2 channels)
- 2 ServeRAID 6M Controller 2 (2 channels)
- EXP 400
- 5 Disks, 3 Logical, 2 RAID-1, 1 RAID-0
- Q Disk Quorum is on Logical Disk 18 GB (RAID-1)
- M Disk (Exchange) on Logical Disk 70 GB (RAID-1)
- N Disk (RAID-0)
- ServeRAID Device Drivers are 7.12.11
- ServeRAID ROM flashed to 7.12.13
- ServeRAID Application = 9.0
- IBM did the initial installation in Japan.
- We flattened it and did a reinstall for technical reasons
- We are highly technical operating-system-level developers working in satellite and cellular network software development... meaning that we should be able to do this.
We are using Channel 2 on the 6M controller. I read somewhere that Microsoft Clustering has difficulty with LUN identification on the second channel, but I cannot find that document now.
I notice that you make the statement "Choose copy configuration from disks" with regard to Server B.
This implies that BOTH cards (A and B) need to have the logical disk definitions in their settings. The copy from disk must be similar to typing the configuration back into Server B, but there is probably additional information that comes with it.
This could be the difference from what we have done.
One other perceived difference... the IBM help wizard says:
"Enter a merge-group number for the shared logical drive. This merge-group number must be unique from the partner controller (that is, the merge-group number for any logical drive belonging to the partner controller cannot be the same merge-group number)."
--
Does this mean that if I have on Server A, Merge Group numbers of 1,2,3 for three logical drives... then on Server B the merge group numbers need to be NOT 1,2,3?
--
As we have not recently explicitly 'configured' Server B for logical drives, we have not validated this statement; but using the ServeRAID application it is apparent that on BOTH nodes A and B the merge groups are the same set: 1, 2, 3.
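To make the quoted help text concrete, here is a tiny sketch (plain Python, not a ServeRAID tool; the drive names are hypothetical) that mechanizes the literal reading of the rule: no merge-group number used by one controller's logical drives may reappear on the partner's. Under that literal reading, the {1,2,3}/{1,2,3} situation observed on both nodes would be flagged; whether the ServeRAID application is showing per-controller ownership or simply the same shared view from both nodes is exactly the open question.

```python
def merge_group_conflicts(server_a: dict, server_b: dict) -> list:
    """Return (drive_a, drive_b, number) triples where a merge-group
    number is used by logical drives on BOTH partner controllers,
    which the IBM help text quoted above appears to forbid."""
    conflicts = []
    for drive_a, num_a in server_a.items():
        for drive_b, num_b in server_b.items():
            if num_a == num_b:
                conflicts.append((drive_a, drive_b, num_a))
    return conflicts

# Hypothetical drive names; merge-group numbers as observed on both nodes.
server_a = {"Q": 1, "M": 2, "N": 3}
server_b = {"Q": 1, "M": 2, "N": 3}

# Under the literal reading of the rule, every number collides.
print(merge_group_conflicts(server_a, server_b))
```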
As you have noticed, the Cluster button in ServeRAID Manager is active only if you boot from the ServeRAID CD.
The merge-group number is, I believe, a sort of LUN.
Step A: boot server1 from the CD, configure the disk arrays (I think 1 RAID-1 and 1 RAID-5); check that the adapter SCSI ID is 7; set your 2 volumes to shared mode (Cluster button). I don't remember if you can number the volumes (however, check that the RAID-1 is vol1 and the RAID-5 is vol2); enter ServerA as the server name and ServerB as the partner name.
Step B: let the disks synchronize; then shut down Srv1.
Step C: boot Srv2 from the CD, and import the configuration from disk.
(This operation is normally used when your card fails while the disks are good: the disks hold the correct array layout, while the new card, fresh from Adaptec, knows nothing about it.) It is as if you had migrated the storage from Srv1 to another, identical, Srv2.
Step D: the previous operation copied everything to the card, but some things are wrong and must be changed:
1) The card's SCSI ID is 7, while the SCSI ID of the 2nd card
must be different: set it to 6.
Warning: no disk can have SCSI ID 6 or 7: the corresponding slots on the DS400 MUST be empty
(you probably have no problem, because with 5 disks
you will have used 0, 1, 2, 3, 4).
2) You have to swap ServerA and ServerB between server name and partner name.
It is supposed to resync on failover. This is done to ensure data integrity. When failover occurs, the controller taking control of the other node's logical drives cannot know whether all the stripes are coherent. If there are incoherent stripes (parity doesn't match data) and a drive subsequently fails, the rebuild operation will rebuild data based on incorrect parity and thus rebuild incorrect data. The resync closes that exposure.
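The parity-coherence exposure described above can be shown with a toy XOR example (illustrative only: byte-sized "blocks" stand in for real stripes):

```python
# Toy RAID-5 stripe: two data blocks plus XOR parity.
d0, d1 = 0b1010, 0b0110
parity = d0 ^ d1                      # coherent: parity matches the data

# A write to d0 lands on disk, but the node dies before parity is updated:
d0 = 0b1111                           # stale parity -> incoherent stripe
assert parity != d0 ^ d1              # parity no longer matches the data

# Now the disk holding d1 fails and we rebuild it from d0 and the stale parity:
rebuilt_d1 = d0 ^ parity
assert rebuilt_d1 != 0b0110           # the rebuilt data is WRONG

# The post-failover resync recomputes parity first (d1 still intact at that
# point), closing the exposure before any rebuild can use bad parity:
parity = d0 ^ d1
assert (d0 ^ parity) == d1            # a later rebuild would now be correct
```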
I believe I understand the process. There is only ONE outstanding question that I have.
The ServeRAID 6M does support IMPORT from Disk in Cluster MS Windows 200x mode.
So I am wondering how I create the Logical Disks on Server B. The choices that I can see would be:
1: Manually re-create as performed previously on Server A.
2: Some other method of import or copy...
But I do not see how #2 can be achieved, and I am also not sure whether approach #1 is valid. I mean, the second controller may need to 'copy data' in some way, as opposed to using a local configuration that is coincidentally the SAME as on Server A.
--
QUESTION:
Can you be specific on how to configure Server B relative to that of Server A?
--
I have discovered that the IBM documentation did not say to wait for (and ensure) that Server A had synchronized before rebooting. I believe that this was the problem (or one of them) that we experienced.
OK, I imported the Server A configuration via "Copy from Disks", but this could only be performed from the controller's BIOS ServeRAID application (that is, NOT from the IBM Support CD, as there was no functionality of this sort on the CD).
--
The Cluster is installed... and re-synching now...
--
Question:
When the second node of the cluster is JOINING an existing node... does THAT node require ownership of the DISKs? I decided to leave ownership of the disks with the operational cluster node (A) while (B) was joining. [But there is no reference in your procedure, or in IBM's, on whether the joining node needs ownership of the disks during this process.]
--
Question:
Does EACH node have to re-synch the disks?
Is there ONE disk cluster allocation table, or does each controller node have its own?
I am trying to understand how much 'synchronizing' of disks is required. I cannot find anything on this aspect in the Adaptec/ServeRAID documentation. The cluster looks fully functional, but I believe the RAID may not be, as I would have thought that synchronizing the disks would be performed only once.
I do not think that the Microsoft cluster creation wizard knows about the ServeRAID system, which is why it cannot find 'disks' that qualify as cluster devices. When running the wizard, the EXP400 disks never flash... which suggests to me that Microsoft does not know they exist.
--
I will attempt to apply any thoughts that you have.
10:13:50: The Cluster Service failed to bring the Resource Group "Cluster Group" completely online or offline.
10:13:50: The Cluster Service is attempting to bring online the Resource Group "Cluster Group".
10:13:49: Cluster node C32 was removed from the active server cluster membership.
Cluster service may have been stopped on the node, the node may have failed,
or the node may have lost communication with the other active server cluster nodes.
Log - System
10:15:38: Cluster service could not join an existing server cluster and could not form a new server cluster. Cluster service has terminated.
10:13:50: Cluster resource 'IPSHA Disk Q:' in Resource Group 'Cluster Group' failed.
10:13:50: The Cluster Service failed to bring the Resource Group "Cluster Group" completely online or offline.
10:13:50: Cluster resource 'IPSHA Disk Q:' in Resource Group 'Cluster Group' failed.
10:13:50: The Cluster Service failed to bring the Resource Group "Cluster Group" completely online or offline.
10:13:50: Cluster resource 'IPSHA Disk Q:' in Resource Group 'Cluster Group' failed.
10:13:50: The Cluster Service failed to bring the Resource Group "Cluster Group" completely online or offline.
10:13:49: The Cluster Service is attempting to bring online the Resource Group "Cluster Group".
Notice that a Synchronize is attempted immediately (10:15:23) after the logical drive goes offline.
I would have thought that an offline drive would be brought online instead.
Hi,
even if you have imported (or copied) the configuration from
disk using the embedded BIOS, you still have to change the SCSI ID to 6
(probably you can use the BIOS again again); but to set host=ServerB
and partner=ServerA in the cluster menu, you need to use the ServeRAID CD. (I am sure there is also the entry "copy cfg from disk";
probably the label is not exactly that, but the sense is this.)
All these operations have to be done BEFORE beginning
the MSCS setup.
I have only two (2) clustered drives in this instance. Quorum on Q: and Mail Disk on M:.
Now... the Q drive can be seen from BOTH nodes, while the M drive can only be seen from the active node... using the Computer Management app.
The M: Drive is in a Resource Group by itself.
Is this a problem... does this disk need to be part of another group? MS Exchange is not installed yet.
Also, the Q disk must be in a separate resource group:
the cluster group.
About the SCSI IDs, I don't understand whether you have set
one to 6 and the other to 7.
I have in my archive a PDF about the 6M regarding cluster configuration.
You can find 3 PDFs in the "books" folder of the ServeRAID support CD.
On the web I have found the equivalent document for the 4x series, but the content is the same:
ftp://ftp.software.ibm.com/systems/support/system_x/19k6408.pdf, in "Controller considerations" on page 55.