Veritas Volume Manager Problem

Status
Not open for further replies.

100mbs

MIS
Feb 14, 2002
142
US
OK, I had this same problem a few weeks ago, and a reboot of the server fixed it.

For some reason, every time I remove a disk for replacement, put the new disk in, and do the following, I get errors:

Run format and label the new disk, then exit.
Run vxdctl enable.
Run vxdiskadm.
Replace removed disk.

After running the above commands I get output like the listing below. I have run the commands a few times, and every time it adds more disk device entries.


Enter disk device or "all" [<address>,all,q,?] (default: all)

DEVICE   DISK       GROUP     STATUS
c1t0d0   c1t0d0     rootdg    online
c1t1d0   c1t1d0     rootdg    online
c1t2d0   c1t2d0     rootdg    online
c1t3d0   c1t3d0     rootdg    online
c1t4d0   c1t4d0     rootdg    online
c1t5d0   c1t5d0     rootdg    online
c2t0d0   c2t0d0     rootdg    online
c2t1d0   c2t1d0     rootdg    online failing
c2t2d0   c2t2d0     rootdg    online
c2t3d0   c2t3d0     rootdg    online
c2t4d0   c2t4d0     rootdg    online
c2t5d0   -          -         error
c2t5d0   -          -         error
c2t5d0   -          -         error
c2t5d0   -          -         error
c2t5d0   -          -         error
c5t1d0   c3t1d0-t0  orcl2-t3  online
c6t0d0   c5t0d0-t1  orcl2-t3  online

I restarted vxconfigd, and this is what the vxdisk list output looked like afterwards:

DEVICE   DISK       GROUP     STATUS
c1t0d0   c1t0d0     rootdg    online
c1t1d0   c1t1d0     rootdg    online
c1t2d0   c1t2d0     rootdg    online
c1t3d0   c1t3d0     rootdg    online
c1t4d0   c1t4d0     rootdg    online
c1t5d0   c1t5d0     rootdg    online
c2t0d0   c2t0d0     rootdg    online
c2t1d0   c2t1d0     rootdg    online failing
c2t2d0   c2t2d0     rootdg    online
c2t3d0   c2t3d0     rootdg    online
c2t4d0   c2t4d0     rootdg    online
c2t5d0   -          -         error
c2t5d0   -          -         offline
c2t5d0   -          -         offline
c2t5d0   -          -         offline
c2t5d0   -          -         error
c5t1d0   c3t1d0-t0  orcl2-t3  online
c6t0d0   c5t0d0-t1  orcl2-t3  online
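
For anyone who wants to check for this condition from a script, here is a minimal Python sketch that counts how many times each device shows up in a listing like the ones above. The function name and the trimmed sample are only illustrative; it assumes the device name is the first column, which holds for the output shown here, and it ignores every other column.

```python
from collections import Counter

# A few rows copied from the second listing in this post.
SAMPLE = """\
DEVICE DISK GROUP STATUS
c2t4d0 c2t4d0 rootdg online
c2t5d0 - - error
c2t5d0 - - offline
c2t5d0 - - offline
c2t5d0 - - offline
c2t5d0 - - error
c5t1d0 c3t1d0-t0 orcl2-t3 online
"""


def duplicate_devices(listing):
    """Return {device: count} for every device listed more than once."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "DEVICE":
            continue
        counts[fields[0]] += 1
    return {device: n for device, n in counts.items() if n > 1}


if __name__ == "__main__":
    print(duplicate_devices(SAMPLE))
```

A non-empty result means Volume Manager is holding more than one record for the same device, which is the symptom described in this post.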


What the heck am I doing wrong? I can't remove and replace a failed disk without this happening.


Thanks in advance for any help.
 
It's certainly very strange... I would raise a support case with Symantec in your situation.

Annihilannic.
 
100mbs, hi.
If this disk is SCSI, my guess is that the ID of the new disk you added is 7, the same as the controller's, so the two are conflicting, and that is why you see so many disks.
Try pulling the disk, checking its jumpers, and reinstalling it with a unique ID.

 
100mbs;

I have seen a similar issue when replacing an encapsulated root mirror. In the case I saw, a customer had a c0t0d0 drive go bad in a 280R, which has fibre drives. The customer's dump device was pointing to c0t0d0s1, since you can't point it to swapvol, and the customer started the replacement procedure using option 4 without changing the dump device to their c0t1d0s1 drive. The customer then ran the luxadm remove_device command on the drive with the -f flag, inserted the new drive with the luxadm insert_device command, and ran devfsadm and vxdctl enable. They then saw a duplicate entry in vxdisk list for the c0t0d0 device and were unable to perform option 5 to replace the bad disk. The only solution, per Sun, is to reboot. So you need to make sure there are no mounts on the device and that the device is not being used as the dump device.
If the customer had moved the dump device to c0t1d0s1 before executing any Veritas options or commands, they would not have had this problem.
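
As a rough illustration of that pre-check, the Python sketch below looks at dumpadm output and reports whether the dump device sits on the disk you are about to pull. The sample text is made up to match the 280R case above (the savecore directory in particular is a placeholder), and both function names are hypothetical.

```python
import re

# Illustrative dumpadm output; the dump device mirrors the case described above.
SAMPLE_DUMPADM = """\
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t0d0s1 (dedicated)
Savecore directory: /var/crash/host
  Savecore enabled: yes
"""


def dump_device(dumpadm_output):
    """Return the dump device path, or None if no such line is found."""
    match = re.search(r"^\s*Dump device:\s*(\S+)", dumpadm_output, re.MULTILINE)
    return match.group(1) if match else None


def dump_device_on_disk(dumpadm_output, disk):
    """True if the dump device is a slice of the given disk, e.g. 'c0t0d0'."""
    device = dump_device(dumpadm_output)
    if device is None:
        return False
    slice_name = device.rsplit("/", 1)[-1]  # e.g. 'c0t0d0s1'
    return re.fullmatch(re.escape(disk) + r"s\d+", slice_name) is not None


if __name__ == "__main__":
    print(dump_device_on_disk(SAMPLE_DUMPADM, "c0t0d0"))
```

If this reports True for the disk being replaced, move the dump device to the surviving mirror first, as described above.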

What really happens is that luxadm remove_device does not actually remove the /dev/dsk, /dev/rdsk, and /devices entries or the Veritas DMP entries for the bad drive, so when you insert a new device, Veritas can't correctly reference the WWN of the new drive.

Why don't you give us a little more background on your problem: what type of drive is it (fibre/SCSI), and what type of storage array is it in? Is the device RAID 5, mirrored, etc.? Do you use a hot spare pool for the disk group?


Thanks CA

 
I am running Solaris 8 on a V880. The drive in question is one of the internal drives. It is set up as RAID 5.

The system was running fine before the c2t5d0 device failed.

Here are the emails I got from the server when the drive failed.

1. Attempting to relocate subdisk c2t5d0-03 from plex u044-P11.
Dev_offset - length 31457280 dm_name c2t5d0 da_name c2t5d0s2.
The available plex u044-P12 will be used recover the data.


2. Failures have been detected by the VERITAS Volume Manager:

failed plexes:
oback-P14
u044-P11

failing disks:
c2t1d0


3. VERITAS Volume Manager is preparing to relocate for diskgroup rootdg.
Saving the current configuration in:
/etc/vx/saveconfig.d/rootdg.070404_102659.mpvsh

4. Relocation was not successful for subdisks on disk c2t5d0 in volume u044-L04 in disk group rootdg. No replacement was made and the disk is still unusable.

The following volumes have storage on c2t5d0:

u044-L04

These volumes are still usable, but the the redundancy of those volumes is reduced. Any RAID-5 volumes with storage on the failed disk may become unusable in the face of further failures.



 
100MBS;

After you do option 4, are you running luxadm remove_device c#t#d#, waiting for the prompt to remove the device, then running luxadm insert_device c#t#d# and waiting for the prompt to insert the device?

Then devfsadm, vxdctl enable, then option 5.

If you have a SunSolve contract, go to the V880 server page and look in the InfoDocs, which include a procedure for replacing an internal fibre drive under Veritas (doc ID 40842).
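
To make the order explicit, here is a small Python sketch that only prints the sequence described in this post; it does not run anything. The function name is hypothetical, the device argument is written the same way as in this thread, and the notes beside each step are my own summary, so treat the InfoDoc as the authority on exact syntax.

```python
def replacement_plan(device):
    """Return the ordered (command, note) steps for one disk swap."""
    return [
        ("vxdiskadm", "option 4, before touching the hardware"),
        (f"luxadm remove_device {device}", "wait for the prompt, then pull the disk"),
        (f"luxadm insert_device {device}", "wait for the prompt, then insert the new disk"),
        ("devfsadm", "rebuild the device entries"),
        ("vxdctl enable", "have Volume Manager rescan its disks"),
        ("vxdiskadm", "option 5, replace the removed disk"),
    ]


if __name__ == "__main__":
    # Print a numbered checklist for the failed disk from this thread.
    for number, (command, note) in enumerate(replacement_plan("c2t5d0"), start=1):
        print(f"{number}. {command:<30} # {note}")
```

The point of writing it down this way is that option 4 comes before the luxadm steps and option 5 comes last, after devfsadm and vxdctl enable.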

Thanks CA

 