vxvm issue 1

samurai123 · Dec 5, 2008

We are getting the following errors while trying to mirror a volume:

VxVM vxassist ERROR V-5-1-10127 changing volume 10
Operation requires transaction

vxvm:vxconfigd: [ID 655802 daemon.warning]

V-5-1-11550 Timed out transaction for client 18079. Try setting the environment variable VXVM_TRANS_MAX_TIMEOUT (600) to a higher value and restart vxconfigd.

Any idea?

Annihilannic · Dec 7, 2008

Have you tried restarting vxconfigd yet? Try this command:

Code:

vxconfigd -k -m enable

It shouldn't affect the availability of any volumes; the only things you can't do while vxconfigd is not running are make any VxVM changes or run things like vxprint.

Annihilannic.

samurai123 · Dec 7, 2008

Thank you Annihilannic for your reply.

We were able to mirror the volume when we attempted the operation again after a few minutes, without restarting vxconfigd.
But we faced the same problem when we tried to mirror another volume in the same DG, which is used for an Oracle DB.

The environment is Veritas SF 4.1 MP1. It seems that the mirror operation does some prep work before it starts the actual mirroring. Maybe because the volume is in use, this prep work takes longer than the default timeout of 600 seconds, which is defined by the variable VXVM_TRANS_MAX_TIMEOUT.

We don't see any reference to this variable in the documentation. We are not sure where and how to define this so-called environment variable. Is it safe to restart vxconfigd the way you suggested in a production environment?

Setting the variable to more than 600 seconds would probably avoid the error, but it's still a workaround.

We are wondering if there is a fix for this problem.
We don't see this issue being resolved in MP2. Maybe it's not an issue, but we don't see any reference made by Veritas to this variable. Strange, isn't it?

Annihilannic · Dec 7, 2008

What command(s) were you using to mirror the volume? Does it have DCO and/or DRL logs attached?

There shouldn't be a problem restarting vxconfigd, even in a production environment. The only thing you should be aware of is that if you have some kind of monitoring that depends on vxprint or vxdisk list commands, it could time out while vxconfigd is being restarted, as the restart can take a long time on a system with a large number of disks, or disks that are slow to respond for some reason. This is especially important if you are running clustering software, which may think something has failed due to the slow response, in which case you may want to stop or freeze the cluster software while you are doing the restart.

It seems ridiculous that it would require more than 600 seconds to "prepare" to mirror a volume, so there must be something wrong somewhere. It's not that strange that it's not documented... in versions 4 and higher of VxVM it has become so complex and so bug-ridden that there are all sorts of hidden/undocumented fixes and workarounds for stuff like this... what was once a great product has really gone down in my estimation in recent years.

I'm not sure where the best place is to set that variable permanently... but if you just want to try it temporarily you should be able to do something like:

Code:

VXVM_TRANS_MAX_TIMEOUT=1200 vxconfigd -k -m enable
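The VAR=value prefix used above puts the variable into the environment of that one command only, so the restarted vxconfigd inherits it but your login shell does not. A minimal sketch of that scoping, with a throwaway sh -c child standing in for vxconfigd (the stand-in is purely illustrative, not anything VxVM-specific):

```shell
#!/bin/sh
# The prefix assignment is visible only to the command it precedes.
# "sh -c ..." is a stand-in for vxconfigd here, used only for illustration.
child_value=$(VXVM_TRANS_MAX_TIMEOUT=1200 sh -c 'printf %s "$VXVM_TRANS_MAX_TIMEOUT"')

# Back in the calling shell, the variable is untouched by the line above.
parent_value=${VXVM_TRANS_MAX_TIMEOUT:-unset}

echo "child saw:  $child_value"
echo "parent has: $parent_value"
```

This also means the setting is lost if vxconfigd is later restarted without the prefix, for example at the next reboot.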

Annihilannic.

samurai123 · Dec 7, 2008

The command used was
# vxassist -g <DG> mirror <volume> layout=stripe ncol=15 <sd-names...>

There is no DCO or DRL attached to the volumes.

We did freeze the cluster service groups on the node to avoid an unwanted failover, even just for mirroring the volumes, and we're glad we did. The VCS monitoring agents (perl scripts) do use vxprint and vxdisk list commands, so you are right there.

We do suspect something is holding up the "prep" work. Maybe something takes precedence over it, but we don't know how to pinpoint the cause. Maybe the IO load, or something on the storage (EMC) side?

Annihilannic · Dec 8, 2008

Well, if you really need to find out you can put vxconfigd in debug mode using vxdctl debug n /path/to/logfile to figure out where exactly it is spending all that time. n is a number from 0 to 9, 0 turns debug output off, 1 to 9 are increasing levels of information. Larger numbers generate massive amounts of output and slow vxconfigd down considerably. The output can be quite cryptic and you may need Symantec/Veritas to interpret it for you anyway.

Another thing to try would be to keep an eye on vxconfigd in top or prstat while the mirror operation is performed, and also capture its CPU time consumed before and after the operation to see if the time is being spent there or somewhere else (the kernel?).
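The before/after CPU time comparison could be scripted roughly as below. This is only a sketch under some assumptions: a POSIX-style shell (ksh or similar rather than the old Bourne /bin/sh), a ps that accepts -o time= -p, and an awk that accepts a regular expression for -F (nawk on older systems). The function names to_seconds and cpu_delta are made up for this example.

```shell
#!/bin/sh
# Convert a ps TIME value of the form [[dd-]hh:]mm:ss to whole seconds.
to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        split("1 60 3600 86400", mult, " ")   # seconds, minutes, hours, days
        total = 0
        # Walk the fields right to left so the last field is always seconds.
        for (i = NF; i >= 1; i--) total += $i * mult[NF - i + 1]
        print total
    }'
}

# Run a command and print how many CPU seconds the given PID consumed
# while that command was running.
# Usage: cpu_delta PID command [args...]
cpu_delta() {
    pid=$1; shift
    before=$(to_seconds "$(ps -o time= -p "$pid")")
    "$@"
    after=$(to_seconds "$(ps -o time= -p "$pid")")
    echo $((after - before))
}
```

You would call it as cpu_delta followed by the PID of vxconfigd and then the same vxassist mirror command as before. If the number printed is small compared with the wall-clock time the mirror took, vxconfigd itself was mostly waiting rather than computing. Note that ps reports CPU time in whole seconds, so short operations will show as 0.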

Annihilannic.
