Re: [Pacemaker] problem with VM in pacemaker cluster

Yuriy Demchenko Thu, 11 Apr 2013 03:00:26 -0700

Solved my problem

The first error was in the constraint: I had put the constraint on the "cxml" resource alone, not on the cloned "cxml-clone". That is why "cxml" was moved first on the "standby" command. After redefining the constraint to "cxml-clone then testVM", putting the active node in standby went smoothly: the VM moved correctly, with no errors.

The second problem was caused by the "libvirt-guests" service, which was suspending my VMs on the host reboot command. The "chkconfig libvirt-guests off" command isn't enough, as it leaves "K01libvirt-guests" symlinks in /etc/rc.d/rcX.d. Removing those symlinks from rc3.d and rc6.d solved the problem: the reboot process now starts with the pacemaker shutdown, and resources move correctly to the other nodes.
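For anyone hitting the same thing, here is a minimal sketch of how the leftover runlevel links could be listed before removing them by hand. The function name list_rc_links and its optional second argument are hypothetical, made up for illustration; it only reads the directories and deletes nothing. The pcs line in the trailing comment is my understanding of the ordering syntax and should be checked against the pcs man page for your version.

```shell
#!/bin/sh
# Hypothetical helper: list SysV init symlinks for a service that are
# still present in the runlevel directories.
#   $1 = service name
#   $2 = rc.d root (optional, defaults to /etc/rc.d)
list_rc_links() {
    service=$1
    root=${2:-/etc/rc.d}
    # Match both start (S??name) and kill (K??name) links in rc0.d..rc6.d.
    find "$root"/rc[0-6].d -maxdepth 1 -type l \
        \( -name "S[0-9][0-9]$service" -o -name "K[0-9][0-9]$service" \) \
        2>/dev/null | sort
}

# Read-only example, prints one path per leftover link:
#   list_rc_links libvirt-guests
#
# Ordering constraint on the clone rather than the primitive
# (assumed pcs syntax, verify with the pcs man page):
#   pcs constraint order cxml-clone then testVM
```

If the function prints nothing after "chkconfig libvirt-guests off", no links are left in those directories.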


Yuriy Demchenko

On 04/10/2013 02:59 PM, Yuriy Demchenko wrote:

Hi,
I've set up a 3-node cluster (2 active nodes + 1 standby for quorum) with cman + pacemaker.

Resources: "cxml-clone", a gfs2 filesystem (cloned, runs on both nodes), and "testVM" via heartbeat:VirtualDomain (domain xml located on the gfs2 fs, cLVM disk backend). I set up a constraint: "cxml-clone" starts first, then "testVM" (symmetrical; according to the description, they'll be stopped in reverse order).

Manual migration of the VM runs fine (pcs resource move testVM node-2/node-1): live migration succeeds and the VM runs uninterrupted. But when I try to reboot the node running the VM, or put it in standby, everything crashes: the migration fails and the node is fenced.

From the logs I can see that resource "cxml" is stopped first (or simultaneously, at least not waiting for the VM migration to complete), and then the migration fails because the xml is not available.

Apr 10 14:03:20 node-2 lrmd[2679]: notice: operation_finished: cxml_stop_0:3282 [ 2013/04/10_14:03:20 INFO: Running stop for /dev/cstore/cxml on /mnt ]
Apr 10 14:03:20 node-2 lrmd[2679]: notice: operation_finished: cxml_stop_0:3282 [ 2013/04/10_14:03:20 INFO: Trying to unmount /mnt ]
Apr 10 14:03:20 node-2 lrmd[2679]: notice: operation_finished: cxml_stop_0:3282 [ 2013/04/10_14:03:20 INFO: unmounted /mnt successfully ]
Apr 10 14:03:20 node-2 crmd[2682]: notice: process_lrm_event: LRM operation cxml_stop_0 (call=77, rc=0, cib-update=37, confirmed=true) ok
Apr 10 14:03:21 node-2 lrmd[2679]: notice: operation_finished: testVM_migrate_to_0:3281 [ 2013/04/10_14:03:20 INFO: testvm: Starting live migration to node-1 (using remote hypervisor URI qemu+ssh://node-1/system ). ]
Apr 10 14:03:21 node-2 lrmd[2679]: notice: operation_finished: testVM_migrate_to_0:3281 [ error: Requested operation is not valid: domain is already active as 'testvm' ]
Apr 10 14:03:21 node-2 lrmd[2679]: notice: operation_finished: testVM_migrate_to_0:3281 [ 2013/04/10_14:03:21 ERROR: testvm: live migration to qemu+ssh://node-1/system failed: 1 ]
Apr 10 14:03:21 node-2 crmd[2682]: notice: process_lrm_event: LRM operation testVM_migrate_to_0 (call=75, rc=1, cib-update=38, confirmed=true) unknown error
Apr 10 14:03:21 node-2 lrmd[2679]: notice: operation_finished: testVM_stop_0:3392 [ 2013/04/10_14:03:21 ERROR: Configuration file /mnt/testvm.xml does not exist or is not readable. ]

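To make the order of operations easier to see in a longer log, a small filter like the following could be used. This is only a sketch: the function name lrm_op_order is hypothetical, and it assumes syslog-style lines in which crmd reports "process_lrm_event: LRM operation <name> (call=N, rc=N, ...)" as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: read syslog-style Pacemaker lines on stdin and print
# "<time> <operation> <rc>" for each crmd process_lrm_event entry.
lrm_op_order() {
    awk '/process_lrm_event:/ {
        for (i = 1; i < NF; i++) {
            if ($i == "operation") {
                rc = $(i + 3)        # field looks like "rc=0,"
                sub(/,$/, "", rc)    # drop the trailing comma
                print $3, $(i + 1), rc
                break
            }
        }
    }'
}

# Example (the log path is an assumption, adjust to your syslog setup):
#   lrm_op_order < /var/log/messages
```

For the two crmd entries quoted above, this would show cxml_stop_0 completing with rc=0 one second before testVM_migrate_to_0 fails with rc=1.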
But wtf?! I've set up the constraint, so "testVM" should be stopped/moved first, not "cxml".
What is wrong with my configuration, am I missing something?

Logs and CIB are attached.



_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
