Hi All,

I am using a simple two-node cluster running Xen on top of DRBD in primary/primary mode (necessary for live migration). My configuration is quite simple:

primitive appyul1 ocf:heartbeat:Xen \
        params xmfile="/etc/xen/appyul1.cfg" shutdown_timeout="299" \
        op monitor interval="10s" timeout="300s" \
        op start interval="0s" timeout="180s" \
        op stop interval="0s" timeout="300s" \
        op migrate_from interval="0s" timeout="180s" \
        op migrate_to interval="0s" timeout="180s" \
        meta target-role="Started" allow-migrate="true" is-managed="true"
primitive appyul1slash-DRBD ocf:linbit:drbd \
        params drbd_resource="appyul1slash" \
        operations $id="appyul1slash-DRBD-ops" \
        op monitor interval="20s" role="Master" timeout="300s" \
        op monitor interval="30s" role="Slave" timeout="300s"
primitive appyul1swap-DRBD ocf:linbit:drbd \
        params drbd_resource="appyul1swap" \
        operations $id="appyul1swap-DRBD-ops" \
        op monitor interval="20s" role="Master" timeout="300s" \
        op monitor interval="30s" role="Slave" timeout="300s"
ms appyul1slash-MS appyul1slash-DRBD \
meta master-max="2" notify="true" interleave="true" target-role="Started" is-managed="true"
ms appyul1swap-MS appyul1swap-DRBD \
meta master-max="2" notify="true" interleave="true" target-role="Started" is-managed="true"
order appyul1-after-drbd inf: appyul1slash-MS:promote appyul1swap-MS:promote appyul1:start

So to summarize:
- A resource for Xen
- Two master/slave DRBD resources for the VM filesystems (/ and swap); master-max is set to 2 so that both nodes are in the DRBD primary state.
- An "order" constraint to start the VM only after DRBD has been promoted.

Node startup is ok, the VM is started after DRBD is promoted.

Node shutdown is problematic. Assuming the Xen VM runs on node A:
- When putting node A in standby while node B is active, a live migration is started, BUT in the same second pacemaker tries to demote the DRBD volumes on A (while the live migration is still in progress).
- When putting node A in standby while node B is also in standby, the VM is stopped, BUT in the same second pacemaker tries to demote the DRBD volumes on A (while the shutdown is still in progress).

All this results in "failed actions" in the CRM and causes unwanted STONITH actions (when enabled). I tried adding "symmetrical=false" to the order constraint, but it did not help.
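For reference, here is what the order constraint looked like with that attribute added (crm configure syntax; symmetrical=false is the only change from the constraint above, and it is my reading of the docs that this should suppress the reversed stop/demote ordering):

```
order appyul1-after-drbd inf: appyul1slash-MS:promote appyul1swap-MS:promote appyul1:start symmetrical=false
```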

I do not understand why pacemaker does not wait until the Xen VM is stopped/migrated before demoting the DRBD volumes.

Setup is done with corosync and pacemaker packages available on a standard Ubuntu Lucid (corosync 1.2.0 and pacemaker 1.0.8).

Thanks for your help,

Pierre

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
