On Thu, Mar 3, 2011 at 7:05 AM, AP <pacema...@inml.weebeastie.net> wrote:
> Hi,
>
> Having deep issues with my cluster setup. Everything works ok until
> I add a VirtualDomain RA in. Then things go pearshaped in that it seems
> to ignore the "order" crm config for it and starts as soon as it can.
>
> The crm config is provided below. Basically p-vd_vg.test1 attempts to
> start despite p-libvirtd not being started and p-drbd_vg.test1 not
> being master (or slave for that matter - ie it's not configured at all).
>
> Eventually p-libvirtd and p-drbd_vg.test1 start and p-vd_vg.test1 attempts
> to, pengine on the node where p-vd_vg.test1 is already running complains
> with:
>
> Mar 3 16:49:16 breadnut pengine: [2097]: ERROR: native_create_actions:
> Resource p-vd_vg.test1 (ocf::VirtualDomain) is active on 2 nodes attempting
> recovery
> Mar 3 16:49:16 breadnut pengine: [2097]: WARN: See
> http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information.
Well... did you read that link?

> Then mass slaughter occurs and p-vd_vg.test1 is restarted where it was
> running previously whilst the other node gets an error for it.
>
> Essentially I cannot restart the 2nd node without it breaking the 1st.
>
> Now, as I understand it, a lone primitive will run once on any node - this
> is just fine by me.
>
> colo-vd_vg.test1 indicates that p-vd_vg.test1 should run where
> ms-drbd_vg.test1 is master. ms-drbd_vg.test1 should only be master where
> clone-libvirtd is started.
>
> order-vg.test1 indicates that ms-drbd_vg.test1 should start after clone-lvm_gh
> is started (successfully). (This used to have a promote for ms-drbd_vg.test1
> but then ms-drbd_vg.test1 would be demoted and not stopped on shutdown which
> would cause clone-lvm_gh to error out on stop)
>
> order-vd_vg.test1 indicates p-vd_vg.test1 should only start where
> ms-drbd_vg.test1 and clone-libvirtd have both successfully started (the
> order of their starting being irrelevant).
>
> cli-standby-p-vd_vg.test1 was put there by my migrating p-vd_vg.test1
> about the place.
>
> This happens with or without fencing and with fencing configured as below
> or as just a single primitive with both nodes in the hostlist.
>
> Help with this would be awesome and appreciated. I do not know what I am
> missing here. The config makes sense to me so I don't even know where
> to start poking and prodding. I be flailing.
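
For reference, the dependencies described above could also be written as
pairwise constraints instead of three-resource sets. This is only a sketch,
assuming the intent is that p-vd_vg.test1 runs where ms-drbd_vg.test1 is
Master and clone-libvirtd is started, and starts only after both. The
constraint IDs are made up for illustration, and ordering on promote rather
than start is an assumption, not something taken from the posted config:

```
# Hypothetical IDs; resource names are the ones from the posted config.
# Placement: the VM runs only where DRBD is Master and libvirtd is running.
colocation colo-vd-with-drbd inf: p-vd_vg.test1 ms-drbd_vg.test1:Master
colocation colo-vd-with-libvirtd inf: p-vd_vg.test1 clone-libvirtd
# Ordering: the VM starts only after DRBD is promoted and libvirtd is started.
order order-drbd-then-vd inf: ms-drbd_vg.test1:promote p-vd_vg.test1:start
order order-libvirtd-then-vd inf: clone-libvirtd:start p-vd_vg.test1:start
```

Pairwise constraints are easier to read back than sets, since each line
states one dependency. Whether this has any bearing on the "active on 2
nodes" error is a separate question: that message says the cluster found
the resource already running on more than one node.
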
>
> Config and s/w version list is below:
>
> OS: Debian Squeeze
> Kernel: 2.6.37.2
>
> PACKAGES:
>
> ii  cluster-agents    1:1.0.4-0ubuntu1~custom1     The reusable cluster components for Linux HA
> ii  cluster-glue      1.0.7-3ubuntu1~custom1       The reusable cluster components for Linux HA
> ii  corosync          1.3.0-1ubuntu1~custom1       Standards-based cluster framework (daemon and modules)
> ii  libccs3           3.1.0-0ubuntu1~custom1       Red Hat cluster suite - cluster configuration libraries
> ii  libcib1           1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - CIB
> ii  libcman3          3.1.0-0ubuntu1~custom1       Red Hat cluster suite - cluster manager libraries
> ii  libcorosync4      1.3.0-1ubuntu1~custom1       Standards-based cluster framework (libraries)
> ii  libcrmcluster1    1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - CRM
> ii  libcrmcommon2     1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - common CRM
> ii  libfence4         3.1.0-0ubuntu1~custom1       Red Hat cluster suite - fence client library
> ii  liblrm2           1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- liblrm2
> ii  libpe-rules2      1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - rules for P-Engine
> ii  libpe-status3     1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - status for P-Engine
> ii  libpengine3       1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - P-Engine
> ii  libpils2          1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- libpils2
> ii  libplumb2         1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- libplumb2
> ii  libplumbgpl2      1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- libplumbgpl2
> ii  libstonith1       1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- libstonith1
> ii  libstonithd1      1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - stonith
> ii  libtransitioner1  1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - transitioner
> ii  pacemaker         1.1.5-0ubuntu1~ppa1~custom1  HA cluster resource manager
>
> CONFIG:
>
> node breadnut
> node breadnut2 \
>         attributes standby="off"
> primitive fencing-bn stonith:meatware \
>         params hostlist="breadnut" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="70s" \
>         op monitor interval="10" timeout="60s"
> primitive fencing-bn2 stonith:meatware \
>         params hostlist="breadnut2" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="70s" \
>         op monitor interval="10" timeout="60s"
> primitive p-drbd_vg.test1 ocf:linbit:drbd \
>         params drbd_resource="vg.test1" \
>         operations $id="ops-drbd_vg.test1" \
>         op start interval="0" timeout="240s" \
>         op stop interval="0" timeout="100s" \
>         op monitor interval="20" role="Master" timeout="20s" \
>         op monitor interval="30" role="Slave" timeout="20s"
> primitive p-libvirtd ocf:local:libvirtd \
>         meta allow-migrate="off" \
>         op start interval="0" timeout="200s" \
>         op stop interval="0" timeout="100s" \
>         op monitor interval="10" timeout="200s"
> primitive p-lvm_gh ocf:heartbeat:LVM \
>         params volgrpname="gh" \
>         meta allow-migrate="off" \
>         op start interval="0" timeout="90s" \
>         op stop interval="0" timeout="100s" \
>         op monitor interval="10" timeout="100s"
> primitive p-vd_vg.test1 ocf:heartbeat:VirtualDomain \
>         params config="/etc/libvirt/qemu/vg.test1.xml" \
>         params migration_transport="tcp" \
>         meta allow-migrate="true" is-managed="true" \
>         op start interval="0" timeout="120s" \
>         op stop interval="0" timeout="120s" \
>         op migrate_to interval="0" timeout="120s" \
>         op migrate_from interval="0" timeout="120s" \
>         op monitor interval="10s" timeout="120s"
> ms ms-drbd_vg.test1 p-drbd_vg.test1 \
>         meta resource-stickines="100" notify="true" master-max="2" target-role="Master"
> clone clone-libvirtd p-libvirtd \
>         meta interleave="true"
> clone clone-lvm_gh p-lvm_gh \
>         meta interleave="true"
> location cli-standby-p-vd_vg.test1 p-vd_vg.test1 \
>         rule $id="cli-standby-rule-p-vd_vg.test1" -inf: #uname eq breadnut2
> location loc-fencing-bn fencing-bn -inf: breadnut
> location loc-fencing-bn2 fencing-bn2 -inf: breadnut2
> colocation colo-vd_vg.test1 inf: p-vd_vg.test1:Started ms-drbd_vg.test1:Master clone-libvirtd:Started
> order order-vd_vg.test1 inf: ( ms-drbd_vg.test1:start clone-libvirtd:start ) p-vd_vg.test1:start
> order order-vg.test1 inf: clone-lvm_gh:start ms-drbd_vg.test1:start
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>         cluster-infrastructure="openais" \
>         default-resource-stickiness="1000" \
>         stonith-enabled="true" \
>         expected-quorum-votes="2" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1299128317"
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker