On Tue, Sep 20, 2011 at 10:10 PM, Oualid Nouri <o.no...@computer-lan.de> wrote:
> Hi,
>
> I'm testing Pacemaker resource failover in a very simple test environment
> with two virtual machines: three cloned resources (dual-primary DRBD,
> controld, clvmd), plus fencing with external/ssh; that's it.
>
> I'm having problems understanding why my clvmd resource gets restarted
> when a failed node comes back online.
>
> When one node is powered off (failure test), the remaining node fences the
> "failing" node and the clvmd resource stays online. But when the failed
> node comes back online, the clvmd resource clone on the previously
> "remaining" node gets restarted for no visible reason (see logs).
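A good first step with questions like this is to ask the policy engine
directly why it scheduled the restart. A minimal sketch, assuming a build
that ships crm_simulate (older installs have ptest, which gives similar
output):

  # read the live CIB and show the planned actions plus allocation scores
  crm_simulate -L -s

That said, the first suspicious line in your logs is this one: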
Sep 20 13:18:41 tnode2 pengine: [3116]: notice: unpack_rsc_op: Operation res_drbd_1:1_monitor_0 found resource res_drbd_1:1 active on tnode1

When tnode1 came back online, the cluster's initial probe (the _monitor_0
operation above) found DRBD already running there, which means something
started it outside of Pacemaker. Do you have it configured to start at
boot time?
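If so, you'll want to make sure only the cluster manages it. A minimal
sketch, assuming a chkconfig-based distro (on Debian/Ubuntu, update-rc.d
is the equivalent):

  # show whether the drbd init script is enabled at boot
  chkconfig --list drbd

  # disable it, so that only the ocf:linbit:drbd resource agent
  # starts and stops DRBD
  chkconfig drbd off

The same applies to clvmd and controld: once they are managed as Pacemaker
resources, their init scripts should not run at boot.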
> I guess I'm doing something wrong! But what?
> Anyone who can point me in the right direction?
>
> Thank you!
>
> Sep 20 13:18:41 tnode2 crmd: [3121]: info: do_pe_invoke: Query 228: Requesting the current CIB: S_POLICY_ENGINE
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: unpack_rsc_op: Operation res_drbd_1:1_monitor_0 found resource res_drbd_1:1 active on tnode1
> Sep 20 13:18:41 tnode2 crmd: [3121]: info: do_pe_invoke_callback: Invoking the PE: query=228, ref=pe_calc-dc-1316517521-176, seq=1268, quorate=1
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: unpack_rsc_op: Operation res_drbd_1:0_monitor_0 found resource res_drbd_1:0 active on tnode2
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: clone_print: Master/Slave Set: ms_drbd_1 [res_drbd_1]
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: short_print: Masters: [ tnode2 ]
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: short_print: Slaves: [ tnode1 ]
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: clone_print: Clone Set: cl_controld_1 [res_controld_dlm]
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: short_print: Started: [ tnode2 ]
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: short_print: Stopped: [ res_controld_dlm:1 ]
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: native_print: stonith_external_ssh_1#011(stonith:external/ssh):#011Started tnode1
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: native_print: stonith_external_ssh_2#011(stonith:external/ssh):#011Started tnode2
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: clone_print: Clone Set: cl_clvmd_1 [res_clvmd_clustervg]
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: short_print: Started: [ tnode2 ]
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: short_print: Stopped: [ res_clvmd_clustervg:1 ]
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: RecurringOp: Start recurring monitor (60s) for res_controld_dlm:1 on tnode1
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: LogActions: Leave res_drbd_1:0#011(Master tnode2)
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: LogActions: Promote res_drbd_1:1#011(Slave -> Master tnode1)
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: LogActions: Leave res_controld_dlm:0#011(Started tnode2)
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: LogActions: Start res_controld_dlm:1#011(tnode1)
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: LogActions: Leave stonith_external_ssh_1#011(Started tnode1)
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: LogActions: Leave stonith_external_ssh_2#011(Started tnode2)
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: LogActions: Restart res_clvmd_clustervg:0#011(Started tnode2)
> Sep 20 13:18:41 tnode2 pengine: [3116]: notice: LogActions: Start res_clvmd_clustervg:1#011(tnode1)
>
> CONFIG
>
> node tnode1 \
>         attributes standby="off"
> node tnode2 \
>         attributes standby="off"
> primitive res_clvmd_clustervg ocf:lvm2:clvmd \
>         params daemon_timeout="30" \
>         operations $id="res_clvmd_clustervg-operations" \
>         op monitor interval="0" timeout="4min" start-delay="5"
> primitive res_controld_dlm ocf:pacemaker:controld \
>         operations $id="res_controld_dlm-operations" \
>         op monitor interval="60" timeout="60" start-delay="0" \
>         meta target-role="started"
> primitive res_drbd_1 ocf:linbit:drbd \
>         params drbd_resource="r0" \
>         operations $id="res_drbd_1-operations" \
>         op start interval="0" timeout="240" \
>         op promote interval="0" timeout="90" \
>         op demote interval="0" timeout="90" \
>         op stop interval="0" timeout="100" \
>         op monitor interval="10" timeout="20" start-delay="1min" \
>         op notify interval="0" timeout="90" \
>         meta target-role="started" is-managed="true"
> primitive stonith_external_ssh_1 stonith:external/ssh \
>         params hostlist="tnode2" \
>         operations $id="stonith_external_ssh_1-operations" \
>         op start interval="0" timeout="60" \
>         op stop interval="0" timeout="60" \
>         op monitor interval="60" timeout="60" start-delay="0" \
>         meta failure-timeout="3"
> primitive stonith_external_ssh_2 stonith:external/ssh \
>         params hostlist="tnode1" \
>         operations $id="stonith_external_ssh_2-operations" \
>         op start interval="0" timeout="60" \
>         op stop interval="0" timeout="60" \
>         op monitor interval="60" timeout="60" start-delay="0" \
>         meta target-role="started" failure-timeout="3"
> ms ms_drbd_1 res_drbd_1 \
>         meta master-max="2" clone-max="2" notify="true" ordered="true" interleave="true"
> clone cl_clvmd_1 res_clvmd_clustervg \
>         meta clone-max="2" notify="true"
> clone cl_controld_1 res_controld_dlm \
>         meta clone-max="2" notify="true" ordered="true" interleave="true"
> location loc_ms_drbd_1-ping-prefer ms_drbd_1 \
>         rule $id="loc_ms_drbd_1-ping-prefer-rule" pingd: defined pingd
> location loc_stonith_external_ssh_1_tnode2 stonith_external_ssh_1 -inf: tnode2
> location loc_stonith_external_ssh_2_tnode1 stonith_external_ssh_2 -inf: tnode1
> colocation col_cl_controld_1_cl_clvmd_1 inf: cl_clvmd_1 cl_controld_1
> colocation col_ms_drbd_1_cl_controld_1 inf: cl_controld_1 ms_drbd_1:Master
> order ord_cl_controld_1_cl_clvmd_1 inf: cl_controld_1 cl_clvmd_1
> order ord_ms_drbd_1_cl_controld_1 inf: ms_drbd_1:promote cl_controld_1:start
> property $id="cib-bootstrap-options" \
>         expected-quorum-votes="2" \
>         stonith-timeout="30" \
>         dc-version="1.1.5-ecb6baaf7fc091b023d6d4ba7e0fce26d32cf5c8" \
>         no-quorum-policy="ignore" \
>         cluster-infrastructure="openais" \

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker