Hi guys, I'm facing a weird problem, and I'm not sure if anyone else has seen this.

Basically I have a pair, and when I do a hard shutdown of the primary (e.g. "ipmitool chassis power off"), the secondary just sits as it is and DRBD does not become master on that box.

Some info about my environment:

Pacemaker: 1.1.2
DRBD: 8.3.8.1
Corosync: 1.2.8

When I look at the status section, I see that despite the hard shutdown the node status is not updated properly by lrmd. Below are just the node_state and transient_attributes pieces of the status section of the CIB:

<node_state uname="C725.elab.itactics.com" ha="active" in_ccm="true" crmd="online" expected="member" shutdown="0" join="member" id="C725.elab.itactics.com" crm-debug-origin="do_update_resource">
  <transient_attributes id="C725.elab.itactics.com">
    <instance_attributes id="status-C725.elab.itactics.com">
      <nvpair id="status-C725.elab.itactics.com-probe_complete" name="probe_complete" value="true"/>
      <nvpair name="master-drbd0:0" id="status-C725.elab.itactics.com-master-drbd0:0" value="10000"/>
      <nvpair id="status-C725.elab.itactics.com-pingd" name="pingd" value="1000"/>
    </instance_attributes>
  </transient_attributes>

<node_state uname="C726.elab.itactics.com" crmd="online" ha="active" in_ccm="false" join="pending" expected="member" shutdown="0" id="C726.elab.itactics.com" crm-debug-origin="do_state_transition">
  <transient_attributes id="C726.elab.itactics.com">
    <instance_attributes id="status-C726.elab.itactics.com">
      <nvpair id="status-C726.elab.itactics.com-probe_complete" name="probe_complete" value="true"/>
      <nvpair name="master-drbd0:1" id="status-C726.elab.itactics.com-master-drbd0:1" value="10000"/>
      <nvpair id="status-C726.elab.itactics.com-pingd" name="pingd" value="1000"/>
    </instance_attributes>
  </transient_attributes>

crm_mon output:

============
Last updated: Thu Jun 9 11:40:52 2011
Stack: openais
Current DC: C725.elab.itactics.com - partition WITHOUT quorum
Version: 1.1.2-e0d731c2b1be446b27a73327a53067bf6230fb6a
2 Nodes configured, 2 expected votes
7 Resources configured.
============

Node C726.elab.itactics.com: UNCLEAN (offline)
Online: [ C725.elab.itactics.com ]

 Clone Set: connectivity [ping]
     Started: [ C725.elab.itactics.com ]
     Stopped: [ ping:0 ]
 Master/Slave Set: ms-drbd [drbd0]
     Slaves: [ C725.elab.itactics.com ]
     Stopped: [ drbd0:0 ]
 C726.elab.itactics.com-stonith (stonith:external/safe/ipmi): Started C725.elab.itactics.com

So for C726, even though in_ccm="false", the rest of its node_state looks just as if the node were still online (crmd="online", ha="active", expected="member"). Why has crmd not been able to update this information properly?

I'm assuming this is the reason the secondary remains slave and never gets promoted to master, because there is nothing in the transient attributes section that would prevent it from becoming master.

Thanks
Shravan
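
P.S. For anyone who wants to check their own status section for the same mismatch, here is a minimal Python sketch. It is only an illustration under stated assumptions: the helper name inconsistent_nodes and the <status> wrapper element are made up for this sketch, the XML is a trimmed copy of the two node_state entries quoted above, and the rule it applies (in_ccm="false" while crmd="online") is simply the symptom described in this mail, not an official Pacemaker health check.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the two node_state entries quoted in this mail, wrapped in a
# <status> element (an assumption of this sketch) so it parses as one document.
STATUS_XML = """
<status>
  <node_state uname="C725.elab.itactics.com" ha="active" in_ccm="true"
              crmd="online" expected="member" join="member"/>
  <node_state uname="C726.elab.itactics.com" ha="active" in_ccm="false"
              crmd="online" expected="member" join="pending"/>
</status>
"""


def inconsistent_nodes(status_xml):
    """Return unames of nodes recorded as out of the membership
    (in_ccm="false") while crmd is still recorded as "online"."""
    root = ET.fromstring(status_xml)
    return [
        node.get("uname")
        for node in root.iter("node_state")
        if node.get("in_ccm") == "false" and node.get("crmd") == "online"
    ]


if __name__ == "__main__":
    for uname in inconsistent_nodes(STATUS_XML):
        print(uname)
```

Run against the excerpt above, this flags only the node whose membership flag and crmd state disagree; a node that is consistently online or consistently offline is not reported.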