Hello, I think this constraint is wrong:

======================================
colocation c_drbd_libvirt_vm inf: ms_drbd_vmstore:Master ms_drbd_mount1:Master ms_drbd_mount2:Master g_vm
======================================

it must be like this:

======================================
colocation c_drbd_libvirt_vm inf: g_vm ms_drbd_vmstore:Master ms_drbd_mount1:Master ms_drbd_mount2:Master
======================================
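If you want to swap it in place, a minimal sketch (untested, assuming the same crm shell that produced the configuration quoted below) could be:

======================================
# at the crm(live)configure# prompt: remove and re-add the constraint, commit both changes together
crm configure
  delete c_drbd_libvirt_vm
  colocation c_drbd_libvirt_vm inf: g_vm ms_drbd_vmstore:Master ms_drbd_mount1:Master ms_drbd_mount2:Master
  verify
  commit
======================================

Afterwards "crm configure show c_drbd_libvirt_vm" can be used to double-check the result.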

On 24 March 2012 at 20:15, Andrew Martin <amar...@xes-inc.com> wrote:

> Hi Andreas,
>
> My complete cluster configuration is as follows:
> ============
> Last updated: Sat Mar 24 13:51:55 2012
> Last change: Sat Mar 24 13:41:55 2012
> Stack: Heartbeat
> Current DC: node2 (9100538b-7a1f-41fd-9c1a-c6b4b1c32b18) - partition with quorum
> Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
> 3 Nodes configured, unknown expected votes
> 19 Resources configured.
> ============
>
> Node quorumnode (c4bf25d7-a6b7-4863-984d-aafd937c0da4): OFFLINE (standby)
> Online: [ node2 node1 ]
>
> Master/Slave Set: ms_drbd_vmstore [p_drbd_vmstore]
>     Masters: [ node2 ]
>     Slaves: [ node1 ]
> Master/Slave Set: ms_drbd_mount1 [p_drbd_mount1]
>     Masters: [ node2 ]
>     Slaves: [ node1 ]
> Master/Slave Set: ms_drbd_mount2 [p_drbd_mount2]
>     Masters: [ node2 ]
>     Slaves: [ node1 ]
> Resource Group: g_vm
>     p_fs_vmstore (ocf::heartbeat:Filesystem): Started node2
>     p_vm (ocf::heartbeat:VirtualDomain): Started node2
> Clone Set: cl_daemons [g_daemons]
>     Started: [ node2 node1 ]
>     Stopped: [ g_daemons:2 ]
> Clone Set: cl_sysadmin_notify [p_sysadmin_notify]
>     Started: [ node2 node1 ]
>     Stopped: [ p_sysadmin_notify:2 ]
> stonith-node1 (stonith:external/tripplitepdu): Started node2
> stonith-node2 (stonith:external/tripplitepdu): Started node1
> Clone Set: cl_ping [p_ping]
>     Started: [ node2 node1 ]
>     Stopped: [ p_ping:2 ]
>
> node $id="6553a515-273e-42fe-ab9e-00f74bd582c3" node1 \
>     attributes standby="off"
> node $id="9100538b-7a1f-41fd-9c1a-c6b4b1c32b18" node2 \
>     attributes standby="off"
> node $id="c4bf25d7-a6b7-4863-984d-aafd937c0da4" quorumnode \
>     attributes standby="on"
> primitive p_drbd_mount2 ocf:linbit:drbd \
>     params drbd_resource="mount2" \
>     op monitor interval="15" role="Master" \
>     op monitor interval="30" role="Slave"
> primitive p_drbd_mount1 ocf:linbit:drbd \
>     params drbd_resource="mount1" \
>     op monitor interval="15" role="Master" \
>     op monitor interval="30" role="Slave"
> primitive p_drbd_vmstore ocf:linbit:drbd \
>     params drbd_resource="vmstore" \
>     op monitor interval="15" role="Master" \
>     op monitor interval="30" role="Slave"
> primitive p_fs_vmstore ocf:heartbeat:Filesystem \
>     params device="/dev/drbd0" directory="/vmstore" fstype="ext4" \
>     op start interval="0" timeout="60s" \
>     op stop interval="0" timeout="60s" \
>     op monitor interval="20s" timeout="40s"
> primitive p_libvirt-bin upstart:libvirt-bin \
>     op monitor interval="30"
> primitive p_ping ocf:pacemaker:ping \
>     params name="p_ping" host_list="192.168.1.10 192.168.1.11" multiplier="1000" \
>     op monitor interval="20s"
> primitive p_sysadmin_notify ocf:heartbeat:MailTo \
>     params email="m...@example.com" \
>     params subject="Pacemaker Change" \
>     op start interval="0" timeout="10" \
>     op stop interval="0" timeout="10" \
>     op monitor interval="10" timeout="10"
> primitive p_vm ocf:heartbeat:VirtualDomain \
>     params config="/vmstore/config/vm.xml" \
>     meta allow-migrate="false" \
>     op start interval="0" timeout="120s" \
>     op stop interval="0" timeout="120s" \
>     op monitor interval="10" timeout="30"
> primitive stonith-node1 stonith:external/tripplitepdu \
>     params pdu_ipaddr="192.168.1.12" pdu_port="1" pdu_username="xxx" pdu_password="xxx" hostname_to_stonith="node1"
> primitive stonith-node2 stonith:external/tripplitepdu \
>     params pdu_ipaddr="192.168.1.12" pdu_port="2" pdu_username="xxx" pdu_password="xxx" hostname_to_stonith="node2"
> group g_daemons p_libvirt-bin
> group g_vm p_fs_vmstore p_vm
> ms ms_drbd_mount2 p_drbd_mount2 \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> ms ms_drbd_mount1 p_drbd_mount1 \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> ms ms_drbd_vmstore p_drbd_vmstore \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> clone cl_daemons g_daemons
> clone cl_ping p_ping \
>     meta interleave="true"
> clone cl_sysadmin_notify p_sysadmin_notify
> location l-st-node1 stonith-node1 -inf: node1
> location l-st-node2 stonith-node2 -inf: node2
> location l_run_on_most_connected p_vm \
>     rule $id="l_run_on_most_connected-rule" p_ping: defined p_ping
> colocation c_drbd_libvirt_vm inf: ms_drbd_vmstore:Master ms_drbd_mount1:Master ms_drbd_mount2:Master g_vm
> order o_drbd-fs-vm inf: ms_drbd_vmstore:promote ms_drbd_mount1:promote ms_drbd_mount2:promote cl_daemons:start g_vm:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>     cluster-infrastructure="Heartbeat" \
>     stonith-enabled="false" \
>     no-quorum-policy="stop" \
>     last-lrm-refresh="1332539900" \
>     cluster-recheck-interval="5m" \
>     crmd-integration-timeout="3m" \
>     shutdown-escalation="5m"
>
> The STONITH plugin is a custom plugin I wrote for the Tripp-Lite PDUMH20ATNET that I'm using as the STONITH device:
> http://www.tripplite.com/shared/product-pages/en/PDUMH20ATNET.pdf
>
> As you can see, I left the DRBD service to be started by the operating system (as an LSB script at boot time); however, Pacemaker controls actually bringing up/taking down the individual DRBD devices. The behavior I observe is as follows: I issue "crm resource migrate p_vm" on node1 and fail over successfully to node2. During this time, node2 fences node1's DRBD devices (using dopd) and marks them as Outdated. Meanwhile node2's DRBD devices are UpToDate. I then shut down both nodes and bring them back up. They reconnect to the cluster (with quorum), node1's DRBD devices are still Outdated as expected, and node2's DRBD devices are still UpToDate, as expected. At this point DRBD starts on both nodes, however node2 will not set DRBD as master:
> Node quorumnode (c4bf25d7-a6b7-4863-984d-aafd937c0da4): OFFLINE (standby)
> Online: [ node2 node1 ]
>
> Master/Slave Set: ms_drbd_vmstore [p_drbd_vmstore]
>     Slaves: [ node1 node2 ]
> Master/Slave Set: ms_drbd_mount1 [p_drbd_mount1]
>     Slaves: [ node1 node2 ]
> Master/Slave Set: ms_drbd_mount2 [p_drbd_mount2]
>     Slaves: [ node1 node2 ]
>
> I am having trouble sorting through the logging information because there is so much of it in /var/log/daemon.log, but I can't find an error message printed about why it will not promote node2. At this point the DRBD devices are as follows:
> node2: cstate = WFConnection, dstate = UpToDate
> node1: cstate = StandAlone, dstate = Outdated
>
> I don't see any reason why node2 can't become DRBD master, or am I missing something? If I do "drbdadm connect all" on node1, then the cstate on both nodes changes to "Connected" and node2 immediately promotes the DRBD resources to master.
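With node1 sitting in StandAlone it will not even try to reconnect on its own, so before digging into Pacemaker it may be worth confirming the DRBD connection and disk states on both nodes. A rough sketch (plain DRBD 8.x commands, nothing cluster-specific):

======================================
# on each node, as root
cat /proc/drbd        # connection/disk/role overview for all resources
drbdadm cstate all    # connection state per resource
drbdadm dstate all    # disk state per resource

# on node1, if it stays StandAlone after the reboot
drbdadm connect all
======================================

If node1 keeps dropping to StandAlone after every reboot, the kernel log on node1 (dmesg) normally records why DRBD gave up on the connection attempt.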
> Any ideas on why I'm observing this incorrect behavior?
>
> Any tips on how I can better filter through the pacemaker/heartbeat logs or how to get additional useful debug information?
>
> Thanks,
>
> Andrew
>
> ------------------------------
> From: "Andreas Kurz" <andr...@hastexo.com>
> To: pacemaker@oss.clusterlabs.org
> Sent: Wednesday, 1 February, 2012 4:19:25 PM
> Subject: Re: [Pacemaker] Nodes will not promote DRBD resources to master on failover
>
> On 01/25/2012 08:58 PM, Andrew Martin wrote:
> > Hello,
> >
> > Recently I finished configuring a two-node cluster with pacemaker 1.1.6 and heartbeat 3.0.5 on nodes running Ubuntu 10.04. This cluster includes the following resources:
> > - primitives for DRBD storage devices
> > - primitives for mounting the filesystem on the DRBD storage
> > - primitives for some mount binds
> > - primitive for starting apache
> > - primitives for starting samba and nfs servers (following the instructions here <http://www.linbit.com/fileadmin/tech-guides/ha-nfs.pdf>)
> > - primitives for exporting nfs shares (ocf:heartbeat:exportfs)
>
> not enough information ... please share at least your complete cluster configuration
>
> Regards,
> Andreas
>
> --
> Need help with Pacemaker?
> http://www.hastexo.com/now
>
> > Perhaps this is best described through the output of crm_mon:
> > Online: [ node1 node2 ]
> >
> > Master/Slave Set: ms_drbd_mount1 [p_drbd_mount1] (unmanaged)
> >     p_drbd_mount1:0 (ocf::linbit:drbd): Started node2 (unmanaged)
> >     p_drbd_mount1:1 (ocf::linbit:drbd): Started node1 (unmanaged) FAILED
> > Master/Slave Set: ms_drbd_mount2 [p_drbd_mount2]
> >     p_drbd_mount2:0 (ocf::linbit:drbd): Master node1 (unmanaged) FAILED
> >     Slaves: [ node2 ]
> > Resource Group: g_core
> >     p_fs_mount1 (ocf::heartbeat:Filesystem): Started node1
> >     p_fs_mount2 (ocf::heartbeat:Filesystem): Started node1
> >     p_ip_nfs (ocf::heartbeat:IPaddr2): Started node1
> > Resource Group: g_apache
> >     p_fs_mountbind1 (ocf::heartbeat:Filesystem): Started node1
> >     p_fs_mountbind2 (ocf::heartbeat:Filesystem): Started node1
> >     p_fs_mountbind3 (ocf::heartbeat:Filesystem): Started node1
> >     p_fs_varwww (ocf::heartbeat:Filesystem): Started node1
> >     p_apache (ocf::heartbeat:apache): Started node1
> > Resource Group: g_fileservers
> >     p_lsb_smb (lsb:smbd): Started node1
> >     p_lsb_nmb (lsb:nmbd): Started node1
> >     p_lsb_nfsserver (lsb:nfs-kernel-server): Started node1
> >     p_exportfs_mount1 (ocf::heartbeat:exportfs): Started node1
> >     p_exportfs_mount2 (ocf::heartbeat:exportfs): Started node1
> >
> > I have read through the Pacemaker Explained <http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained> documentation, however I could not find a way to further debug these problems. First, I put node1 into standby mode to attempt failover to the other node (node2). Node2 appeared to start the transition to master, however it failed to promote the DRBD resources to master (the first step). I have attached a copy of this session in commands.log and additional excerpts from /var/log/syslog during important steps. I have attempted everything I can think of to try and start the DRBD resource (e.g. start/stop/promote/manage/cleanup under crm resource, restarting heartbeat) but cannot bring it out of the slave state. However, if I set it to unmanaged and then run "drbdadm primary all" in the terminal, pacemaker is satisfied and continues starting the rest of the resources.
> > It then failed when attempting to mount the filesystem for mount2, the p_fs_mount2 resource. I attempted to mount the filesystem myself and was successful. I then unmounted it and ran cleanup on p_fs_mount2, and then it mounted. The rest of the resources started as expected until the p_exportfs_mount2 resource, which failed as follows:
> > p_exportfs_mount2 (ocf::heartbeat:exportfs): started node2 (unmanaged) FAILED
> >
> > I ran cleanup on this and it started, however when running this test earlier today no command could successfully start this exportfs resource.
> >
> > How can I configure pacemaker to better resolve these problems and be able to bring the node up successfully on its own? What can I check to determine why these failures are occurring? /var/log/syslog did not seem to contain very much useful information regarding why the failures occurred.
> >
> > Thanks,
> >
> > Andrew

--
this is my life and I live it for as long as God wills
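On the log question: on a Heartbeat stack most of the promotion decisions are logged by pengine and crmd, so filtering /var/log/daemon.log down to those processes removes most of the noise, and the placement/master scores can be read from the live CIB directly. A rough sketch (assuming the stock Pacemaker 1.1 command-line tools; ptest is the older equivalent of crm_simulate):

======================================
# promotion/placement decisions and failures, filtered from the log you already have
grep -E 'pengine|crmd|lrmd' /var/log/daemon.log | grep -Ei 'drbd|promote|error|fail'

# one-shot cluster status including fail counts and inactive resources
crm_mon -1fr

# allocation and master scores computed from the live CIB
crm_simulate -sL | grep -i drbd
======================================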

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org