Have you seen: http://www.clusterlabs.org/doc/crm_fencing.html
It should answer most of your questions.

On Thu, Feb 4, 2010 at 11:43 AM, Tom Pride <tom.pr...@gmail.com> wrote:
> Hi there,
>
> I have successfully configured a 2 node DRBD pacemaker cluster using the
> instructions provided by LINBIT here:
> http://www.drbd.org/users-guide-emb/ch-pacemaker.html. The cluster works
> perfectly and I can migrate the resources back and forth between the two
> nodes without a problem. However, when simulating certain cluster
> communication failures, I am having problems preventing the DRBD cluster
> from entering a split brain state. I have been led to believe that STONITH
> will help prevent split brain situations, but the LINBIT instructions do not
> provide any guidance on how to configure STONITH in the pacemaker cluster.
> The only thing I can find in LINBIT's documentation is where it talks about
> the resource fencing options in /etc/drbd.conf, which I have configured as
> follows:
>
>
> resource r0 {
>   disk {
>     fencing resource-only;
>   }
>   handlers {
>     fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>   }
> }
>
> I'm still at a loss to understand what actually triggers DRBD to run the
> above fencing scripts or how to tell if it has run them.
>
> I've searched the internet high and low for example pacemaker configs that
> show you how to configure STONITH resources for DRBD, but I can't find
> anything useful.
>
> Whilst hunting the Internet I did find this howto
> (http://www.howtoforge.com/installation-and-setup-guide-for-drbd-openais-pacemaker-xen-on-opensuse-11.1)
> that spells out how to configure a DRBD pacemaker cluster and even states
> the following: "STONITH is disabled in this [example] configuration though
> it is highly-recommended in any production environment to eliminate the risk
> of divergent data." Infuriatingly it doesn't tell you how to configure
> STONITH!
>
> Could someone please, please, please give me some pointers or some
> helpful examples on how I go about configuring STONITH and/or modifying my
> pacemaker configuration in any other ways to get it into a production ready
> state? My current configuration is listed below:
>
> The cluster is built on 2 Red Hat EL 5.3 servers running the following
> software versions:
> drbd-8.3.6-1
> pacemaker-1.0.5-4.1
> openais-0.80.5-15.1
>
>
> r...@mq001:~# crm configure show
> node mq001.back.live.cwwtf.local
> node mq002.back.live.cwwtf.local
> primitive activemq-emp lsb:bbc-activemq-emp
> primitive activemq-forge-services lsb:bbc-activemq-forge-services
> primitive activemq-social lsb:activemq-social
> primitive drbd_activemq ocf:linbit:drbd \
>     params drbd_resource="r0" \
>     op monitor interval="15s"
> primitive fs_activemq ocf:heartbeat:Filesystem \
>     params device="/dev/drbd1" directory="/drbd" fstype="ext3"
> primitive ip_activemq ocf:heartbeat:IPaddr2 \
>     params ip="172.23.8.71" nic="eth0"
> group activemq fs_activemq ip_activemq activemq-forge-services activemq-emp activemq-social
> ms ms_drbd_activemq drbd_activemq \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> colocation activemq_on_drbd inf: activemq ms_drbd_activemq:Master
> order activemq_after_drbd inf: ms_drbd_activemq:promote activemq:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     no-quorum-policy="ignore" \
>     last-lrm-refresh="1260809203"
>
> /etc/drbd.conf
>
> global {
>   usage-count no;
> }
> common {
>   protocol C;
> }
> resource r0 {
>   disk {
>     fencing resource-only;
>   }
>   handlers {
>     fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>   }
>   syncer {
>     rate 40M;
>   }
>   on mq001.back.live.cwwtf.local {
>     device /dev/drbd1;
>     disk /dev/cciss/c0d0p1;
>     address 172.23.8.69:7789;
>     meta-disk internal;
>   }
>   on mq002.back.live.cwwtf.local {
>     device /dev/drbd1;
>     disk /dev/cciss/c0d0p1;
>     address 172.23.8.70:7789;
>     meta-disk internal;
>   }
> }
>
>
> r...@mq001:~# cat /etc/ais/openais.conf
> totem {
>   version: 2
>   token: 3000
>   token_retransmits_before_loss_const: 10
>   join: 60
>   consensus: 1500
>   vsftype: none
>   max_messages: 20
>   clear_node_high_bit: yes
>   secauth: on
>   threads: 0
>   rrp_mode: passive
>   interface {
>     ringnumber: 0
>     bindnetaddr: 172.59.60.0
>     mcastaddr: 239.94.1.1
>     mcastport: 5405
>   }
>   interface {
>     ringnumber: 1
>     bindnetaddr: 172.23.8.0
>     mcastaddr: 239.94.2.1
>     mcastport: 5405
>   }
> }
> logging {
>   to_stderr: yes
>   debug: on
>   timestamp: on
>   to_file: no
>   to_syslog: yes
>   syslog_facility: daemon
> }
> amf {
>   mode: disabled
> }
> service {
>   ver: 0
>   name: pacemaker
>   use_mgmtd: yes
> }
> aisexec {
>   user: root
>   group: root
> }
>
> Many Thanks,
> Tom
>
>
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
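
For the STONITH part of the question, a minimal crm configure sketch follows. It assumes each node has an IPMI-capable management board reachable over the network and uses the external/ipmi plugin; the resource names, IP addresses, and credentials are placeholders, not values from this cluster. Running "stonith -L" lists the plugins actually installed and "stonith -t external/ipmi -n" lists the parameters a plugin expects, so check those before copying anything.

```
# Hypothetical fencing device for mq001 (ipaddr/userid/passwd are placeholders)
primitive st-mq001 stonith:external/ipmi \
    params hostname="mq001.back.live.cwwtf.local" ipaddr="192.0.2.11" userid="ADMIN" passwd="CHANGEME" interface="lan" \
    op monitor interval="60s"
# Hypothetical fencing device for mq002 (ipaddr/userid/passwd are placeholders)
primitive st-mq002 stonith:external/ipmi \
    params hostname="mq002.back.live.cwwtf.local" ipaddr="192.0.2.12" userid="ADMIN" passwd="CHANGEME" interface="lan" \
    op monitor interval="60s"
# Never run a node's fencing device on the node it is supposed to fence
location l-st-mq001 st-mq001 -inf: mq001.back.live.cwwtf.local
location l-st-mq002 st-mq002 -inf: mq002.back.live.cwwtf.local
# Tell the cluster it may rely on fencing
property stonith-enabled="true"
```

Each fencing resource is kept off the node it targets, so a failed node is always shot by its peer. Once fencing has been tested, DRBD's "fencing resource-and-stonith;" policy (in place of "resource-only") is the setting intended for clusters that have node-level fencing. As for telling whether the fence-peer handler has run: it should leave a location constraint in the CIB whose id starts with drbd-fence-by-handler, visible in "crm configure show" after a failure test. Treat that id prefix as an assumption to verify against the installed crm-fence-peer.sh.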