Dear all,
I am a new member of this mailing list. Please let me know if anything in my
explanation is unclear.
I set up a CentOS 5.4 cluster environment (2 nodes, alpha1 and alpha2) with the
following software:
Corosync 1.3.0
Pacemaker 1.0.10
DRBD 8.3.9
The environment is built as an Active/Passive cluster, following
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf.
I set up four resources (IP, DRBD, Filesystem, Apache) and want to test
different failover situations.
When I kill the corosync process on the Active host, Pacemaker seems to fail to
promote DRBD to Master on the original Passive host, alpha2.
The Corosync and DRBD configuration files are attached to this mail, and the crm
configuration is listed below:
=====================================================================================
node alpha1
node alpha2
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.75.10" cidr_netmask="32" \
op monitor interval="10s"
primitive Disk ocf:linbit:drbd \
params drbd_resource="ccmadata" \
op monitor interval="60s"
primitive FS ocf:heartbeat:Filesystem \
params device="/dev/drbd0" directory="/var/www/html" fstype="ext3"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
ms DiskClone Disk \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation drbd-with-ip inf: ClusterIP DiskClone:Master
colocation fs-on-drbd inf: FS DiskClone:Master
colocation website-with-fs inf: WebSite FS
order DiskClone-after-IP inf: DiskClone:promote ClusterIP:start
order FS-after-DiskClone inf: DiskClone:promote FS:start
order WebSite-after-FS inf: FS:start WebSite:start
property $id="cib-bootstrap-options" \
dc-version="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
=====================================================================================
The first abnormal status reported by the crm_mon command is:
=====================================================================================
Last updated: Thu Mar 17 18:19:04 2011
Stack: openais
Current DC: alpha2 - partition WITHOUT quorum
Version: 1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ alpha2 ]
OFFLINE: [ alpha1 ]
Master/Slave Set: DiskClone
Slaves: [ alpha2 ]
Stopped: [ Disk:0 ]
=====================================================================================
The last abnormal status reported is:
=====================================================================================
============
Last updated: Thu Mar 17 18:20:01 2011
Stack: openais
Current DC: alpha2 - partition WITHOUT quorum
Version: 1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ alpha2 ]
OFFLINE: [ alpha1 ]
Master/Slave Set: DiskClone
Slaves: [ alpha2 ]
Stopped: [ Disk:1 ]
Failed actions:
Disk:1_promote_0 (node=alpha2, call=12, rc=-2, status=Timed Out): unknown exec error
Disk:0_promote_0 (node=alpha2, call=22, rc=-2, status=Timed Out): unknown exec error
=====================================================================================
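In case it helps anyone compare such output across test runs, here is a minimal Python sketch that extracts the entries of the "Failed actions" section shown above. It is not part of my cluster setup. The function name parse_failed_actions and the regular expression are only illustrative, and the pattern assumes each entry sits on a single line in the layout printed by this Pacemaker version; other versions may format these lines differently.

```python
import re

# Matches one entry from the "Failed actions:" section of crm_mon output, e.g.
#   Disk:1_promote_0 (node=alpha2, call=12, rc=-2, status=Timed Out): unknown exec error
# The layout is assumed from the output above, not taken from any specification.
FAILED_ACTION = re.compile(
    r"^\s*(?P<resource>\S+?)_(?P<operation>[a-z]+)_(?P<interval>\d+)\s+"
    r"\(node=(?P<node>[^,]+),\s*call=(?P<call>-?\d+),\s*"
    r"rc=(?P<rc>-?\d+),\s*status=(?P<status>[^)]+)\):\s*(?P<reason>.*)$"
)


def parse_failed_actions(text):
    """Return one dict per failed-action line found in crm_mon-style text."""
    actions = []
    for line in text.splitlines():
        match = FAILED_ACTION.match(line)
        if match:
            entry = match.groupdict()
            # Convert the numeric fields so they can be compared or sorted.
            for key in ("interval", "call", "rc"):
                entry[key] = int(entry[key])
            actions.append(entry)
    return actions


sample = """\
Failed actions:
    Disk:1_promote_0 (node=alpha2, call=12, rc=-2, status=Timed Out): unknown exec error
    Disk:0_promote_0 (node=alpha2, call=22, rc=-2, status=Timed Out): unknown exec error
"""

for action in parse_failed_actions(sample):
    print(action["resource"], action["operation"], action["node"], action["status"])
```

Applied to the output above, this lists both failed promote operations on alpha2, each with status "Timed Out".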
The Corosync log on host alpha1 is drbd_test_alpha1.log, and that on host alpha2 is
drbd_test_alpha2.log.
My questions are:
1) How can I solve this issue? Am I missing some crm configuration for this
situation?
2) According to the corosync log on host alpha2, Pacemaker wants to promote 2
DRBD masters (please correct me if I am wrong). The action fails because
the cluster runs in Active/Passive mode and only 1 DRBD master is
allowed to exist. Should I add additional crm or drbd.conf configuration?
3) I am still studying STONITH. Is my problem a split-brain issue?
Thanks for your help.
BR,
Chia-Feng Kang