Hi guys, I was hoping someone might be able to do a quick review of my cluster config, shown below. When I shut down corosync on the master, all the resources fail over to the slave without a problem; but when I shut down corosync on the slave, all of the resources on the master stop as well, leaving me with both nodes broken. Obviously what I want is to be able to shut down corosync on the slave and have all of the resources running on the master remain untouched. I must have something not quite right in the logic of my cluster config.
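For reference, this is roughly the test I'm running (a sketch, not a transcript — the grep pattern is just an example):

```shell
# On the slave (mq102): stop the cluster stack
service corosync stop

# On the master (mq101): check resource state. I expect everything
# to stay Started here, but instead the resources stop as well.
crm_mon -1

# Inspecting the allocation scores on the live CIB (pacemaker 1.0):
ptest -sL | grep -i activemq
```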
It is a two-server cluster running DRBD in active/passive mode. The servers are running Red Hat 5.7 with corosync-1.2.7-1.1.el5 and pacemaker-1.0.11-1.2.el5:

root@mq102:~# crm configure show
node mq101.back.live.telhc.local
node mq102.back.live.telhc.local
primitive activemq_drbd ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="15s" timeout="20s" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="100"
primitive activemq-emp lsb:activemq-emp \
        op monitor interval="30s" timeout="30s" \
        op stop interval="0" timeout="60s" \
        op start interval="0" timeout="60s" \
        meta target-role="Started"
primitive cluster_IP ocf:heartbeat:IPaddr2 \
        params ip="172.23.68.61" nic="eth0" \
        op monitor interval="30s" timeout="90" \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100"
primitive drbd_fs ocf:heartbeat:Filesystem \
        params device="/dev/drbd1" directory="/drbd" fstype="ext3" \
        op monitor interval="15s" timeout="40s" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60"
primitive ping_gateway ocf:pacemaker:ping \
        params name="ping_gateway" host_list="172.23.68.1" multiplier="100" \
        op monitor interval="15s" timeout="20s" \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100"
ms ActiveMQ_Data activemq_drbd \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
clone ping_gateway_clone ping_gateway
location ActiveMQ_Data_on_connected_node_only ActiveMQ_Data \
        rule $id="ActiveMQ_Data_on_connected_node_only-rule" -inf: not_defined ping_gateway or ping_gateway lte 0
location ActiveMQ_Data_prefer_mq101 ActiveMQ_Data \
        rule $id="ActiveMQ_Data_prefer_mq101-rule" $role="Master" 500: #uname eq mq101.back.live.telhc.local
colocation activemq-emp_with_ActiveMQ_Data inf: activemq-emp ActiveMQ_Data:Master
colocation cluster_IP_with_ActiveMQ_Data inf: cluster_IP ActiveMQ_Data:Master
colocation drbd_fs_with_ActiveMQ_Data inf: drbd_fs ActiveMQ_Data:Master
order ActiveMQ_Data_after_ping_gateway_clone inf: ping_gateway_clone:start ActiveMQ_Data:promote
order activemq-emp_after_drbd_fs inf: drbd_fs:start activemq-emp:start
order cluster_IP_after_drbd_fs inf: drbd_fs:start cluster_IP:start
order drbd_fs_after_ActiveMQ_Data inf: ActiveMQ_Data:promote drbd_fs:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        last-lrm-refresh="1317808706"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Cheers,
Tom
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker