On 01/08/2013, at 10:24 PM, Xzarth <xza...@gmail.com> wrote:

> Hi,
>
> I updated from pacemaker 1.0.9 to 1.1.7

Distro? Seems strange to be upgrading to a release from 1.5 years ago. We're up to 1.1.10 now.

> After the update, cluster behaves differently than before. I have a
> resource with migration-threshold="1", once that resource fails
> everything used to migrate to another node (what i would expect).
> After the upgrade, once that resource fails, cluster stops any resources
> that depend on that resource and just hangs there. What changed, since i
> haven't touched the config?

Can you attach the result of cibadmin -Ql when the cluster is in this state?

>
>
> Here is the config:
>
> node $id="1bb92e1d" asttest1 \
>     attributes standby="off"
> node $id="5e583c54" asttest2 \
>     attributes standby="off"
> node asttest1
> node asttest2
> primitive asterisk lsb:asterisk-11.0.1 \
>     op start interval="0" timeout="15s" \
>     op stop interval="0" timeout="15s" \
>     op monitor interval="1s" timeout="15s" start-delay="10"
> primitive dahdi lsb:dahdi \
>     op start interval="0" timeout="15s" \
>     op stop interval="0" timeout="15s" \
>     op monitor interval="1s" timeout="15s"
> primitive drbd ocf:linbit:drbd \
>     params drbd_resource="r0" \
>     op monitor interval="29s" role="Master" \
>     op monitor interval="31s" role="Slave"
> primitive fonulator lsb:fonulator \
>     op start interval="0" timeout="20s" \
>     op stop interval="0" timeout="20s" \
>     op monitor interval="1s" timeout="20s" start-delay="30" \
>     meta migration-threshold="1" failure-timeout="60s"
> primitive fs_drbd ocf:heartbeat:Filesystem \
>     params device="/dev/drbd/by-res/r0" directory="/mnt/drbd" fstype="ext3" \
>     op start interval="0" timeout="60s" start-delay="1" \
>     op stop interval="0" timeout="60s" start-delay="1" \
>     op monitor interval="1s" timeout="40s" start-delay="30" \
>     meta is-managed="true" target-role="Started"
> primitive httpd lsb:apache2 \
>     op start interval="0" timeout="20s" \
>     op stop interval="0" timeout="20s" \
>     op monitor interval="1s" timeout="20s" start-delay="10"
> primitive iax2_mon lsb:iax2_mon \
>     op start interval="0" timeout="20s" \
>     op stop interval="0" timeout="20s" \
>     op monitor interval="60s" timeout="20s" start-delay="30" \
>     meta failure-timeout="60s"
> primitive ip_voip_route_default ocf:heartbeat:Route \
>     params destination="default" gateway="10.2.4.1" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_route_test1 ocf:heartbeat:Route \
>     params destination="X.X.X.X/32" gateway="X.X.X.X" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_route_test2 ocf:heartbeat:Route \
>     params destination="X.X.X.X/32" gateway="X.X.X.X.1" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_eth0 ocf:heartbeat:IPaddr2 \
>     params ip="X.X.X.X" cidr_netmask="24" nic="eth0" iflabel="1" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_eth1 ocf:heartbeat:IPaddr2 \
>     params ip="X.X.X.X" cidr_netmask="24" nic="eth0" iflabel="2" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_eth2 ocf:heartbeat:IPaddr2 \
>     params ip="X.X.X.X" cidr_netmask="24" nic="eth0" iflabel="3" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_eth3 ocf:heartbeat:IPaddr2 \
>     params ip="X.X.X.X" cidr_netmask="24" nic="eth0" iflabel="4" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_eth4 ocf:heartbeat:IPaddr2 \
>     params ip="X.X.X.X" cidr_netmask="24" nic="eth0" iflabel="5" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_eth5 ocf:heartbeat:IPaddr2 \
>     params ip="X.X.X.X" cidr_netmask="24" nic="eth0" iflabel="6" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_eth6 ocf:heartbeat:IPaddr2 \
>     params ip="X.X.X.X" cidr_netmask="24" nic="eth0" iflabel="7" \
>     op monitor interval="1s" timeout="20s"
> primitive ip_voip_eth8 ocf:heartbeat:IPaddr2 \
>     params ip="X.X.X.X" cidr_netmask="24" nic="eth8" iflabel="1" \
>     op monitor interval="1s" timeout="20s"
> primitive mysqld lsb:mysql \
>     op monitor interval="1s" timeout="15s" start-delay="10"
> primitive tftp lsb:tftp-srce \
>     op start interval="0" timeout="20s" \
>     op stop interval="0" timeout="20s" \
>     op monitor interval="60s" timeout="10s" start-delay="10"
> group ip_voip_addresses_p ip_voip_eth0 ip_voip_eth8 ip_voip_eth1 ip_voip_eth2 ip_voip_eth3 ip_voip_eth4 ip_voip_eth5 ip_voip_eth6 \
>     meta ordered="false" collocated="true" priority="8"
> group ip_voip_routes ip_voip_route_test1 ip_voip_route_test2 \
>     meta ordered="false" collocated="true" priority="9"
> group voip mysqld dahdi fonulator asterisk iax2_mon httpd tftp \
>     meta ordered="true" collocated="true" priority="10"
> ms ms_drbd drbd \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
> clone cl_route ip_voip_route_default \
>     meta target-role="Started"
> colocation fs_colocation inf: fs_drbd ms_drbd:Master
> colocation ip_colocation inf: ip_voip_addresses_p fs_drbd
> colocation ip_route_colocation inf: ip_voip_routes ip_voip_addresses_p
> colocation voip_colocation inf: voip ip_voip_addresses_p
> order fs_order inf: ms_drbd:promote fs_drbd:start
> order ip_order inf: fs_drbd:start ip_voip_addresses_p:start
> order ip_route_order inf: ip_voip_addresses_p:start ip_voip_routes:start
> order voip_order inf: ip_voip_routes:start voip:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>     cluster-infrastructure="openais" \
>     stonith-enabled="false" \
>     expected-quorum-votes="2" \
>     last-lrm-refresh="1375355273" \
>     no-quorum-policy="ignore" \
>     symmetric-cluster="true"
>
>
> And here is the state of the cluster after node fails:
>
> ============
> Last updated: Thu Aug 1 13:26:41 2013
> Last change: Thu Aug 1 13:07:53 2013
> Stack: openais
> Current DC: asttest1 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 4 Nodes configured, 2 expected votes
> 24 Resources configured.
> ============
>
> Online: [ asttest1 asttest2 ]
> OFFLINE: [ asttest1 asttest2 ]
>
> Resource Group: voip
>     mysqld (lsb:mysql): Started asttest1
>     dahdi (lsb:dahdi): Started asttest1
>     fonulator (lsb:fonulator): Stopped
>     asterisk (lsb:asterisk-11.0.1): Stopped
>     iax2_mon (lsb:iax2_mon): Stopped
>     httpd (lsb:apache2): Stopped
>     tftp (lsb:tftp-srce): Stopped
> Resource Group: ip_voip_routes
>     ip_voip_route_test1 (ocf::heartbeat:Route): Started asttest1
>     ip_voip_route_test2 (ocf::heartbeat:Route): Started asttest1
> Resource Group: ip_voip_addresses_p
>     ip_voip_eth0 (ocf::heartbeat:IPaddr2): Started asttest1
>     ip_voip_eth8 (ocf::heartbeat:IPaddr2): Started asttest1
>     ip_voip_eth1 (ocf::heartbeat:IPaddr2): Started asttest1
>     ip_voip_eth2 (ocf::heartbeat:IPaddr2): Started asttest1
>     ip_voip_eth3 (ocf::heartbeat:IPaddr2): Started asttest1
>     ip_voip_eth4 (ocf::heartbeat:IPaddr2): Started asttest1
>     ip_voip_eth5 (ocf::heartbeat:IPaddr2): Started asttest1
>     ip_voip_eth6 (ocf::heartbeat:IPaddr2): Started asttest1
> Clone Set: cl_route [ip_voip_route_default]
>     Started: [ asttest2 asttest1 ]
>     Stopped: [ ip_voip_route_default:2 ip_voip_route_default:3 ]
> fs_drbd (ocf::heartbeat:Filesystem): Started asttest1
> Master/Slave Set: ms_drbd [drbd]
>     Masters: [ asttest1 ]
>     Slaves: [ asttest2 ]
>
> Failed actions:
>     fonulator_monitor_1000 (node=asttest1, call=85, rc=7, status=complete): not running
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
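
Regarding the cibadmin -Ql request above: one thing worth checking in that dump is whether a fail count was actually recorded for fonulator, since migration-threshold only acts on recorded failures. Below is a rough Python sketch for pulling fail counts out of a saved dump. It assumes fail counts are stored as nvpair elements named fail-count-<resource> under each node_state in the status section, which is my understanding of 1.1.x but may differ in other releases. The embedded XML is a hypothetical, heavily trimmed stand-in and not output from this cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for `cibadmin -Ql` output.
# A real dump is far larger; the values here are made up for illustration.
SAMPLE_CIB = """
<cib>
  <status>
    <node_state uname="asttest1">
      <transient_attributes>
        <instance_attributes>
          <nvpair name="fail-count-fonulator" value="1"/>
          <nvpair name="last-failure-fonulator" value="1375355201"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>
"""

PREFIX = "fail-count-"  # assumed attribute naming, see the note above


def fail_counts(cib_xml):
    """Return {(node, resource): value} for every fail-count nvpair found."""
    root = ET.fromstring(cib_xml.strip())
    counts = {}
    for node_state in root.iter("node_state"):
        node = node_state.get("uname", "?")
        for nvpair in node_state.iter("nvpair"):
            name = nvpair.get("name", "")
            if name.startswith(PREFIX):
                # Keep the value as a string: it can also be "INFINITY".
                counts[(node, name[len(PREFIX):])] = nvpair.get("value")
    return counts


if __name__ == "__main__":
    for (node, resource), value in fail_counts(SAMPLE_CIB).items():
        print(f"{resource} on {node}: fail-count={value}")
```

To use it on a real dump, save the output of cibadmin -Ql to a file and pass the file contents to fail_counts() in place of SAMPLE_CIB.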