Thanks for the quick answer. I'll have a look at that. Is there a way to manually force a failover when I can be sure the other machine is down?
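[Editor's note: the manual-failover question above maps onto Pacemaker's manual fence confirmation. The commands below are a sketch of that mechanism using the node name from this thread; they only make sense against a live cluster, and telling the cluster a fence succeeded when the node is in fact still running risks exactly the split-brain Digimer warns about below:]

```shell
# CAUTION: only do this when you are CERTAIN the peer is really powered off.
# Confirming a fence that did not actually happen invites a split-brain.

# With pcs, manually confirm that the node has been fenced:
pcs stonith confirm ftp-test01

# Roughly equivalent low-level call via Pacemaker's stonith_admin:
stonith_admin --confirm=ftp-test01
```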
Kind regards,
Felix

-----Original Message-----
From: Digimer [mailto:li...@alteeve.ca]
Sent: Monday, 18 August 2014 19:57
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] no failover if fencing device is unreachable (i.e. power loss)

On 18/08/14 01:50 PM, Felix Schrage wrote:
> Hi,
>
> I'm building a two-node cluster running XenServer, Pacemaker and DRBD.
> There's a problem when testing failover by powering off the currently
> active node.
> When using the fence_xenapi agent, the ClusterIP resource will not be
> moved to the second node until the first node has been successfully
> shut down. However, because the XenAPI is unreachable when the machine
> is powered off, the second node keeps trying to shut it down and the
> resource is never moved.
>
> To check whether it was an error in the fence_xenapi agent, I tried
> fence_ipmilan, which works fine as long as the IPMI interface is
> reachable. When pulling the power cords from the machine, however, the
> behaviour is the same as with the fence_xenapi agent.
> Am I missing an option which should be set? A timeout or a retry counter?

This is the expected behaviour. Being unable to connect to the fence device (or to fail to confirm the "off" action) can not be treated as a successful fence. Without a successful fence, it can not be assumed that the peer is gone. To do so would be to risk a split-brain, so the cluster's only sane and safe option is to block. This is why we always use switched PDUs as a backup fence method.
You can see how to configure this with STONITH levels:
http://clusterlabs.org/wiki/STONITH_Levels

> Here's how I set up the cluster (fence_xenapi) using pcs:
>
> pcs cluster cib ftp_ha_cluster
> pcs -f ftp_ha_cluster resource create ClusterIP IPaddr2 ip=172.20.150.150 cidr_netmask=32 op monitor interval=20s
> pcs -f ftp_ha_cluster constraint location ClusterIP prefers ftp-test01=50
> pcs -f ftp_ha_cluster stonith create xenvm-fence-ftp1 fence_xenapi pcmk_host_list="ftp-test01" action="off" session_url="https://test-xen-01" port="ftp-test01" login="root" passwd="****" delay=15 op monitor interval=40s
> pcs -f ftp_ha_cluster stonith create xenvm-fence-ftp2 fence_xenapi pcmk_host_list="ftp-test02" action="off" session_url="https://test-xen-02" port="ftp-test02" login="root" passwd="****" delay=15 op monitor interval=40s
> pcs -f ftp_ha_cluster constraint location xenvm-fence-ftp1 prefers ftp-test01=-INFINITY
> pcs -f ftp_ha_cluster constraint location xenvm-fence-ftp2 prefers ftp-test02=-INFINITY
> pcs -f ftp_ha_cluster property set stonith-enabled=true
> pcs -f ftp_ha_cluster property set stonith-action=off
> pcs -f ftp_ha_cluster property set stonith-timeout=40s
> pcs -f ftp_ha_cluster property set no-quorum-policy=ignore
> pcs -f ftp_ha_cluster resource create Ping ocf:pacemaker:ping dampen="5s" multiplier="100" host_list="172.20.150.1 172.20.150.151 172.20.150.152" attempts="3" op monitor interval=20s
> pcs -f ftp_ha_cluster resource clone Ping
> pcs -f ftp_ha_cluster constraint location ClusterIP rule score=-INF not_defined pingd or pingd lte 0
> pcs -f ftp_ha_cluster constraint location ClusterIP rule score=pingd defined pingd
> pcs cluster cib-push ftp_ha_cluster
>
> For testing with fence_ipmilan I replaced the appropriate lines with the following:
>
> pcs -f ftp_ha_cluster stonith create ipmi-fence-test-xen-01 fence_ipmilan pcmk_host_list="ftp-test01" action="off" ipaddr="test-xen-01-bmc.mercateo.lan" auth="password" login="admin" passwd="****" delay=15 op monitor interval=40s
> pcs -f ftp_ha_cluster stonith create ipmi-fence-test-xen-02 fence_ipmilan pcmk_host_list="ftp-test02" action="off" ipaddr="test-xen-02-bmc.mercateo.lan" auth="password" login="admin" passwd="****" delay=15 op monitor interval=40s
> pcs -f ftp_ha_cluster constraint location ipmi-fence-test-xen-01 prefers ftp-test01=-INFINITY
> pcs -f ftp_ha_cluster constraint location ipmi-fence-test-xen-02 prefers ftp-test02=-INFINITY
>
> The content of /etc/corosync/corosync.conf:
>
> compatibility: whitetank
>
> totem {
>     version: 2
>     secauth: off
>     threads: 0
>     interface {
>         ringnumber: 0
>         bindnetaddr: 192.168.199.0
>         mcastaddr: 226.94.1.1
>         mcastport: 5405
>         ttl: 1
>     }
> }
>
> logging {
>     fileline: off
>     to_stderr: no
>     to_logfile: yes
>     to_syslog: no
>     logfile: /var/log/cluster/corosync.log
>     debug: off
>     timestamp: on
>     logger_subsys {
>         subsys: AMF
>         debug: off
>     }
> }
>
> amf {
>     mode: disabled
> }
>
> service {
>     ver: 1
>     name: pacemaker
> }
>
> Any idea what could be missing/wrong?
>
> Kind regards,
>
> Felix
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?
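[Editor's note: the switched-PDU backup fence Digimer describes can be sketched with pcs fence levels along the following lines. The fence_apc device, its name, address, credentials, and outlet number are hypothetical placeholders, not part of this thread's actual setup:]

```shell
# Hypothetical backup fence device on a switched PDU
# (agent, host name, and credentials are illustrative placeholders)
pcs -f ftp_ha_cluster stonith create pdu-fence-ftp1 fence_apc \
    pcmk_host_list="ftp-test01" ipaddr="pdu1.example.lan" \
    login="apc" passwd="****" port="1" op monitor interval=60s

# Level 1: try the XenAPI fence agent first.
# Level 2: if that fails (e.g. the hypervisor lost power), fall back to the PDU.
pcs -f ftp_ha_cluster stonith level add 1 ftp-test01 xenvm-fence-ftp1
pcs -f ftp_ha_cluster stonith level add 2 ftp-test01 pdu-fence-ftp1
```

With levels in place, an unreachable XenAPI no longer blocks recovery: the cluster moves on to the PDU, and a confirmed "off" there counts as a successful fence.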
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org