Hi,

I'm building a two-node cluster based on XenServer, Pacemaker and DRBD. All required resources are configured and handled correctly by Pacemaker, but I'm currently struggling with STONITH / fencing.
Both physical servers run XenServer and a couple of virtual machines that are mirrored between them. For example, each server hosts an Apache VM, and the two VMs share a data partition over DRBD. I configured fencing through Xen, which reliably restarts any faulty VM as long as both physical servers are working correctly.

Unfortunately, fencing doesn't work when the server that hosts a faulty virtual machine is powered off or unreachable over the network. In this case Pacemaker does not promote the DRBD partition on the second (passive) virtual machine to primary, and other resources, such as the Apache server, are not started.

I know that this is expected behaviour for Pacemaker and DRBD, but I'm not sure what is needed to make failover reliable even when a physical server is completely broken. Fencing by rebooting the broken server is obviously not an option, since the server wouldn't come back up due to a hardware defect.

I appreciate any help on this.

Thanks,
Tim