Hi all,

I'm currently working on a concept similar to Miki's. The only difference is that my cluster would NOT be active/active.
Here is a brief description of my scenario:

1. Three geographically distinct locations: A, B and X. There are no WAN connections. We have direct multiplexed fibre connections (same subnet) between those sites, so there are no issues with WAN timeouts and such, but split brain is an issue too!
2. Two cluster members: server1 @ A and server2 @ B.
3. No shared storage like a SAN, but replicated data like DRBD, MySQL etc.
4. The cluster works in active/passive mode. No master/master, as this is too risky and has too many bottlenecks in case of disaster recovery!
5. Location X would have one server not hosting any resources at any time, so it can only be some kind of quorum server.

Miki, have you managed to get a working setup for your scenario? How did you finally do it?

Here are my thoughts:

* iSCSI reservation on a server @ location X could be an option, but I wonder how well this works, whether anyone out there is running it, and whether there are caveats. Is it similar to shared storage, only over IP? Are there any "out of the box" solutions for iSCSI reservation: Pacemaker RA, etc.?
* How would a server @ location X behave as a third cluster member? This server would have location constraints prohibiting resources from running on it. Its sole purpose would be to provide a vote, so the cluster would have quorum in case of split brain between locations A and B. As I understood from one of Andrew's replies, the quorum mechanism relies purely on the number of nodes that can join the cluster, right!?
* Considering the broken triangle scenario already mentioned by Miki, and a third "quorum" cluster member @ location X: would all cluster members know about each other through some kind of relay of the multicast messages, i.e. server @ A multicasts, server @ X receives the messages and relays them to server @ B, correct!? If this is the case and my understanding is correct, the cluster would continue working as if "nothing happened"?
* In case of complete site isolation (1-1-1 situation) the cluster would stop resources, as I would set no-quorum-policy to stop. Would the cluster restart resources once it regains quorum? Using DRBD, would that work correctly if the slave got quorum first and started working again? Once the old master finally comes back, would the cluster return to a consistent state, especially DRBD?

Best regards,
Vincent
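
P.S. The vote counting behind the quorum questions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain majority quorum (a partition is quorate when it holds strictly more than half of all votes), one vote per node, and clean partitions only. The node names and both helper functions are hypothetical and not part of Pacemaker. The broken triangle case is deliberately left out, since its outcome depends on the answer to the relay question.

```python
# Minimal sketch of majority-quorum vote counting (assumption: one vote per node).
# Node names and helpers are hypothetical, for illustration only.

NODES = {"server1@A", "server2@B", "quorum@X"}


def has_quorum(partition, expected_votes):
    """A partition is quorate if it holds strictly more than half of all votes."""
    return len(partition) > expected_votes / 2


def quorate_partitions(partitions, expected_votes=len(NODES)):
    """Return the (sorted) partitions that would keep or gain quorum."""
    return [sorted(p) for p in partitions if has_quorum(p, expected_votes)]


# Each scenario lists the groups of nodes that can still see each other.
SCENARIOS = {
    "all sites connected": [{"server1@A", "server2@B", "quorum@X"}],
    "A isolated": [{"server1@A"}, {"server2@B", "quorum@X"}],
    "B isolated": [{"server1@A", "quorum@X"}, {"server2@B"}],
    "X isolated": [{"server1@A", "server2@B"}, {"quorum@X"}],
    "1-1-1 (every site isolated)": [{"server1@A"}, {"server2@B"}, {"quorum@X"}],
}

if __name__ == "__main__":
    for name, partitions in SCENARIOS.items():
        winners = quorate_partitions(partitions)
        print(f"{name}: {winners if winners else 'no partition has quorum'}")
```

Under these assumptions, any single-site failure leaves exactly one quorate partition, while the 1-1-1 case leaves none, which is the situation where no-quorum-policy decides what happens to the resources.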