On 28/10/14 05:59 AM, philipp.achmuel...@arz.at wrote:
hi,

any recommendation/documentation for a reliable fencing implementation
on a multi-node cluster (4 or 6 nodes on 2 sites)?
i am thinking of implementing multiple node-fencing devices for each host to
stonith the remaining nodes on the other site?

thank you!
Philipp

Multi-site clustering is very hard to do well because of fencing issues. How do you distinguish a site failure from severed links? Given that a failed fence action cannot be assumed to have succeeded, the only safe option is to block until a human intervenes. This makes your cluster only as reliable as the WAN between the sites, which is to say, not very reliable. In any case, the destruction of a site will require manual failover, which can be complicated if too few nodes remain to form quorum.
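
To make the quorum point concrete, here is a minimal sketch in Python. It assumes one vote per node and a plain strict-majority rule with no tie-breaker or quorum device; the function names are made up for illustration and are not part of any cluster tool.

```python
def quorum_threshold(total_votes: int) -> int:
    """Votes needed for a strict majority (assumes one vote per node)."""
    return total_votes // 2 + 1


def has_quorum(surviving_votes: int, total_votes: int) -> bool:
    """True if the surviving nodes alone can still form quorum."""
    return surviving_votes >= quorum_threshold(total_votes)


# Nodes split evenly across two sites; one whole site is lost.
for total in (4, 6):
    survivors = total // 2
    print(
        f"{total} nodes, {survivors} survive, "
        f"need {quorum_threshold(total)}: quorate={has_quorum(survivors, total)}"
    )
```

With an even split, the surviving site holds exactly half the votes, which is one short of a majority in both the 4-node and 6-node case. That is why losing a site leaves the survivors inquorate until someone steps in.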

Generally, I'd recommend two different clusters, one per site, with manual/service-level failover in the case of a disaster.

In any case, a good fencing setup should have two fence methods. Personally, I always use IPMI as the primary fence method (routed through one switch) and a pair of switched PDUs as the backup (via a backup switch). This way, when IPMI is available, a confirmed fence is 100% certain to be good. However, if the node is totally disabled/destroyed, IPMI will be lost and the cluster will fall back to the switched PDUs, cutting the power outlets feeding the node.
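
The ordering logic can be sketched roughly like this in Python. This is only an illustration of the idea, not how the cluster stack implements it: the device functions are hypothetical stand-ins that return True only on a confirmed fence, and the node name is invented.

```python
from typing import Callable, Sequence

# A fence device takes a node name and returns True only when the
# fence action was positively confirmed (hypothetical interface).
FenceDevice = Callable[[str], bool]


def fence_node(node: str, methods: Sequence[Sequence[FenceDevice]]) -> bool:
    """Try each fence method in order; a method succeeds only if every
    device in it confirms. Returns False if all methods fail, in which
    case the caller must block and wait for a human."""
    for devices in methods:
        if all(device(node) for device in devices):
            return True
    return False


# Hypothetical stand-ins: the node is destroyed, so its IPMI is gone,
# but both switched PDUs can still cut its outlets.
def ipmi_unreachable(node: str) -> bool:
    return False


def pdu_a_off(node: str) -> bool:
    return True


def pdu_b_off(node: str) -> bool:
    return True


methods = [[ipmi_unreachable], [pdu_a_off, pdu_b_off]]
fenced = fence_node("node1", methods)
```

Note that the PDU method only counts as a success when both PDUs confirm, since a node with redundant power supplies stays up if just one feed is cut. If every method fails, nothing is assumed and recovery stays blocked.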

I've got a block diagram of how I do this:

https://alteeve.ca/w/AN!Cluster_Tutorial_2#A_Map.21

It's trivial to scale the idea up to clusters with more nodes.

Cheers

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
