On 17/07/14 02:39 PM, Alex Samad - Yieldbroker wrote:
-----Original Message-----
From: Digimer [mailto:li...@alteeve.ca]
Sent: Thursday, 17 July 2014 3:00 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] clusters on virtualised platforms
On 17/07/14 01:41 PM, Alex Samad - Yieldbroker wrote:
-----Original Message-----
From: Digimer [mailto:li...@alteeve.ca]
Sent: Thursday, 17 July 2014 2:02 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] clusters on virtualised platforms
Don't confuse quorum and fencing (stonith); they serve different
purposes.
Basically, quorum is useful when things are working; fencing is
required when things go wrong. So regardless of a quorum disk, you
still need to be able to fence. This requires that each VM be able to
call the hypervisor and force a power off.
Ta, yep, got that; I was thinking while writing.
Generally speaking, VM-based cluster nodes are good for learning, but
not production. It adds a layer that isn't needed and in HA, simple
should trump all else.
Yeah, well, it's not really going to change for us. We are virtualised and I
can't really see that changing. In fact, I would presume you will see more of it.
Thanks
Then ensure that each VM is on a different host, otherwise the host itself
becomes a single point of failure. Further, you will want to add a backup
fence method that can take out the host if it stops responding.
Otherwise, a failure in the host would leave the target node's fence method
(the hypervisor) inaccessible, and a failed fence can only be handled
safely by effectively hanging the cluster. A lack of response must never be
treated as confirmation of node death, or you will inevitably end up with
split-brains.
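The rule above (try the primary fence method, fall back to a backup, and block if nothing confirms) can be sketched in a few lines of Python. This is only an illustration of the logic, not how pacemaker implements it; the function names and the two stand-in fence methods are hypothetical.

```python
from typing import Callable, Sequence

# A fence method returns True only on positive confirmation that the
# target is powered off. Errors and timeouts are never confirmation.
FenceMethod = Callable[[str], bool]


def fence_node(target: str, methods: Sequence[FenceMethod]) -> bool:
    """Try each fence method in order; stop at the first confirmed power-off."""
    for method in methods:
        try:
            if method(target):
                return True
        except Exception:
            # An unreachable fence device is a failed attempt, not a dead node.
            continue
    return False


def recover_services(target: str, methods: Sequence[FenceMethod]) -> str:
    """Recovery may proceed only after some fence method confirms success."""
    if fence_node(target, methods):
        return "recover"
    # No confirmation at all: block rather than risk a split-brain.
    return "blocked"


# Hypothetical stand-ins for real fence agents.
def hypervisor_fence(target: str) -> bool:
    # Simulates the host (and so the hypervisor) being down.
    raise ConnectionError("hypervisor not responding")


def host_power_fence(target: str) -> bool:
    # Simulates a backup method that powers off the whole host.
    return True
```

With only `hypervisor_fence` configured, `recover_services` returns "blocked" when the host is down. Adding `host_power_fence` as a second method lets recovery go ahead.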
Yes, but I think I will live with ESX not crashing; plus, I have my hosts in a
cluster, with auto restart of VMs.
I think I am happy to presume the host will not fail, and I think I have to extend
that to VC as well. I do realise that the VC is much less reliable than ESXi.
But I have constraints I have to live with.
That's quite the assumption to make in an HA environment, but as you
said, you need to choose the failure scenarios that you will accept
taking the cluster out.
It is not the choice I would make, however.
My major thrust with the question was more along the lines of:
how are people handling fencing with virtualisation? Is everyone installing the
VMware SDK and creating users that can shut down just those hosts, or is there
another acceptable (albeit not perfect) fencing method?
For fencing to work, it must be able to power off the target regardless
of the state the target may be in. That means that fencing must exist
outside the target, and in VMs, that means via the hypervisor. In my
case, I use KVM VMs so I use fence_xvm or fence_virsh. I don't use
VMware, so I can't speak to that.
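As a rough illustration of what "via the hypervisor" looks like with fence_virsh, here is a small Python sketch that builds the command line a node would run against its peer's KVM host. It assumes the long option names (--ip, --username, --plug, --action) from the fence-agents package, so check them against `fence_virsh --help` on your system. The host, user and domain names are made up, and authentication options are left out.

```python
from typing import List


def fence_virsh_argv(host: str, user: str, domain: str,
                     action: str = "off") -> List[str]:
    """Build a fence_virsh command line (option names are assumptions)."""
    return [
        "fence_virsh",
        f"--ip={host}",        # the KVM host, not the VM being fenced
        f"--username={user}",  # account on the host allowed to run virsh
        f"--plug={domain}",    # libvirt domain name of the target VM
        f"--action={action}",  # e.g. "off", "reboot" or "status"
    ]


# Hypothetical names, for illustration only.
argv = fence_virsh_argv("kvm-host-1.example.com", "root", "cluster-node2")
print(" ".join(argv))
```

The point to notice is that the command is addressed to the host, so it works no matter what state the target VM is in.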
The other question was: in the calculation of quorum, can I include the
reachability of a dgw (default gateway)?
The only way to get quorum is either with a third node or with a quorum
disk. With a quorum disk, you can use heuristics. That said, I don't
know what level of support there is in pacemaker for qdisk (maybe
perfect, I really don't know).
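To make the arithmetic concrete, here is a minimal Python sketch of majority quorum, and of how a quorum disk vote gated on a heuristic (for example, "can I ping the default gateway?") changes the outcome for a two-node cluster. The function names and the single qdisk vote are assumptions for illustration. A real qdisk also makes sure only one partition holds the disk's vote, which this sketch does not model.

```python
def votes_needed(total_votes: int) -> int:
    """Strict majority: more than half of all expected votes."""
    return total_votes // 2 + 1


def has_quorum(votes_present: int, total_votes: int) -> bool:
    return votes_present >= votes_needed(total_votes)


def partition_votes(node_votes: int, heuristics_pass: bool,
                    qdisk_votes: int = 1) -> int:
    """A partition counts the qdisk vote only if its heuristics pass."""
    return node_votes + (qdisk_votes if heuristics_pass else 0)


# Two nodes, no quorum disk: a lone survivor can never be quorate.
print(has_quorum(1, 2))                          # False

# Three nodes: any two surviving nodes are quorate.
print(has_quorum(2, 3))                          # True

# Two nodes plus a one-vote quorum disk (3 expected votes in total):
# the survivor that still reaches the gateway gets the extra vote.
print(has_quorum(partition_votes(1, True), 3))   # True
print(has_quorum(partition_votes(1, False), 3))  # False
```

Even when a partition is quorate, it still has to fence the other node before recovering anything.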
I do see people try to do this more often, and I will continue to discourage
it... "An HA cluster is beautiful not when there is nothing left to add, but
when there is nothing left to take away." Every piece in the cluster is
another potential point of failure.
In a life many moons ago, I used to build MS clusters and Oracle (Linux) RAC
clusters, all on physical boxes.
But I think virt is here to stay, and for us it's right, so I am trying to
shoehorn in a cluster solution that works as well, without paying for VMware HA.
Some compromises we will have to live with. I think trusting in ESX and VC are
valid ones.
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org