On 17/07/14 02:39 PM, Alex Samad - Yieldbroker wrote:
-----Original Message-----
From: Digimer [mailto:li...@alteeve.ca]
Sent: Thursday, 17 July 2014 3:00 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] clusters on virtualised platforms

On 17/07/14 01:41 PM, Alex Samad - Yieldbroker wrote:


-----Original Message-----
From: Digimer [mailto:li...@alteeve.ca]
Sent: Thursday, 17 July 2014 2:02 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] clusters on virtualised platforms

Don't confuse quorum and fencing (stonith), they serve different
purposes.
Basically, quorum is useful when things are working, fencing is
required when things go wrong. So regardless of quorum disk, you
still need to be able to fence. This requires that each VM be able to
call the hypervisor and force a power off.

TA, yep got that, I was thinking and writing ..


Generally speaking, VM-based cluster nodes are good for learning, but
not production. It adds a layer that isn't needed and in HA, simple
should trump all else.

Yeah well, it's not really going to change for us, we are virtualised and I
can't really see that changing. In fact I would presume you would see more of it.

Thanks

Then ensure that each VM is on a different host, otherwise the host itself
becomes a single point of failure. Further, you will want to add a backup
fence method that can take out the host if it stops responding.
Otherwise, a failure in the host would leave the target node's fence method
(the hypervisor) inaccessible, and a failed fence method can only be handled
safely by effectively hanging the cluster. A lack of response must never be
treated as confirmation of node death, lest you end up with split-brains.
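
As a rough illustration of the rule above, here is a minimal Python sketch of
fence methods tried in order, where only an explicit confirmation counts as
node death. It is not how Pacemaker implements fencing; the names
(FenceResult, fence_node, hypervisor_unreachable, host_power_off) are
hypothetical and exist only for this example.

```python
from enum import Enum
from typing import Callable, Sequence


class FenceResult(Enum):
    CONFIRMED_OFF = "confirmed_off"  # the device reported the node is off
    FAILED = "failed"                # the device answered but could not fence
    NO_RESPONSE = "no_response"      # the device itself was unreachable


# A fence method takes a node name and reports what happened.
FenceMethod = Callable[[str], FenceResult]


def fence_node(node: str, methods: Sequence[FenceMethod]) -> bool:
    """Try each method in order; True only on a confirmed power-off."""
    for method in methods:
        if method(node) is FenceResult.CONFIRMED_OFF:
            return True
    # Nothing confirmed the node is dead. Silence is not proof, so the
    # caller must keep blocking recovery rather than assume death.
    return False


def safe_to_recover(node: str, methods: Sequence[FenceMethod]) -> bool:
    """Services may be recovered only after a confirmed fence."""
    return fence_node(node, methods)


# Hypothetical stand-ins for real fence devices.
def hypervisor_unreachable(node: str) -> FenceResult:
    # Primary method: the host running the VM has died, so no answer.
    return FenceResult.NO_RESPONSE


def host_power_off(node: str) -> FenceResult:
    # Backup method: cut power to the physical host itself.
    return FenceResult.CONFIRMED_OFF


if __name__ == "__main__":
    # With only the hypervisor method, the cluster has to keep waiting.
    print(safe_to_recover("node2", [hypervisor_unreachable]))
    # With a backup method that can take out the host, recovery can proceed.
    print(safe_to_recover("node2", [hypervisor_unreachable, host_power_off]))
```

The point of the sketch is that NO_RESPONSE and FAILED are handled the same
way: neither is treated as success, which is why a backup method matters.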

Yes, but I think I will live with ESX not crashing, plus I have my hosts in a
cluster, with auto restart of VMs.

I think I am happy to presume the host will not fail, and I think I have to
extend that to VC as well. I do realise that the VC is much less reliable than
ESXi. But I have constraints I have to live with.

That's quite the assumption to make in an HA environment, but as you said, you need to choose the failure scenarios that you will accept taking the cluster out.

It is not the choice I would make, however.

My major thrust with the question was more along the lines of:
How are people handling fencing with virtualisation? Is everyone installing the
VMware SDK and creating users that can shut down just those hosts, or is there
another acceptable (albeit not perfect) fencing method?

For fencing to work, it must be able to power off the target regardless of the state the target may be in. That means that fencing must exist outside the target, and in VMs, that means via the hypervisor. In my case, I use KVM VMs so I use fence_xvm or fence_virsh. I don't use VMware, so I can't speak to that.
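
To make "via the hypervisor" concrete, a hedged Python sketch of asking a KVM host to power off a guest with fence_virsh follows. The host name, credentials and guest name are placeholders. The flags (-a address, -l login, -p password, -n guest, -o action) are the common fence-agents options as I understand them, so check them against `fence_virsh --help` on your own system. In a real cluster, Pacemaker calls the agent itself through a configured stonith resource; this only shows the shape of the call.

```python
import subprocess
from typing import List


def fence_virsh_argv(host: str, user: str, password: str,
                     guest: str, action: str = "off") -> List[str]:
    """Build a fence_virsh command line (flags assumed, verify locally)."""
    return [
        "fence_virsh",
        "-a", host,      # KVM host that runs the guest
        "-l", user,      # login on that host
        "-p", password,  # visible in the process list; acceptable in a sketch only
        "-n", guest,     # libvirt domain name of the node to fence
        "-o", action,    # "off", "on", "reboot" or "status"
    ]


def power_off_guest(host: str, user: str, password: str, guest: str) -> bool:
    """Run the agent; only exit code 0 is treated as a successful fence."""
    result = subprocess.run(
        fence_virsh_argv(host, user, password, guest), check=False
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder values; nothing is executed here, the command is only printed.
    print(" ".join(fence_virsh_argv("kvmhost1", "root", "secret", "node2")))
```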

The other question was: in the calculation of quorum, can I include the
accessibility of a dgw?

The only way to get quorum is either with a third node or with a quorum disk. With a quorum disk, you can use heuristics. That said, I don't know what level of support there is in pacemaker for qdisk (maybe perfect, I really don't know).
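
A small Python sketch of the vote arithmetic may help here. It assumes one vote per node and one vote for the quorum disk, and that the disk's vote only counts for a partition whose heuristics pass (for example, a ping to a gateway succeeds). The function names are hypothetical; this models the idea, not any particular qdisk implementation.

```python
def has_quorum(votes_present: int, expected_votes: int) -> bool:
    """Quorum means holding a strict majority of the expected votes."""
    return votes_present > expected_votes // 2


def partition_votes(nodes_visible: int, holds_qdisk: bool,
                    heuristics_pass: bool, qdisk_votes: int = 1) -> int:
    """Votes held by one partition after a split."""
    votes = nodes_visible  # one vote per node this partition can see
    if holds_qdisk and heuristics_pass:
        # The disk's vote counts only while the heuristic succeeds.
        votes += qdisk_votes
    return votes


if __name__ == "__main__":
    # Two nodes and no tie-breaker: after a split each side holds 1 of 2
    # votes, so neither side has a majority.
    print(has_quorum(partition_votes(1, False, False), expected_votes=2))

    # Two nodes plus a quorum disk (3 expected votes): the side that holds
    # the disk and still passes its heuristic keeps quorum.
    print(has_quorum(partition_votes(1, True, True), expected_votes=3))

    # The side whose heuristic fails gets no disk vote and loses quorum.
    print(has_quorum(partition_votes(1, True, False), expected_votes=3))
```

Even for the side that keeps quorum, the other node still has to be fenced before its services are recovered.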

I do see people try to do this more often, and I will continue to discourage
it... "An HA cluster is beautiful not when there is nothing left to add, but
when there is nothing left to take away." Every piece in the cluster is
another potential point of failure.

In a life many moons ago, I used to build MS clusters and Oracle (Linux) RAC
clusters, all on physical boxes.

But I think virt is here to stay, and for us it's right, so I am trying to
shoehorn in a cluster solution that works as well, without paying for VMware HA.
Some compromises we will have to live with. I think trusting in ESX and VC are
valid ones.



--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
