On 07/26/2012 12:34 PM, Cal Heldenbrand wrote:
Hi everybody,

I've read through the Clusters from Scratch document, but it doesn't seem to help me very well with an N+1 (shared hot spare) style cluster setup.

My test case is this: I have 3 memcache servers. Two are in primary use (hashed 50/50 by the clients) and one is a hot spare.

It sounds like you want to do this:

1) run memcache on each node

I'd use a clone to run memcache, instead of having three memcache primitives as you had done. Something like this:

primitive memcache ...
clone memcache_clone memcache meta ordered=false

There are many parameters a clone can take, but this is a good start, assuming you just want to run memcache on each node and the instances can be started in any order. You don't need location constraints to say where memcache can run, or to keep multiple memcache instances from piling up on one node. The clone handles all of that.
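
For reference, a slightly fuller sketch might look like this -- the lsb:memcached agent and the operation timings are just guesses on my part, so substitute whatever resource agent and settings you're already using:

primitive memcache lsb:memcached op monitor interval=30s timeout=20s
clone memcache_clone memcache meta ordered=false

The monitor operation is worth having: it's what lets the cluster notice a dead memcache instance, which in turn is what makes the colocation constraints below follow a working memcache rather than just a node that happens to be online.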


2) have ip1 on a node with a working memcache

primitive ip1 ...
colocation ip1_on_memcache inf: ip1 memcache_clone
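
In case it helps, the IP primitives are usually just IPaddr2 resources, along these lines (the address and netmask here are placeholders; ip2 would look the same with its own address):

primitive ip1 ocf:heartbeat:IPaddr2 params ip=192.168.0.101 cidr_netmask=24 op monitor interval=10s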


3) have ip2 active on a different node with a working memcache

primitive ip2 ...
colocation ip2_on_memcache inf: ip2 memcache_clone
colocation ip2_not_on_ip1 -10000: ip2 ip1

I've chosen a score of -10000 for ip2_not_on_ip1 because I assume you could, if you had no other choice, run both IPs on one node. If you'd rather run only one IP when there's only one working memcache, make this -inf and set the priority meta attribute on the IP primitives to determine which one is sacrificed.
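
In that variant, the relevant pieces would look roughly like this (the priority values are arbitrary; the cluster stops the lower-priority resource when it can't place both):

colocation ip2_not_on_ip1 -inf: ip2 ip1
primitive ip1 ... meta priority=10
primitive ip2 ... meta priority=5

Here ip2, having the lower priority, is the one stopped if only one node with a working memcache remains.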

You could also use a clone for the IP addresses, but since there are only 2, simply having two primitives may be easier to understand. If you added a third active IP (and node), you'd need an anti-colocation constraint for every pair of IPs -- three in that case, n(n-1)/2 in general -- to keep them all running on different nodes. Your configuration would get very hairy, and you'd want to use a clone.


4) you have some preferences about which servers are active in a non-failure situation

location ip1_on_mem1 ip1 100: mem1
location ip2_on_mem2 ip2 100: mem2


5) (guessing you want this, most people do) if resources have migrated due to a failure, you'd prefer to leave them where they are, rather than move them again as soon as the failed node recovers. This way you can migrate them when the service interruption is convenient.

primitive ... meta resource-stickiness=500

or

rsc_defaults resource-stickiness=500

I prefer to set stickiness on the specific primitives I want to be sticky; in this case, the IP addresses seem appropriate. Setting a default stickiness is a common suggestion, but I always find it hard to know how sticky things will actually be: once colocation constraints, groups, and so on are involved, the stickiness values of the other resources combine in ways that are deterministic and well defined, but complex and difficult to predict.

Your stickiness score must be greater than your location score (from #4) to have any effect. For example, with the scores above, ip1 sitting on the spare node after a failover has 500 points of stickiness in favour of staying there versus a location preference of 100 for moving back to mem1, so it stays put; with a stickiness of 50 it would move back as soon as mem1 recovered.

crm_simulate is very handy for examining the scores used in placing resources. Start with "crm_simulate -LSs". You can also use the -u and -d options to simulate nodes coming online or offline. There are many more options -- definitely check it out. Documentation is scant (--help), but usage is fairly obvious after playing with it a bit.
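
For example, to preview what the cluster would do if mem1 dropped out, something like:

crm_simulate -LSs -d mem1

Comparing the scores in that output with a plain "crm_simulate -LSs" run makes it fairly clear how the location, colocation and stickiness scores interact.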

Also, some advanced techniques allow the stickiness score to be based on the time of day, so you can let resources move back to their preferred nodes automatically, but only at planned times. More information: http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-expression-iso8601.html

