On 5 Jun 2014, at 12:57 am, Patrick Hemmer <pacema...@feystorm.net> wrote:
> From: Andrew Beekhof <and...@beekhof.net>
> Sent: 2014-06-04 04:15:48 E
> To: The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
> Subject: Re: [Pacemaker] resources not rebalancing
>
>> On 4 Jun 2014, at 4:22 pm, Patrick Hemmer <pacema...@feystorm.net> wrote:
>>
>>> Testing some different scenarios, and after bringing a node back online,
>>> none of the resources move to it unless they are restarted. However
>>> default-resource-stickiness is set to 0, so they should be able to move
>>> around freely.
>>>
>>> # pcs status
>>> Cluster name: docker
>>> Last updated: Wed Jun 4 06:09:26 2014
>>> Last change: Wed Jun 4 06:08:40 2014 via cibadmin on i-093f1f55
>>> Stack: corosync
>>> Current DC: i-083f1f54 (3) - partition with quorum
>>> Version: 1.1.11-1.fc20-9d39a6b
>>> 3 Nodes configured
>>> 8 Resources configured
>>>
>>> Online: [ i-053f1f59 i-083f1f54 i-093f1f55 ]
>>>
>>> Full list of resources:
>>>
>>>  dummy2 (ocf::pacemaker:Dummy): Started i-083f1f54
>>>  Clone Set: dummy1-clone [dummy1] (unique)
>>>      dummy1:0 (ocf::pacemaker:Dummy): Started i-083f1f54
>>>      dummy1:1 (ocf::pacemaker:Dummy): Started i-093f1f55
>>>      dummy1:2 (ocf::pacemaker:Dummy): Started i-093f1f55
>>>      dummy1:3 (ocf::pacemaker:Dummy): Started i-083f1f54
>>>      dummy1:4 (ocf::pacemaker:Dummy): Started i-093f1f55
>>>
>>> # pcs resource show --all
>>>  Resource: dummy2 (class=ocf provider=pacemaker type=Dummy)
>>>  Clone: dummy1-clone
>>>   Meta Attrs: clone-max=5 clone-node-max=5 globally-unique=true
>>>   Resource: dummy1 (class=ocf provider=pacemaker type=Dummy)
>>>
>>> # pcs property show --all | grep default-resource-stickiness
>>>  default-resource-stickiness: 0
>>>
>>> Notice how i-053f1f59 isn't running anything. I feel like I'm missing
>>> something obvious, but it escapes me.
>>>
>> clones are ever so slightly sticky by default, try setting
>> resource-stickiness=0 for the clone resource
>> (and unset it once everything has moved back)
>>
>
> Thanks, that did indeed fix it. But how come dummy2 didn't move? It's not a
> clone, yet it didn't move either.

Do you have a location constraint that says it should prefer i-053f1f59?
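For the earlier stickiness suggestion, in pcs terms that would be something
like the following (a sketch from memory, not tested here, so verify against
your pcs version):

# pcs resource meta dummy1-clone resource-stickiness=0

and once everything has moved back, setting an empty value should remove the
meta attribute again:

# pcs resource meta dummy1-clone resource-stickiness=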
> And now a separate follow-up question: the resources didn't balance as they
> should. I've got several utilization attributes set, and the resources
> aren't balanced according to the placement-strategy.
>
> # pcs property show placement-strategy
> Cluster Properties:
>  placement-strategy: balanced
>
> # crm_simulate -URL
>
> Current cluster status:
> Online: [ i-053f1f59 i-083f1f54 i-093f1f55 ]
>
>  dummy2 (ocf::pacemaker:Dummy): Started i-053f1f59
>  Clone Set: dummy1-clone [dummy1] (unique)
>      dummy1:0 (ocf::pacemaker:Dummy): Started i-053f1f59
>      dummy1:1 (ocf::pacemaker:Dummy): Started i-093f1f55
>      dummy1:2 (ocf::pacemaker:Dummy): Started i-083f1f54
>      dummy1:3 (ocf::pacemaker:Dummy): Started i-083f1f54
>      dummy1:4 (ocf::pacemaker:Dummy): Started i-093f1f55
>
> Utilization information:
> Original: i-053f1f59 capacity: cpu=5000000 mem=3840332000
> Original: i-083f1f54 capacity: cpu=5000000 mem=3840332000
> Original: i-093f1f55 capacity: cpu=5000000 mem=3840332000
> calculate_utilization: dummy2 utilization on i-053f1f59: cpu=10000
> calculate_utilization: dummy1:2 utilization on i-083f1f54: cpu=1000
> calculate_utilization: dummy1:1 utilization on i-093f1f55: cpu=1000
> calculate_utilization: dummy1:0 utilization on i-053f1f59: cpu=1000
> calculate_utilization: dummy1:3 utilization on i-083f1f54: cpu=1000
> calculate_utilization: dummy1:4 utilization on i-093f1f55: cpu=1000
> Remaining: i-053f1f59 capacity: cpu=4989000 mem=3840332000
> Remaining: i-083f1f54 capacity: cpu=4998000 mem=3840332000
> Remaining: i-093f1f55 capacity: cpu=4998000 mem=3840332000
>
> The "balanced" strategy is defined as: "the node that has more free capacity
> gets consumed first".
> Notice that dummy2 consumes cpu=10000, while dummy1 is only 1000 (10x less).
> After dummy2 was placed on i-053f1f59, that should have consumed enough
> "cpu" resource to keep dummy1 off it and on the other 2 nodes, but dummy1:0
> got placed on the node.

But i-053f1f59 still has orders of magnitude more cpu capacity left to run
things.

> Also how difficult is it to add a strategy?

It might be challenging, the policy engine is deep voodoo :)

Can you create an entry at bugs.clusterlabs.org and include the result of
'cibadmin -Q' when the cluster is in the state you describe above?
It won't make it into 1.1.12 but we can look at it for .13

> I'd be interested in a strategy which places a resource on the node with the
> least amount of capacity used, kind of the inverse of "balanced". The docs
> say "balanced" looks at how much capacity is free. The two strategies would
> be equivalent if all nodes had the same capacity, but if one node has 10x
> the capacity of the other nodes, I want the resources distributed evenly
> (based on the capacity each uses) rather than over-utilizing that one node.
>
> Thanks
>
> -Patrick
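PS. When you file the bug, the utilization attributes in the 'cibadmin -Q'
output should look roughly like this. This is a hand-written sketch using the
numbers from your crm_simulate output (the ids are invented), not something
pulled from a real config:

  <node id="1" uname="i-053f1f59">
    <utilization id="i-053f1f59-utilization">
      <!-- capacity this node offers to resources placed on it -->
      <nvpair id="i-053f1f59-utilization-cpu" name="cpu" value="5000000"/>
      <nvpair id="i-053f1f59-utilization-mem" name="mem" value="3840332000"/>
    </utilization>
  </node>

  <primitive id="dummy2" class="ocf" provider="pacemaker" type="Dummy">
    <utilization id="dummy2-utilization">
      <!-- what this resource subtracts from a node's capacity -->
      <nvpair id="dummy2-utilization-cpu" name="cpu" value="10000"/>
    </utilization>
  </primitive>

Those are the sections the balanced strategy compares when deciding where
dummy1:0 and friends end up, so having them in the report will help.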
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org