> On 17 Apr 2015, at 4:19 pm, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
>
> 17.04.2015 00:48, Andrew Beekhof wrote:
>>
>>> On 22 Jan 2015, at 12:04 am, Vladislav Bogdanov <bub...@hoster-ok.com>
>>> wrote:
>>>
>>> 20.01.2015 02:44, Andrew Beekhof wrote:
>>>>
>>>>> On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov <bub...@hoster-ok.com>
>>>>> wrote:
>>>>>
>>>>> 16.01.2015 07:44, Andrew Beekhof wrote:
>>>>>>
>>>>>>> On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov <bub...@hoster-ok.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> 13.01.2015 11:32, Andrei Borzenkov wrote:
>>>>>>>> On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov
>>>>>>>> <bub...@hoster-ok.com> wrote:
>>>>>>>>> Hi Andrew, David, all.
>>>>>>>>>
>>>>>>>>> I found a little bit strange operation ordering during transition
>>>>>>>>> execution.
>>>>>>>>>
>>>>>>>>> Could you please look at the following partial configuration (crmsh
>>>>>>>>> syntax)?
>>>>>>>>>
>>>>>>>>> ===
>>>>>>>>> ...
>>>>>>>>> clone cl-broker broker \
>>>>>>>>>     meta interleave=true target-role=Started
>>>>>>>>> clone cl-broker-vips broker-vips \
>>>>>>>>>     meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started
>>>>>>>>> clone cl-ctdb ctdb \
>>>>>>>>>     meta interleave=true target-role=Started
>>>>>>>>> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
>>>>>>>>> colocation broker-with-ctdb inf: cl-broker cl-ctdb
>>>>>>>>> order broker-after-ctdb inf: cl-ctdb cl-broker
>>>>>>>>> order broker-vips-after-broker 0: cl-broker cl-broker-vips
>>>>>>>>> ...
>>>>>>>>> ===
>>>>>>>>>
>>>>>>>>> After I put one node to standby and then back to online, I see the
>>>>>>>>> following transition (relevant excerpt):
>>>>>>>>>
>>>>>>>>> ===
>>>>>>>>> * Pseudo action: cl-broker-vips_stop_0
>>>>>>>>> * Resource action: broker-vips:1 stop on c-pa-0
>>>>>>>>> * Pseudo action: cl-broker-vips_stopped_0
>>>>>>>>> * Pseudo action: cl-ctdb_start_0
>>>>>>>>> * Resource action: ctdb start on c-pa-1
>>>>>>>>> * Pseudo action: cl-ctdb_running_0
>>>>>>>>> * Pseudo action: cl-broker_start_0
>>>>>>>>> * Resource action: ctdb monitor=10000 on c-pa-1
>>>>>>>>> * Resource action: broker start on c-pa-1
>>>>>>>>> * Pseudo action: cl-broker_running_0
>>>>>>>>> * Pseudo action: cl-broker-vips_start_0
>>>>>>>>> * Resource action: broker monitor=10000 on c-pa-1
>>>>>>>>> * Resource action: broker-vips:1 start on c-pa-1
>>>>>>>>> * Pseudo action: cl-broker-vips_running_0
>>>>>>>>> * Resource action: broker-vips:1 monitor=30000 on c-pa-1
>>>>>>>>> ===
>>>>>>>>>
>>>>>>>>> What could be a reason to stop unique clone instance so early for
>>>>>>>>> move?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Do not take it as definitive answer, but cl-broker-vips cannot run
>>>>>>>> unless both other resources are started. So if you compute closure of
>>>>>>>> all required transitions it looks rather logical. Having
>>>>>>>> cl-broker-vips started while broker is still stopped would violate
>>>>>>>> constraint.
>>>>>>>
>>>>>>> Problem is that broker-vips:1 is stopped on one (source) node
>>>>>>> unnecessarily early.
>>>>>>
>>>>>> It looks to be moving from c-pa-0 to c-pa-1
>>>>>> It might be unnecessarily early, but it is what you asked for... we have
>>>>>> to unwind the resource stack before we can build it up.
>>>>>
>>>>> Yes, I understand that it is valid, but could its stop be delayed until
>>>>> cluster is in the state when all dependencies are satisfied to start it
>>>>> on another node (like migration?)?
>>>>
>>>> No, because "we have to unwind the resource stack before we can build it
>>>> up."
>>>> Doing anything else would be one of those things that is trivial for a
>>>> human to identify but rather complex for a computer.
>>>
>>> I believe there is also an issue with migration of clone instances.
>>>
>>> I modified pe-input to allow migration of cl-broker-vips (and also set inf
>>> score for broker-vips-after-broker
>>> and make cl-broker-vips interleaved).
>>> Relevant part is:
>>> clone cl-broker broker \
>>>     meta interleave=true target-role=Started
>>> clone cl-broker-vips broker-vips \
>>>     meta clone-node-max=2 globally-unique=true interleave=true allow-migrate=true resource-stickiness=0 target-role=Started
>>> clone cl-ctdb ctdb \
>>>     meta interleave=true target-role=Started
>>> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
>>> colocation broker-with-ctdb inf: cl-broker cl-ctdb
>>> order broker-after-ctdb inf: cl-ctdb cl-broker
>>> order broker-vips-after-broker inf: cl-broker cl-broker-vips
>>>
>>> After that (part of) transition is:
>>>
>>> * Resource action: broker-vips:1 migrate_to on c-pa-0
>>> * Pseudo action: cl-broker-vips_stop_0
>>> * Resource action: broker-vips:1 migrate_from on c-pa-1
>>> * Resource action: broker-vips:1 stop on c-pa-0
>>> * Pseudo action: cl-broker-vips_stopped_0
>>> * Pseudo action: all_stopped
>>> * Pseudo action: cl-ctdb_start_0
>>> * Resource action: ctdb start on c-pa-1
>>> * Pseudo action: cl-ctdb_running_0
>>> * Pseudo action: cl-broker_start_0
>>> * Resource action: ctdb monitor=10000 on c-pa-1
>>> * Resource action: broker start on c-pa-1
>>> * Pseudo action: cl-broker_running_0
>>> * Pseudo action: cl-broker-vips_start_0
>>> * Resource action: broker monitor=10000 on c-pa-1
>>> * Pseudo action: broker-vips:1_start_0
>>> * Pseudo action: cl-broker-vips_running_0
>>> * Resource action: broker-vips:1 monitor=30000 on c-pa-1
>>>
>>> But, I would say that at least from a human logic PoV the above
>>> breaks ordering rule broker-vips-after-broker
>>> (cl-broker-vips finished migrating and thus runs on c-pa-1 before cl-broker
>>> started there).
>>> Technically broker-vips:1_start_0 goes at the right position, but actually
>>> resource is "started"
>>> in migrate_to/migrate_from.
>>>
>>>
>>> I also went further and injected a pair of non-clone IPaddr2 resources into
>>> the same pe-input, and also enabled migration
>>> for them (returning interleave for cl-broker-vips to false and setting
>>> ordering score for broker-vips-after-broker back to 0,
>>> so all three order constraints are adjacent):
>>>
>>> clone cl-broker broker \
>>>     meta interleave=true target-role=Started
>>> clone cl-broker-vips broker-vips \
>>>     meta clone-node-max=2 globally-unique=true interleave=false allow-migrate=true resource-stickiness=0 target-role=Started
>>> clone cl-ctdb ctdb \
>>>     meta interleave=true target-role=Started
>>> primitive broker-vip1 IPaddr2 \
>>>     params ip=192.168.122.70 cidr_netmask=24 nic=eth0 \
>>>     op start interval=0 timeout=20 \
>>>     op stop interval=0 timeout=20 \
>>>     op monitor interval=30
>>> primitive broker-vip2 IPaddr2 \
>>>     params ip=192.168.122.71 cidr_netmask=24 nic=eth0 \
>>>     op start interval=0 timeout=20 \
>>>     op stop interval=0 timeout=20 \
>>>     op monitor interval=30
>>> colocation broker-with-ctdb inf: cl-broker cl-ctdb
>>> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
>>> colocation broker-vip1-with-broker inf: broker-vip1 cl-broker
>>> colocation broker-vip2-with-broker inf: broker-vip2 cl-broker
>>> colocation broker-vip2-not-with-vip1 -100: broker-vip2 broker-vip1
>>> order broker-after-ctdb inf: cl-ctdb cl-broker
>>> order broker-vips-after-broker 0: cl-broker cl-broker-vips
>>> order broker-vip1-after-broker 0: cl-broker broker-vip1
>>> order broker-vip2-after-broker 0: cl-broker broker-vip2
>>>
>>> For broker-vip2 I see completely different output (compare with
>>> broker-vips:1):
>>>
>>> * Resource action: broker-vips:1 migrate_to on c-pa-0
>>
>> I just noticed this, since when does IPaddr2 migrate?
>
> I just injected allow-migrate for broker-vip1, broker-vip2 and broker-vips
> into the pe_input to test what would pengine do

The force is strong with this one…

> but forgot to note that (actually cl-broker-vips definition above has it
> enabled but broker-vip{1,2} misses that, damn, my fault, it should be there
> too). I need to be more accurate.
> For g-u clone it doesn't solve the issue btw. But for ordinary resource it
> does. That makes me think that migration paths differ for g-u clone instances.

Highly likely.

> Actually, implementing (pseudo-)migration in IPaddr2 doesn't seem to be very
> complex task.
>
>>
>> Reason I noticed is because broker-vips definitely doesn’t start until the
>> end anymore:
>>
>> * Resource action: broker start on c-pa-1
>> * Pseudo action: cl-broker_running_0
>> * Pseudo action: cl-broker-vips_start_0
>> * Resource action: broker monitor=10000 on c-pa-1
>> * Resource action: broker-vips:1 start on c-pa-1
>
> Actually it is migrated at the very beginning of the transition,

Not with rc2 is what I’m saying:

[06:17 AM] beekhof@fedora ~/Development/sources/pacemaker/1.1 ☺ # tools/crm_simulate -Sx ~/Downloads/pe-input-418.bz2 | grep broker-vips
 Clone Set: cl-broker-vips [broker-vips] (unique)
     broker-vips:0 (ocf::heartbeat:IPaddr2): Started c-pa-0
     broker-vips:1 (ocf::heartbeat:IPaddr2): Started c-pa-0
 * Move broker-vips:1 (Started c-pa-0 -> c-pa-1)
 * Pseudo action: cl-broker-vips_stop_0
 * Resource action: broker-vips:1 stop on c-pa-0
 * Pseudo action: cl-broker-vips_stopped_0
 * Pseudo action: cl-broker-vips_start_0
 * Resource action: broker-vips:1 start on c-pa-1
 * Pseudo action: cl-broker-vips_running_0
 * Resource action: broker-vips:1 monitor=30000 on c-pa-1
 Clone Set: cl-broker-vips [broker-vips] (unique)
     broker-vips:0 (ocf::heartbeat:IPaddr2): Started c-pa-0
     broker-vips:1 (ocf::heartbeat:IPaddr2): Started c-pa-1

> and that seems to be a big issue to me, because it breaks ordering (start
> became a pseudo-action, but actual work should be done in migrate_from which
> is run before broker start).
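
For illustration only, here is a minimal Python sketch of the ordering check being argued about in the quoted paragraph above. It is not part of Pacemaker: parse_actions and becomes_active_before are hypothetical helper names, and the sketch treats the order in which crm_simulate lists actions as a stand-in for execution order, which is a simplifying assumption (the real transition graph is only partially ordered). It also assumes that migrate_from is the point at which a migrated resource becomes active on the target node.

```python
import re

# Matches lines such as "* Resource action: broker-vips:1 migrate_from on c-pa-1".
# Pseudo actions are deliberately ignored; they do no work on a node.
ACTION_RE = re.compile(
    r"\*\s+Resource action:\s+(?P<rsc>\S+)\s+(?P<op>\S+)\s+on\s+(?P<node>\S+)"
)


def parse_actions(text):
    """Return (resource, operation, node) tuples in the order they are listed."""
    return [
        (m.group("rsc"), m.group("op"), m.group("node"))
        for m in ACTION_RE.finditer(text)
    ]


def becomes_active_before(actions, dependent, prerequisite, node):
    """True if `dependent` becomes active on `node` (via start or migrate_from)
    earlier in the listing than `prerequisite` is started on that node."""

    def first_index(rsc, ops):
        for i, (r, op, n) in enumerate(actions):
            if r == rsc and op in ops and n == node:
                return i
        return None

    dep = first_index(dependent, {"start", "migrate_from"})
    pre = first_index(prerequisite, {"start"})
    return dep is not None and pre is not None and dep < pre


# Excerpt of the transition quoted earlier in the thread (allow-migrate enabled).
TRANSITION = """\
 * Resource action: broker-vips:1 migrate_to on c-pa-0
 * Pseudo action: cl-broker-vips_stop_0
 * Resource action: broker-vips:1 migrate_from on c-pa-1
 * Resource action: broker-vips:1 stop on c-pa-0
 * Resource action: ctdb start on c-pa-1
 * Resource action: broker start on c-pa-1
"""

actions = parse_actions(TRANSITION)
print(becomes_active_before(actions, "broker-vips:1", "broker", "c-pa-1"))  # prints True
```

On the excerpt with allow-migrate enabled this flags broker-vips:1 as active on c-pa-1 before broker is started there; fed the rc2 listing, where broker-vips:1 has a plain start after broker start, it would not.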
>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org