> On 22 Jan 2015, at 12:04 am, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
>
> 20.01.2015 02:44, Andrew Beekhof wrote:
>>
>>> On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
>>>
>>> 16.01.2015 07:44, Andrew Beekhof wrote:
>>>>
>>>>> On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
>>>>>
>>>>> 13.01.2015 11:32, Andrei Borzenkov wrote:
>>>>>> On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
>>>>>>> Hi Andrew, David, all.
>>>>>>>
>>>>>>> I found a little bit strange operation ordering during transition
>>>>>>> execution.
>>>>>>>
>>>>>>> Could you please look at the following partial configuration (crmsh
>>>>>>> syntax)?
>>>>>>>
>>>>>>> ===
>>>>>>> ...
>>>>>>> clone cl-broker broker \
>>>>>>>     meta interleave=true target-role=Started
>>>>>>> clone cl-broker-vips broker-vips \
>>>>>>>     meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started
>>>>>>> clone cl-ctdb ctdb \
>>>>>>>     meta interleave=true target-role=Started
>>>>>>> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
>>>>>>> colocation broker-with-ctdb inf: cl-broker cl-ctdb
>>>>>>> order broker-after-ctdb inf: cl-ctdb cl-broker
>>>>>>> order broker-vips-after-broker 0: cl-broker cl-broker-vips
>>>>>>> ...
>>>>>>> ===
>>>>>>>
>>>>>>> After I put one node to standby and then back to online, I see the
>>>>>>> following transition (relevant excerpt):
>>>>>>>
>>>>>>> ===
>>>>>>> * Pseudo action: cl-broker-vips_stop_0
>>>>>>> * Resource action: broker-vips:1 stop on c-pa-0
>>>>>>> * Pseudo action: cl-broker-vips_stopped_0
>>>>>>> * Pseudo action: cl-ctdb_start_0
>>>>>>> * Resource action: ctdb start on c-pa-1
>>>>>>> * Pseudo action: cl-ctdb_running_0
>>>>>>> * Pseudo action: cl-broker_start_0
>>>>>>> * Resource action: ctdb monitor=10000 on c-pa-1
>>>>>>> * Resource action: broker start on c-pa-1
>>>>>>> * Pseudo action: cl-broker_running_0
>>>>>>> * Pseudo action: cl-broker-vips_start_0
>>>>>>> * Resource action: broker monitor=10000 on c-pa-1
>>>>>>> * Resource action: broker-vips:1 start on c-pa-1
>>>>>>> * Pseudo action: cl-broker-vips_running_0
>>>>>>> * Resource action: broker-vips:1 monitor=30000 on c-pa-1
>>>>>>> ===
>>>>>>>
>>>>>>> What could be a reason to stop unique clone instance so early for move?
>>>>>>>
>>>>>>
>>>>>> Do not take it as definitive answer, but cl-broker-vips cannot run
>>>>>> unless both other resources are started. So if you compute closure of
>>>>>> all required transitions it looks rather logical. Having
>>>>>> cl-broker-vips started while broker is still stopped would violate
>>>>>> constraint.
>>>>>
>>>>> Problem is that broker-vips:1 is stopped on one (source) node
>>>>> unnecessarily early.
>>>>
>>>> It looks to be moving from c-pa-0 to c-pa-1
>>>> It might be unnecessarily early, but it is what you asked for... we have
>>>> to unwind the resource stack before we can build it up.
>>>
>>> Yes, I understand that it is valid, but could its stop be delayed until
>>> cluster is in the state when all dependencies are satisfied to start it on
>>> another node (like migration?)?
>>
>> No, because "we have to unwind the resource stack before we can build it up."
>> Doing anything else would be one of those things that is trivial for a human
>> to identify but rather complex for a computer.
>
> I believe there is also an issue with migration of clone instances.
>
> I modified pe-input to allow migration of cl-broker-vips (and also set inf
> score for broker-vips-after-broker and made cl-broker-vips interleaved).
> Relevant part is:
>
> clone cl-broker broker \
>     meta interleave=true target-role=Started
> clone cl-broker-vips broker-vips \
>     meta clone-node-max=2 globally-unique=true interleave=true allow-migrate=true resource-stickiness=0 target-role=Started
> clone cl-ctdb ctdb \
>     meta interleave=true target-role=Started
> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
> colocation broker-with-ctdb inf: cl-broker cl-ctdb
> order broker-after-ctdb inf: cl-ctdb cl-broker
> order broker-vips-after-broker inf: cl-broker cl-broker-vips
>
> After that (part of) transition is:
>
> * Resource action: broker-vips:1 migrate_to on c-pa-0
> * Pseudo action: cl-broker-vips_stop_0
> * Resource action: broker-vips:1 migrate_from on c-pa-1
> * Resource action: broker-vips:1 stop on c-pa-0
> * Pseudo action: cl-broker-vips_stopped_0
> * Pseudo action: all_stopped
> * Pseudo action: cl-ctdb_start_0
> * Resource action: ctdb start on c-pa-1
> * Pseudo action: cl-ctdb_running_0
> * Pseudo action: cl-broker_start_0
> * Resource action: ctdb monitor=10000 on c-pa-1
> * Resource action: broker start on c-pa-1
> * Pseudo action: cl-broker_running_0
> * Pseudo action: cl-broker-vips_start_0
> * Resource action: broker monitor=10000 on c-pa-1
> * Pseudo action: broker-vips:1_start_0
> * Pseudo action: cl-broker-vips_running_0
> * Resource action: broker-vips:1 monitor=30000 on c-pa-1
Have you got the PE file for this?
I feel like we fixed something like this recently but I’d like to check it with your input.

>
> But, I would say that at least from a human logic PoV the above breaks
> ordering rule broker-vips-after-broker (cl-broker-vips finished migrating
> and thus runs on c-pa-1 before cl-broker started there).
> Technically broker-vips:1_start_0 goes at the right position, but actually
> the resource is "started" in migrate_to/migrate_from.
>
>
> I also went further and injected a pair of non-clone IPaddr2 resources into
> the same pe-input, and also enabled migration for them (returning interleave
> for cl-broker-vips to false and setting ordering score for
> broker-vips-after-broker back to 0, so all three order constraints are
> adjacent):
>
> clone cl-broker broker \
>     meta interleave=true target-role=Started
> clone cl-broker-vips broker-vips \
>     meta clone-node-max=2 globally-unique=true interleave=false allow-migrate=true resource-stickiness=0 target-role=Started
> clone cl-ctdb ctdb \
>     meta interleave=true target-role=Started
> primitive broker-vip1 IPaddr2 \
>     params ip=192.168.122.70 cidr_netmask=24 nic=eth0 \
>     op start interval=0 timeout=20 \
>     op stop interval=0 timeout=20 \
>     op monitor interval=30
> primitive broker-vip2 IPaddr2 \
>     params ip=192.168.122.71 cidr_netmask=24 nic=eth0 \
>     op start interval=0 timeout=20 \
>     op stop interval=0 timeout=20 \
>     op monitor interval=30
> colocation broker-with-ctdb inf: cl-broker cl-ctdb
> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
> colocation broker-vip1-with-broker inf: broker-vip1 cl-broker
> colocation broker-vip2-with-broker inf: broker-vip2 cl-broker
> colocation broker-vip2-not-with-vip1 -100: broker-vip2 broker-vip1
> order broker-after-ctdb inf: cl-ctdb cl-broker
> order broker-vips-after-broker 0: cl-broker cl-broker-vips
> order broker-vip1-after-broker 0: cl-broker broker-vip1
> order broker-vip2-after-broker 0: cl-broker broker-vip2
>
> For broker-vip2 I see completely different output (compare with
> broker-vips:1):
>
> * Resource action: broker-vips:1 migrate_to on c-pa-0
> * Pseudo action: cl-broker-vips_stop_0
> * Resource action: broker-vips:1 migrate_from on c-pa-1
> * Resource action: broker-vips:1 stop on c-pa-0
> * Pseudo action: cl-broker-vips_stopped_0
> * Pseudo action: cl-ctdb_start_0
> * Resource action: ctdb start on c-pa-1
> * Pseudo action: cl-ctdb_running_0
> * Pseudo action: cl-broker_start_0
> * Resource action: ctdb monitor=10000 on c-pa-1
> * Resource action: broker start on c-pa-1
> * Pseudo action: cl-broker_running_0
> * Resource action: broker-vip2 migrate_to on c-pa-0
> * Pseudo action: cl-broker-vips_start_0
> * Resource action: broker monitor=10000 on c-pa-1
> * Resource action: broker-vip2 migrate_from on c-pa-1
> * Resource action: broker-vip2 stop on c-pa-0
> * Pseudo action: broker-vips:1_start_0
> * Pseudo action: cl-broker-vips_running_0
> * Pseudo action: all_stopped
> * Pseudo action: broker-vip2_start_0
> * Resource action: broker-vips:1 monitor=30000 on c-pa-1
> * Resource action: broker-vip2 monitor=30000 on c-pa-1
>
> broker-vip2 is migrated much later than broker-vips:1, exactly at the point
> where I would expect to see it.
>
> For me that means that some logic already exists which would allow
> postponing a resource move until everything is ready for it at the
> destination.
>
> I also tried to disable migration for broker-vip2, and in that case it was
> also stopped too early.
>
> So, there are four cases, and for one of them I get the expected result:
> *) g-u clone, migration disabled - early stop
> *) g-u clone, migration enabled - early stop
> *) ordinary resource, migration disabled - early stop
> *) ordinary resource, migration enabled - stop at the expected point
>
> The question is:
>
> Is it strictly impossible to make non-migratable resources behave the same
> way as that migratable broker-vip2?
>
> (I'm pretty sure I didn't make a mess in details anywhere but I want to
> recheck that all once again)
>
> Best,
> Vladislav
>
>>
>> Better to look at why broker-vips:1 needed to be moved.
>>
>>>
>>> Like:
>>> ===
>>> * Pseudo action: cl-ctdb_start_0
>>> * Resource action: ctdb start on c-pa-1
>>> * Pseudo action: cl-ctdb_running_0
>>> * Pseudo action: cl-broker_start_0
>>> * Resource action: ctdb monitor=10000 on c-pa-1
>>> * Resource action: broker start on c-pa-1
>>> * Pseudo action: cl-broker_running_0
>>> * Pseudo action: cl-broker-vips_start_0
>>> * Resource action: broker monitor=10000 on c-pa-1
>>> * Pseudo action: cl-broker-vips_stop_0
>>> * Resource action: broker-vips:1 stop on c-pa-0
>>> * Pseudo action: cl-broker-vips_stopped_0
>>> * Resource action: broker-vips:1 start on c-pa-1
>>> * Pseudo action: cl-broker-vips_running_0
>>> * Resource action: broker-vips:1 monitor=30000 on c-pa-1
>>> ===
>>> That would be the great optimization toward five nines...
>>>
>>> Best,
>>> Vladislav
>>>
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
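
For anyone following the thread: the ordering concern raised above (broker-vips:1 completing migrate_from on c-pa-1 before broker is started there) can be checked mechanically. The following is only a rough sketch, not anything Pacemaker itself does. It assumes the listed order of the transition summary can be read as execution order (a simplification, since the real transition is a graph), and it assumes a resource counts as active on a node from its "start" or "migrate_from" action. The function names are hypothetical; the action tuples are the resource actions from the clone migration transition quoted in the thread.

```python
# Resource actions from the clone-migration transition quoted in the thread,
# in listed order. Each entry is (resource, action, node). Pseudo actions and
# monitor operations are left out.
TRANSITION = [
    ("broker-vips:1", "migrate_to", "c-pa-0"),
    ("broker-vips:1", "migrate_from", "c-pa-1"),
    ("broker-vips:1", "stop", "c-pa-0"),
    ("ctdb", "start", "c-pa-1"),
    ("broker", "start", "c-pa-1"),
]

# Assumption for this sketch: a resource becomes active on a node at either
# "start" or "migrate_from".
ACTIVATING = {"start", "migrate_from"}


def first_active_index(transition, resource, node):
    """Return the position where `resource` first becomes active on `node`, or None."""
    for index, (res, action, where) in enumerate(transition):
        if res == resource and where == node and action in ACTIVATING:
            return index
    return None


def violates_order(transition, first, then, node):
    """Return True if `then` becomes active on `node` before `first` does."""
    first_at = first_active_index(transition, first, node)
    then_at = first_active_index(transition, then, node)
    if then_at is None:
        # `then` never becomes active on this node, so nothing to violate.
        return False
    return first_at is None or then_at < first_at


if __name__ == "__main__":
    print(violates_order(TRANSITION, "broker", "broker-vips:1", "c-pa-1"))
```

Feeding it the primitive case instead (broker start on c-pa-1 listed before broker-vip2 migrate_from on c-pa-1) reports no violation, which matches the difference between the two transitions described in the thread.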