I turned up the logging level in the pengine while it processes the rsc_order section. The attached log shows a loop being formed between the world2 and world1 resources, but only for stopping, not for starting.
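For context, here is a minimal sketch of the two-resource constraint being tested. It assumes the set-based syntax quoted further down, with the score placed on rsc_order as suggested earlier in the thread. The ids are the ones from the quoted example; the placement of the score attribute is an assumption and depends on the patched schema mentioned below.

```xml
<!-- Sketch only: compact (resource_set) ordering with an explicit score.
     Assumes a schema that accepts score on rsc_order when sets are used. -->
<rsc_order id="order-1" score="INFINITY">
  <resource_set id="order-1-set-1" sequential="true">
    <resource_ref id="world1" />
    <resource_ref id="world2" />
  </resource_set>
</rsc_order>
```

Without the score attribute the constraint defaults to a score of 0, which means optional, per Andrew's reply quoted below.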
-Frank

> -----Original Message-----
> From: Frank DiMeo [mailto:frank.di...@bigbandnet.com]
> Sent: Wednesday, December 02, 2009 2:59 PM
> To: pacemaker@oss.clusterlabs.org
> Subject: Re: [Pacemaker] bug in ordering syntax?
>
> Here's a two resource version of the same issue. It's easy to see the
> loop here.
>
> -Frank
>
> > -----Original Message-----
> > From: Frank DiMeo [mailto:frank.di...@bigbandnet.com]
> > Sent: Wednesday, December 02, 2009 2:13 PM
> > To: pacemaker@oss.clusterlabs.org
> > Subject: Re: [Pacemaker] bug in ordering syntax?
> >
> > Here's the output of ptest for the pe-input-***.bz2 file that's
> > created when I put ubuntu_2 into standby and the cluster tries to
> > move my 4 resources from ubuntu_2 to ubuntu_1 (while running the
> > compact ordering syntax with a score of INFINITY).
> >
> > I've converted it to a .png for your viewing pleasure.
> >
> > -Frank
> >
> > > -----Original Message-----
> > > From: Andrew Beekhof [mailto:and...@beekhof.net]
> > > Sent: Wednesday, December 02, 2009 6:00 AM
> > > To: pacemaker@oss.clusterlabs.org
> > > Subject: Re: [Pacemaker] bug in ordering syntax?
> > >
> > > On Mon, Nov 30, 2009 at 9:19 PM, Frank DiMeo
> > > <frank.di...@bigbandnet.com> wrote:
> > > > I'm experimenting with startup sequence and co-location control,
> > > > and think I may have stumbled across a bug.
> > > >
> > > > I have two xml files that I use in my testing as my initial
> > > > configuration of a two node cluster. I start each node with no
> > > > configuration, and then use cibadmin to "source in" the xml file.
> > > > Each file defines two resources as well as a startup order and
> > > > collocation definition. The only difference between the two files
> > > > is the syntax I use to specify the startup order.
> > > >
> > > > When I use the syntax:
> > > >
> > > > <rsc_order id="order-1" first="world1" then="world2" score="INFINITY" />
> > > >
> > > > Everything works fine. I can put either of the two nodes into
> > > > standby while resources are running there, and the resources move
> > > > to the other node as expected.
> > > >
> > > > However, when I use the syntax:
> > > >
> > > > <rsc_order id="order-1">
> > >
> > > You're missing a score. Without one it defaults to 0 (which means
> > > optional).
> > > However, IIRC, the 1.0.6 schema won't allow you to set a score there
> > > so you'll need to apply the following patch:
> > > http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/c8585629629c
> > >
> > > >   <resource_set id="order-1-set-1" sequential="true">
> > > >     <resource_ref id="world1" />
> > > >     <resource_ref id="world2" />
> > > >   </resource_set>
> > > > </rsc_order>
> > > >
> > > > Several bad things happen. First, the resources don't move off the
> > > > node that is put into standby, even though the alternate node is
> > > > running and able to run the resources.
> > >
> > > Did you remove the other ordering constraint first?
> > >
> > > > Second, attempting to shut down openais on the node running the
> > > > resources after attempting a forced move (by putting the node into
> > > > standby) leaves both the lrmd and pengine processes running (but
> > > > children of process 1 (init)), and the resources continue to run
> > > > on that node even after openais is stopped.
> > >
> > > I suspect you've a faulty init script there. See other email.
> > >
> > > > I turned debug on in crmd and in the logs and recorded what
> > > > happens when I force standby, and I notice that using the first
> > > > syntax causes te_rsc_command to be executed to send a shut down
> > > > message to the node where the resources are running (which seems
> > > > to work), while using the second syntax causes te_pseudo_action to
> > > > be called in approximately the same place in the log, but no
> > > > shutdown of resources happens (I can't really tell what this is
> > > > supposed to be doing).
> > >
> > > Neither can I - you didn't attach the logs :-)
> > >
> > > _______________________________________________
> > > Pacemaker mailing list
> > > Pacemaker@oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
pengine_debug.log