Hi Andrew,

> It should already be in the main repo for 1.1 and we can backport to
> pacemaker-1.0

I confirmed that the following fix has been included in Pacemaker 1.1.

* https://github.com/ClusterLabs/pacemaker/commit/2a6b296b7ca42a1b671563f5ab73853ff2a8fcef#lib/common

I look forward to the next release of Pacemaker 1.0.

Many Thanks!
Hideo Yamauchi.

--- On Thu, 2012/2/2, Andrew Beekhof <and...@beekhof.net> wrote:

> On Wed, Feb 1, 2012 at 1:57 PM, <renayama19661...@ybb.ne.jp> wrote:
> > Hi Lars,
> > Hi Andrew,
> >
> > I confirmed that the problem did not occur with Andrew's patch.
> > * https://github.com/beekhof/pacemaker/commit/2a6b296
> > The test I ran was a repeated start/stop cycle.
> >
> > Try 1. Start/stop succeeded 405 times.
> > Try 2. Start/stop succeeded 407 times.
> > Try 3. Start/stop succeeded 1228 times. (The count is higher because
> > this run was over the weekend.)
> > Try 4. Start/stop succeeded 408 times.
> > Try 5. Start/stop succeeded 418 times.
> >
> > Andrew's patch resolved the problem.
> >
> > I would like this patch to be applied to Pacemaker 1.1 and Pacemaker 1.0.
>
> It should already be in the main repo for 1.1 and we can backport to
> pacemaker-1.0
>
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> > --- On Wed, 2012/1/25, renayama19661...@ybb.ne.jp
> > <renayama19661...@ybb.ne.jp> wrote:
> >
> >> Hi Lars,
> >> Hi Andrew,
> >>
> >> I confirmed that the problem did not occur with Lars's patch.
> >> The test I ran was a repeated start/stop cycle.
> >>
> >> I ran the test five times.
> >>
> >> The results are as follows.
> >>
> >> Try 1. Start/stop succeeded 420 times.
> >> Try 2. Start/stop succeeded 396 times.
> >> Try 3. Start/stop succeeded 412 times.
> >> Try 4. Start/stop succeeded 1221 times. (The count is higher because
> >> this run was over the weekend.)
> >> Try 5. Start/stop succeeded 420 times.
> >>
> >> This test environment is one in which the problem reproduces readily.
> >> I think that the problem is solved by Lars's patch.
> >>
> >> I will run a similar test with Andrew's patch as well.
> >> I will test a little longer and then report the final result.
> >>
> >> Best Regards,
> >> Hideo Yamauchi.
> >>
> >> --- On Fri, 2012/1/20, renayama19661...@ybb.ne.jp
> >> <renayama19661...@ybb.ne.jp> wrote:
> >>
> >> > Hi Lars,
> >> > Hi Andrew,
> >> >
> >> > I am now testing Lars's patch in the environment where the problem
> >> > reproduces.
> >> > * The msgfromIPC_ll patch is not applied.
> >> > * The crm_trigger_prepare patch is applied.
> >> >
> >> > So far, the problem has not reappeared in several days of testing.
> >> >
> >> > I will test a little longer and then report the final result.
> >> > Afterwards, I intend to run the same test with Andrew's patch.
> >> > * https://github.com/beekhof/pacemaker/commit/2a6b296
> >> >
> >> > Best Regards,
> >> > Hideo Yamauchi.
> >> >
> >> >
> >> > --- On Fri, 2012/1/20, Lars Ellenberg <lars.ellenb...@linbit.com> wrote:
> >> >
> >> > > On Fri, Jan 20, 2012 at 09:21:58AM +1100, Angus Salkeld wrote:
> >> > > > On 19/01/12 22:23 +0100, Lars Ellenberg wrote:
> >> > > > >On Tue, Jan 17, 2012 at 12:13:37AM +0100, Lars Ellenberg wrote:
> >> > > > >>On Tue, Jan 17, 2012 at 09:52:35AM +1100, Andrew Beekhof wrote:
> >> > > > >>>
> >> > > > >>> Ok, done:
> >> > > > >>>
> >> > > > >>> https://github.com/beekhof/pacemaker/commit/2a6b296
> >> > > > >>>
> >> > > > >>> If I'm adding voodoo, I at least want the reason well documented
> >> > > > >>> so it can be removed again if the reason goes away.
> >> > > > >>
> >> > > > >>That about sums it up, then ;-)
> >> > > > >
> >> > > > >But as having to do this was just "too ugly to be true",
> >> > > > >I dug a little deeper...
> >> > > > >
> >> > > > >The way to do this is obviously to use the glib api ;-)
> >> > > > >http://developer.gnome.org/glib/2.30/glib-UNIX-specific-utilities-and-integration.html#g-unix-signal-add-full
> >> > > > >
> >> > > > >(Since glib 2.30, yay; if you don't have that yet, read on anyways)
> >> > > > >
> >> > > > >What it does internally, and what other people have also done for a
> >> > > > >long time to solve this and similar problems, is:
> >> > > > >
> >> > > > >Add to the main context a "wakeup pipe",
> >> > > > >which is an eventfd if available,
> >> > > > >or an actual pipe if not.
> >> > > > >If it is a pipe, set those file descriptors non-blocking.
> >> > > > >And, of course, add the eventfd (or the read end of the pipe)
> >> > > > >to the poll loop (with default priority, btw,
> >> > > > >which is good enough to have the poll terminate).
> >> > > > >
> >> > > > >That is done internally when creating the main context.
> >> > > > >http://git.gnome.org/browse/glib/tree/glib/gmain.c#n548
> >> > > > >http://git.gnome.org/browse/glib/tree/glib/gwakeup.c#n138
> >> > > > >
> >> > > > >(the line numbers are correct for glib master as of today,
> >> > > > >which should correspond to 41fbf42)
> >> > > > >
> >> > > > >The g_unix_signal_handler then sets the triggers variables,
> >> > > > >and calls g_wakeup_signal(that internal wakeup source),
> >> > > > >which simply posts an event to the eventfd,
> >> > > > >or does a short (1 byte) write to the write end of the pipe.
> >> > > > >http://git.gnome.org/browse/glib/tree/glib/gmain.c#n4442
> >> > > > >http://git.gnome.org/browse/glib/tree/glib/gwakeup.c#n230
> >> > > > >
> >> > > > >Problem solved, without having to do a full check() everything,
> >> > > > >prepare() everything, and poll() again cycle every 500ms.
> >> > > > >
> >> > > > >"back in those days", when this mechanism was not really there,
> >> > > > >you could do all that "by hand".
> >> > > > >And people did.
> >> > > > >Very common idiom in glib and other mainloop
> >> > > > >applications, also frequently used to "signal" availability of work
> >> > > > >or completion of tasks between threads.
> >> > > > >
> >> > > > >static int my_wakeup_fds[2] = { -1, -1 };
> >> > > > >
> >> > > > >Then just pipe2(my_wakeup_fds, O_NONBLOCK), add my_wakeup_fds[0] as
> >> > > > >normal read fd source, and add a write(my_wakeup_fds[1], "", 1); to
> >> > > > >the signal handlers.
> >> > > >
> >> > > > signalfd makes this much easier too "man 2 signalfd"
> >> > >
> >> > > See https://bugzilla.gnome.org/show_bug.cgi?id=652072#c32
> >> > > following (or the whole bug, if you like).
> >> > >
> >> > > Also, pipes are portable.
> >> > >
> >> > > Lars
> >> > >
> >> > >
> >> > > _______________________________________________
> >> > > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> >> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >> > >
> >> > > Project Home: http://www.clusterlabs.org
> >> > > Getting started:
> >> > > http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> > > Bugs: http://bugs.clusterlabs.org
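
For reference, the "by hand" wakeup pipe that Lars describes in the quoted message could look roughly like the sketch below. It is only an illustration of the idiom, not the code from the Pacemaker commit or from glib. Only my_wakeup_fds comes from the thread; on_signal, got_signal, set_nonblocking, wakeup_init and wakeup_wait are hypothetical names. It also assumes plain pipe() plus fcntl() in place of the Linux-specific pipe2(), and a bare poll() standing in for the main loop's fd source.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Name taken from the thread; everything else below is hypothetical. */
static int my_wakeup_fds[2] = { -1, -1 };
static volatile sig_atomic_t got_signal = 0;

/* Signal handler: record the signal, then poke the write end of the pipe.
 * Only async-signal-safe calls are used here. */
static void on_signal(int signo)
{
    int saved_errno = errno;
    ssize_t rc;

    got_signal = signo;
    /* If the pipe is full (EAGAIN), a wakeup is already pending. */
    rc = write(my_wakeup_fds[1], "", 1);
    (void)rc;
    errno = saved_errno;
}

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create the non-blocking wakeup pipe and install the handler for signo.
 * Returns 0 on success, -1 on error. */
static int wakeup_init(int signo)
{
    struct sigaction sa;

    if (pipe(my_wakeup_fds) < 0)
        return -1;
    if (set_nonblocking(my_wakeup_fds[0]) < 0 ||
        set_nonblocking(my_wakeup_fds[1]) < 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Stand-in for one main loop iteration: poll the read end of the pipe.
 * Returns 1 if a wakeup was pending, 0 on timeout, -1 on error. */
static int wakeup_wait(int timeout_ms)
{
    struct pollfd pfd = { .fd = my_wakeup_fds[0], .events = POLLIN };
    char buf[64];
    int rc;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);
    if (rc <= 0)
        return rc;

    /* Drain every pending wakeup byte; the read end is non-blocking. */
    while (read(my_wakeup_fds[0], buf, sizeof(buf)) > 0)
        ;
    return 1;
}
```

A caller would run wakeup_init(SIGUSR1) once at startup and then check got_signal whenever wakeup_wait() returns 1. Because the signal makes the pipe readable, the poll returns promptly, so no periodic timeout is needed just to notice the signal.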