2014/1/9 Andrew Beekhof <and...@beekhof.net>:
>
> On 8 Jan 2014, at 9:15 pm, Kazunori INOUE <kazunori.ino...@gmail.com> wrote:
>
>> 2014/1/8 Andrew Beekhof <and...@beekhof.net>:
>>>
>>> On 18 Dec 2013, at 9:50 pm, Kazunori INOUE <kazunori.ino...@gmail.com>
>>> wrote:
>>>
>>>> Hi David,
>>>>
>>>> 2013/12/18 David Vossel <dvos...@redhat.com>:
>>>>>
>>>>> That's a really weird one... I don't see how it is possible for op->id to
>>>>> be NULL there. You might need to give valgrind a shot to detect
>>>>> whatever is really going on here.
>>>>>
>>>>> -- Vossel
>>>>>
>>>> Thank you for advice. I try it.
>>>
>>> Any update on this?
>>>
>>
>> We are still investigating a cause. It was not reproduced when I gave
>> valgrind..
>> And it was reproduced in RC3.
>
> So it happened RC3 - valgrind, but not RC3 + valgrind?
> Thats concerning.
>
> Nothing in the valgrind output?
>

The cause was found.

230 gboolean
231 operation_finalize(svc_action_t * op)
232 {
233     int recurring = 0;
234
235     if (op->interval) {
236         if (op->cancel) {
237             op->status = PCMK_LRM_OP_CANCELLED;
238             cancel_recurring_action(op);
239         } else {
240             recurring = 1;
241             op->opaque->repeat_timer = g_timeout_add(op->interval,
242                                                      recurring_action_timer, (void *)op);
243         }
244     }
245
246     if (op->opaque->callback) {
247         op->opaque->callback(op);
248     }
249
250     op->pid = 0;
251
252     if (!recurring) {
253         /*
254          * If this is a recurring action, do not free explicitly.
255          * It will get freed whenever the action gets cancelled.
256          */
257         services_action_free(op);
258         return TRUE;
259     }
260     return FALSE;
261 }

When op->pid is not 0, cancel_recurring_action() (l.238) returns without
removing op from the hash table. However, op is then freed by
services_action_free() (l.257). That is, the freed data remains in the
hash table, and a later g_hash_table_lookup() may return the freed data.

The op is put into the hash table by g_hash_table_replace() (in
services_action_async()), so I changed the code so that
g_hash_table_remove() is always called in this case.
So far, the segfault has not happened again.

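To make the failure mode easier to see in isolation, here is a minimal sketch in plain C. It is not Pacemaker code: toy_action_t, the fixed-size registry (standing in for the GLib hash table), toy_cancel() and stale_entry_after_finalize() are all hypothetical names invented for this illustration. Only the pid check in toy_cancel() is modelled on the patched cancel_recurring_action().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OPS 8

/* Hypothetical stand-in for svc_action_t; only the fields needed here. */
typedef struct {
    char *id;
    int pid;
} toy_action_t;

/* Hypothetical stand-in for the hash table of recurring actions. */
static toy_action_t *registry[MAX_OPS];

static bool registry_add(toy_action_t *op)
{
    for (int i = 0; i < MAX_OPS; i++) {
        if (registry[i] == NULL) {
            registry[i] = op;
            return true;
        }
    }
    return false;
}

static toy_action_t *registry_lookup(const char *id)
{
    for (int i = 0; i < MAX_OPS; i++) {
        if (registry[i] != NULL && strcmp(registry[i]->id, id) == 0) {
            return registry[i];
        }
    }
    return NULL;
}

static void registry_remove(const toy_action_t *op)
{
    for (int i = 0; i < MAX_OPS; i++) {
        if (registry[i] == op) {
            registry[i] = NULL;
        }
    }
}

/* Modelled on the patched check: refuse only when not forced and pid is set. */
static bool toy_cancel(toy_action_t *op, bool force_remove)
{
    if (!force_remove && op->pid) {
        return false;           /* entry stays registered */
    }
    registry_remove(op);
    return true;
}

/*
 * Registers an action whose pid is still set, cancels it, and reports
 * whether the registry would still point at it at the moment it is freed.
 */
static bool stale_entry_after_finalize(bool force_remove)
{
    const char *id = "rsc_monitor_10000";   /* made-up id for the demo */
    toy_action_t *op = malloc(sizeof(*op));

    if (op == NULL) {
        abort();
    }
    op->id = malloc(strlen(id) + 1);
    if (op->id == NULL) {
        abort();
    }
    strcpy(op->id, id);
    op->pid = 1234;
    if (!registry_add(op)) {
        abort();
    }

    toy_cancel(op, force_remove);
    bool stale = (registry_lookup(id) == op);

    registry_remove(op);        /* cleanup so the demo never reads freed memory */
    free(op->id);
    free(op);
    return stale;
}
```

Calling stale_entry_after_finalize(false) corresponds to the old behaviour (the entry is still registered when the action is freed), and stale_entry_after_finalize(true) corresponds to the forced removal.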
diff --git a/lib/services/services.c b/lib/services/services.c
index cb02511..73d0492 100644
--- a/lib/services/services.c
+++ b/lib/services/services.c
@@ -347,9 +347,9 @@ services_action_free(svc_action_t * op)
 }
 
 gboolean
-cancel_recurring_action(svc_action_t * op)
+cancel_recurring_action(svc_action_t * op, gboolean force_remove)
 {
-    if (op->pid) {
+    if (force_remove == FALSE && op->pid) {
         return FALSE;
     }
 
@@ -378,7 +378,7 @@ services_action_cancel(const char *name, const char *action, int interval)
         return FALSE;
     }
 
-    if (cancel_recurring_action(op)) {
+    if (cancel_recurring_action(op, FALSE)) {
         op->status = PCMK_LRM_OP_CANCELLED;
         if (op->opaque->callback) {
             op->opaque->callback(op);
diff --git a/lib/services/services_linux.c b/lib/services/services_linux.c
index 7060be0..a02f8d9 100644
--- a/lib/services/services_linux.c
+++ b/lib/services/services_linux.c
@@ -235,7 +235,7 @@ operation_finalize(svc_action_t * op)
     if (op->interval) {
         if (op->cancel) {
             op->status = PCMK_LRM_OP_CANCELLED;
-            cancel_recurring_action(op);
+            cancel_recurring_action(op, TRUE);
         } else {
             recurring = 1;
             op->opaque->repeat_timer = g_timeout_add(op->interval,
diff --git a/lib/services/services_private.h b/lib/services/services_private.h
index dd759e3..72dc1ba 100644
--- a/lib/services/services_private.h
+++ b/lib/services/services_private.h
@@ -45,7 +45,7 @@ GList *resources_os_list_ocf_agents(const char *provider);
 GList *resources_os_list_nagios_agents(void);
 
-gboolean cancel_recurring_action(svc_action_t * op);
+gboolean cancel_recurring_action(svc_action_t * op, gboolean force_remove);
 gboolean recurring_action_timer(gpointer data);
 gboolean operation_finalize(svc_action_t * op);

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org