Following a prior discussion on this mailing list [1], I am proposing the introduction of rcu_barrier_finalize() in liburcu.

*** Use Case ***

As pointed out in the e-mail thread, some applications nest liburcu data structures or, as part of their design, compose objects which use liburcu data structures (e.g. lttng-tools daemons).

When such objects or data structures are destroyed, it is likely that their 'call_rcu' callbacks will, in turn, enqueue new 'call_rcu' work items. This leads to a "chaining" phenomenon where 'call_rcu' callbacks are added to the 'call_rcu' queue by callbacks being processed.

The implication of this chaining is that the rcu_barrier() mechanism cannot be relied on to empty the 'call_rcu' queue. More specifically, an application calling rcu_barrier() on exit may leak memory, since any reclamation callback enqueued after the barrier may still be unprocessed when the application exits.

Of course, one could argue that it would be possible to refactor applications to separate object teardown from memory reclamation. However, the guarantee that an object is unreachable when its 'call_rcu' callback is invoked (after a grace period has elapsed) is very useful for safely tearing down its internal state (and releasing other resources) in cases where an explicit reference counting mechanism isn't being used.

*** Implementation limitations ***

The proposed implementation of rcu_barrier_finalize() is straightforward: it invokes rcu_barrier() until all queues are observed to be empty. The call_rcu_mutex is released between iterations to ensure the application can fork() without deadlocking.

This ensures that all queues are empty on return which, in my use case, is a crude way of ensuring that all callbacks chained by the work enqueued prior to the call have been processed.

This design may cause rcu_barrier_finalize() to never return if the application can chain an unbounded number of callbacks.
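
To make the chaining problem and the proposed loop concrete, here is a minimal single-threaded model. It is only a sketch and not liburcu code: every toy_* name, the fixed-size queue, and the inner/outer structures are hypothetical, and grace periods, per-CPU queues and call_rcu_mutex are deliberately left out. toy_barrier() stands in for rcu_barrier() by running only the callbacks that were enqueued before it was called, and toy_barrier_finalize() repeats it until the queue is observed empty.

```c
#include <assert.h>
#include <stddef.h>

/* Toy single-threaded model of a call_rcu queue. NOT liburcu code:
 * all names here are hypothetical and grace periods are not modelled. */
#define TOY_QUEUE_CAP 64

typedef void (*toy_cb)(void *arg);

struct toy_item {
	toy_cb func;
	void *arg;
};

static struct toy_item toy_queue[TOY_QUEUE_CAP];
static size_t toy_head, toy_tail;	/* pending items: [toy_head, toy_tail) */

/* Stand-in for call_rcu(): append a callback to the queue. */
static void toy_call_rcu(toy_cb func, void *arg)
{
	assert(toy_tail < TOY_QUEUE_CAP);
	toy_queue[toy_tail].func = func;
	toy_queue[toy_tail].arg = arg;
	toy_tail++;
}

static size_t toy_pending(void)
{
	return toy_tail - toy_head;
}

/* Stand-in for rcu_barrier(): run only what was enqueued before the call.
 * Callbacks chained while it runs are left in the queue. */
static void toy_barrier(void)
{
	size_t end = toy_tail;

	while (toy_head < end) {
		struct toy_item item = toy_queue[toy_head++];

		item.func(item.arg);
	}
}

/* Model of the proposed loop: repeat the barrier until the queue is
 * observed empty. Returns the number of barrier passes performed. */
static unsigned int toy_barrier_finalize(void)
{
	unsigned int passes = 0;

	while (toy_pending() != 0) {
		toy_barrier();
		passes++;
	}
	return passes;
}

/* An object that owns another object, as in the nested use case. */
struct inner {
	int freed;
};

struct outer {
	struct inner *child;
	int freed;
};

static void free_inner_cb(void *arg)
{
	((struct inner *) arg)->freed = 1;
}

/* Tearing down the outer object chains a new callback for its child. */
static void free_outer_cb(void *arg)
{
	struct outer *o = arg;

	toy_call_rcu(free_inner_cb, o->child);
	o->freed = 1;
}
```

In this model, enqueuing free_outer_cb and calling toy_barrier() once reclaims the outer object but leaves free_inner_cb pending, which is the leak-on-exit scenario described above; a subsequent toy_barrier_finalize() drains it. A callback that unconditionally re-enqueues itself would never let the queue drain (here it would eventually trip the capacity assertion), which mirrors the non-termination caveat.
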

The implementation could be improved by ensuring that all queues have been observed to be empty at least once (and skipping them during subsequent iterations). This would guarantee that all chained work has been executed, while forgoing the guarantee that all queues are empty on return (which makes no difference in my use case).

An alternative guarantee that could be offered would be to ensure that all callbacks enqueued before the call, _and_ whichever callbacks they may enqueue, are processed, but nothing more. I don't have a use case for this, but it may come in handy to someone (please speak up!). However, since the implementation I have in mind does somewhat increase the overall complexity of the solution, I am not sure it is worth it. (Thoughts?)

[1] https://lists.lttng.org/pipermail/lttng-dev/2015-December/025356.html

Jérémie Galarneau (1):
  Introduce rcu_barrier_finalize()

 doc/rcu-api.md       | 23 ++++++++++++++++++++---
 urcu-call-rcu-impl.h | 36 ++++++++++++++++++++++++++++++++++++
 urcu-call-rcu.h      |  1 +
 3 files changed, 57 insertions(+), 3 deletions(-)

-- 
2.6.4