I just sent v2's of the first patches discussed so far.

  Jarno
On Jul 4, 2014, at 6:15 AM, Jarno Rajahalme <jrajaha...@nicira.com> wrote:

> Additional data point: I used
> http://en.cppreference.com/w/cpp/atomic/memory_order as a reference, and it
> reads:
>
> "Typical use for relaxed memory ordering is updating counters, such as the
> reference counters of std::shared_ptr, since this only requires atomicity,
> but not ordering or synchronization."
>
> I guess the stress should be on the words "typical use"; read on...
>
> Looking deeper into std::shared_ptr reveals that implementations do add
> synchronization before the memory is recycled; see, e.g.,
> https://gcc.gnu.org/ml/libstdc++/2005-11/msg00136.html. The GCC
> implementation avoids the additional synchronization if the atomic
> operation already provides it (see
> /usr/include/c++/4.7/bits/shared_ptr_base.h).
>
> From this I see that the commentary you provided is spot on. However, we
> *typically use* locks, mutexes, or RCU to protect access to shared objects
> in OVS. All of these already provide full memory barriers, so no additional
> barriers are needed in unref. I plan to post a v2 with an explicit
> ovs_refcount_unref_relaxed(), which can be used in the situations mentioned
> above, in addition to adding the mentioned barriers to bare
> ovs_refcount_unref().
>
> Jarno
>
> On Jul 3, 2014, at 3:35 PM, Jarno Rajahalme <jrajaha...@nicira.com> wrote:
>
>> It seems I was too invested in the combined refcount/RCU case here. I
>> still think that with RCU-postponed destruction, relaxed is the proper
>> memory order. So maybe we should add a relaxed variant of the unref
>> function to be used with RCU objects, and make the normal unref use
>> release to guarantee that writes to the protected object are done before
>> the reference is dropped.
>>
>> I do not yet fully understand the need for acquire before the delete. Is
>> the concern that the current thread might immediately recycle the memory,
>> and that writes to it (e.g., initialization) might be reordered to happen
>> before the atomic sub, while other threads might still be using the
>> object?
>>
>> Jarno
>>
>>> On Jul 3, 2014, at 11:52 PM, Ben Pfaff <b...@nicira.com> wrote:
>>>
>>>> On Mon, Jun 30, 2014 at 08:17:18AM -0700, Jarno Rajahalme wrote:
>>>> Updating the reference count only requires atomicity, but no memory
>>>> ordering with respect to any other loads or stores. Avoiding the
>>>> overhead of the default memory_order_seq_cst can make these more
>>>> efficient.
>>>>
>>>> Signed-off-by: Jarno Rajahalme <jrajaha...@nicira.com>
>>>
>>> This website makes pretty different claims:
>>>
>>> http://www.chaoticmind.net/~hcb/projects/boost.atomic/doc/atomic/usage_examples.html
>>>
>>> Relevant excerpts:
>>>
>>> #include <boost/intrusive_ptr.hpp>
>>> #include <boost/atomic.hpp>
>>>
>>> class X {
>>> public:
>>>     typedef boost::intrusive_ptr<X> pointer;
>>>     X() : refcount_(0) {}
>>>
>>> private:
>>>     mutable boost::atomic<int> refcount_;
>>>     friend void intrusive_ptr_add_ref(const X *x)
>>>     {
>>>         x->refcount_.fetch_add(1, boost::memory_order_relaxed);
>>>     }
>>>     friend void intrusive_ptr_release(const X *x)
>>>     {
>>>         if (x->refcount_.fetch_sub(1, boost::memory_order_release) == 1) {
>>>             boost::atomic_thread_fence(boost::memory_order_acquire);
>>>             delete x;
>>>         }
>>>     }
>>> };
>>>
>>> ...
>>>
>>> Increasing the reference counter can always be done with
>>> memory_order_relaxed: New references to an object can only be formed
>>> from an existing reference, and passing an existing reference from one
>>> thread to another must already provide any required synchronization.
>>>
>>> It is important to enforce that any possible access to the object in one
>>> thread (through an existing reference) happens before deleting the
>>> object in a different thread. This is achieved by a "release" operation
>>> after dropping a reference (any access to the object through this
>>> reference must obviously have happened before), and an "acquire"
>>> operation before deleting the object.
>>>
>>> It would be possible to use memory_order_acq_rel for the fetch_sub
>>> operation, but this results in unneeded "acquire" operations when the
>>> reference counter does not yet reach zero and may impose a performance
>>> penalty.

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev