----- On Feb 15, 2018, at 6:49 AM, Peter Zijlstra pet...@infradead.org wrote:
> On Wed, Feb 14, 2018 at 06:53:44PM +0000, Mathieu Desnoyers wrote:
>> However, given the scenario involves multiple CPUs (one doing exit_mm(),
>> the other doing context switch), the actual order of perceived load/store
>> can be shuffled. And AFAIU nothing prevents the CPU from ordering the
>> atomic_inc() done by mmgrab(mm) _after_ the store to current->mm.
>>
>> I wonder if we should not simply add a smp_mb__after_atomic() into
>> mmgrab() instead ? I see that e.g. futex.c does:
>
> Don't think so, the futex case is really rather special and I suspect
> this one is too. I would much rather have explicit comments rather than
> implicit works by magic.
>
> As per the rationale used for refcount_t, increments should be
> unordered, because you ACQUIRE your object _before_ you can do the
> increment.
>
> The futex thing is simply abusing a bunch of implied barriers and
> patching up the holes in paths that didn't already imply a barrier in
> order to avoid having to add explicit barriers (which had measurable
> performance issues).
>
> And here we have explicit ordering outside of the reference counting
> too, we want to ensure the reference is incremented before we modify
> a second object.
>
> This ordering is not at all related to acquiring the reference, so
> bundling it seems odd.

I understand your point. Will's added barrier and comment is fine.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com