----- On Jan 16, 2017, at 6:50 PM, Linus Torvalds torva...@linux-foundation.org 
wrote:

> Why not just make the write be a "smp_store_release()", and the read
> be a "smp_load_acquire()". That guarantees a certain amount of
> ordering. The only amount that I suspect makes sense, in fact.
> 
> But it's not clear what the problem is, so..

If we only use a smp_store_release() for the store to membarrier_exped,
the "unregister" (setting back to 0) would be OK, but not the "register",
as the following scenario shows:

Initial values:
A = B = 0

   CPU 0                                  |   CPU 1 (no-hz full)
                                          |
                                          |   membarrier(REGISTER_EXPEDITED)
                                          |     (write barrier implied by 
store-release)
                                          |     set t->membarrier_exped = 1 
(store-release imply memory barrier before store)
                                          |   store B = 1
                                          |   barrier() (compiler-level barrier)
                                          |   store A = 1
   x = load A                             |  
   membarrier(CMD_SHARED)                 |
     smp_mb() [1]                         |
     iter. on nohz cpus                   |
       if iter_t->membarrier_exped == 0   |
          (skip)                          |
     smp_mb() [2]                         |
   y = load B                             |  

Expect: if x == 1, then y == 1

CPU 0 can observe A == 1, membarrier_exped == 0, and B == 0,
because there is no memory barrier between store to
membarrier_exped and store to A on CPU 1.

What we seem to need on the registration/unregistration side
is store-acquire for registration, and store-release for
unregistration. This pairs with a load of membarrier_exped
that has both acquire and release barriers ([1] and [2] above).

> I'm not seeing how a regular fork() could possibly ever make sense to
> have the membarrier state in the newly forked process. Not that
> "fork()" is really well-defined for within a single thread anyway (it
> actually is as far as Linux is concerned, but not in POSIX, afaik).
> 
> So if there is no major reason for it, I would strongly suggest that
> _if_ all this makes sense in the first place, the membarrier thing
> should just be cleared unconditionally both for exec and for
> clone/fork.

That's fine with me!

Thanks,

Mathieu

> 
>               Linus

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Reply via email to