On Fri, 2017-10-20 at 12:47 +0200, Torvald Riegel wrote:
> On Thu, 2017-10-19 at 13:58 +0200, Mattias Rönnblom wrote:
> > Hi.
> > 
> > I have this code:
> > 
> > #include <stdatomic.h>
> > 
> > int ready;
> > int message;
> > 
> > void send_x4711(int m) {
> >      message = m*4711;
> >      atomic_thread_fence(memory_order_release);
> >      ready = 1;
> > }
> > 
> > When I compile it with GCC 7.2 -O3 -std=c11 on x86_64 it produces the 
> > following code:
> > 
> > send_x4711:
> > .LFB0:
> > .LVL0:
> >          imul    edi, edi, 4711
> > .LVL1:
> >          mov     DWORD PTR ready[rip], 1
> >          mov     DWORD PTR message[rip], edi
> >          ret
> > 
> > I expected the store to 'message' and 'ready' to be in program order.
> > 
> > Did I misunderstand the semantics of 
> > atomic_thread_fence+memory_order_release?
> 
> Yes.  You must make your program data-race-free.  This is required by
> C11.  No other thread can observe "ready" without a data race or other
> synchronization, so the fence is a noop in this program snippet.

Just to avoid confusion:  the store to "message" must, *conceptually*,
not be reordered to after the release thread fence.  What that means
precisely for the implementation depends on whether the thread fence has
to emit a HW fence, and on whether all atomics are compiler barriers.  

For example, on x86's TSO model, release fences are implicit, so if every
atomic store is a compiler barrier, the fence itself doesn't need to add a
compiler barrier; non-atomic accesses can be moved to before the release MO
fence, which means that in the example above the thread fence is
conceptually at the end of the function (which is fine).
If the store to "ready" were atomic (and a compiler barrier), the
release MO fence would conceptually sit right before that store:
    message = m*4711;
    atomic_thread_fence(memory_order_release);
    foo = 123;  // can move to before the fence, so can be reordered
                // freely wrt. the store to message
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
That depends on the relaxed atomic store being a compiler barrier, which
it doesn't necessarily have to be in a valid implementation.
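
For reference, a data-race-free variant of the original snippet might look
roughly like the sketch below.  It assumes "ready" is changed to an
atomic_int, and it adds a hypothetical receiver, try_receive(), which is not
part of the original post; the acquire fence on the receiving side is what
pairs with the release fence in the sender.

```c
#include <stdatomic.h>

/* Assumption: unlike the original post, "ready" is declared atomic here,
   so concurrent accesses to it are not a data race. */
static atomic_int ready;
static int message;

void send_x4711(int m) {
    message = m * 4711;                         /* plain, non-atomic store */
    atomic_thread_fence(memory_order_release);  /* orders the store above ... */
    atomic_store_explicit(&ready, 1,            /* ... before this atomic store */
                          memory_order_relaxed);
}

/* Hypothetical receiver: returns 1 and writes the message to *out if it
   has been published, otherwise returns 0 and leaves *out untouched. */
int try_receive(int *out) {
    if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        return 0;
    atomic_thread_fence(memory_order_acquire);  /* pairs with the release fence */
    *out = message;                             /* safe: happens after the send */
    return 1;
}
```

With "ready" atomic, another thread can observe the flag without a data
race, so the compiler can no longer treat the fence as a noop relative to
the store to "message".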
