https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104831

--- Comment #9 from Patrick O'Neill <patrick at rivosinc dot com> ---
Adding this comment to summarize why LR.aq/SC.rl doesn't appear to be
an issue.

The re-orderings shown in the previous litmus tests are allowed by the
C++/GCC definition of SEQ_CST. My reasoning in prior comments are
incorrect and should be disregarded.

The ISA manual table A.6 recommends the LR.aqrl/SC.rl pairing for
SEQ_CST because of A.6's SEQ_CST store mapping. When the SEQ_CST store
is implemented as:
fence rw,w + sw
There needs to be ordering enforced to prevent the store from being
reordered into the middle of the LR/SC pairing [1][2].

GCC has a different mapping for SEQ_CST store:
fence iorw,ow + amoswap.rl
This prevents the reordering as seen in the example [2], so we can
safely use LR.aq/SC.rl pairings [3].

Relevant litmus tests:

[1] CDSChecker Litmus test - seq_cst W-RMW
#include <stdlib.h>
#include <stdio.h>
#include <threads.h>
#include <atomic>

#include "model-assert.h"

/*
 * This shows that the c memory model enforces
 * SEQ_CST W->(RMW) ordering
 */

std::atomic_int x;
std::atomic_int y;

int resx, resy;

static void a(void *obj)
{
       x.store(1, std::memory_order_seq_cst);
       resy = y.fetch_add(1, std::memory_order_seq_cst);
}

static void b(void *obj)
{
       y.store(42, std::memory_order_seq_cst);
       resx = x.load(std::memory_order_seq_cst);
}

int user_main(int argc, char **argv)
{
       thrd_t t1, t2;

       atomic_init(&x, 0);
       atomic_init(&y, 0);

       printf("Main thread: creating 2 threads\n");

       thrd_create(&t1, (thrd_start_t)&a, NULL);
       thrd_create(&t2, (thrd_start_t)&b, NULL);

       thrd_join(t1);
       thrd_join(t2);

       printf("x: %d\n", resy);
       printf("y: %d\n", resx);

       MODEL_ASSERT(!(resy == 0 && resx == 0));

       printf("Main thread is finished\n");

       return 0;
}

[2] Herd7 Write-RMW litmus test - ISA Manual A.6 Mapping
RISCV W-RMW

(*
Table A.6 Seq_cst store along with LR.aq/SC.rl is insufficient for a
seq_cst store, seq_cst RMW mapping.
*)

{
0:x7=A; 0:x8=B;
1:x7=A; 1:x8=B;
}

    P0                  | P1          ;
    ori x1,x0,1         | ori x1,x0,1 ;
    fence rw,w          | fence rw,rw ;
    sw x1,0(x8)         | sw x1,0(x7) ;
    lr.w.aq x3,0(x7)    | fence rw,rw ;
    sc.w.rl x1,x1,0(x7) | lw x2,0(x8) ;

exists
(0:x3=0 /\ 1:x2=0)

[3] Herd7 Write-RMW litmus test - GCC Mapping
RISCV W-RMW

(*
GCC's current mapping for seq_cst store along with LR.aq/SC.rl is
sufficient for a seq_cst store, seq_cst RMW mapping.
*)

{
0:x7=A; 0:x8=B;
1:x7=A; 1:x8=B;
}

   P0                       | P1          ;
   ori x1,x0,1              | ori x1,x0,1 ;
   fence rw,w               | fence rw,rw ;
   amoswap.w.aq x0,x1,0(x8) | sw x1,0(x7) ;
   lr.w.aq x3,0(x7)         | fence rw,rw ;
   sc.w.rl x1,x1,0(x7)      | lw x2,0(x8) ;

~exists (0:x3=0 /\ 1:x2=0)

Reply via email to