Hi,
I have a test case (from a real workload) involving C++ atomics and am
trying to understand the codegen (gcc 12) for RVWMO and x86.
It mixes atomics with non-atomics, so it is not obvious what the
intended behavior is, hence the explicit CC of subject matter experts
(apologies for that in advance).
The test has a non-atomic store followed by an atomic load with seq_cst
ordering. I assume that an unadorned direct access defaults to the
safest/most conservative ordering, seq_cst.
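That assumption can be sketched as follows. Reading a std::atomic<int>
through its implicit conversion is specified as load() with the default
argument memory_order_seq_cst, so the two accessors should behave the
same. The initial value 42 and the helper names are made up for
illustration only; they are not part of the test case.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> a{42};   // hypothetical initial value, illustration only

// Implicit conversion: std::atomic<int>::operator int(), which is
// equivalent to load() with the default memory_order_seq_cst.
int read_implicit() { return a; }

// The same access with the ordering spelled out.
int read_explicit() { return a.load(std::memory_order_seq_cst); }

// Single-threaded sanity check: both forms observe the same value.
void check_same_value() { assert(read_implicit() == read_explicit()); }
```

This only shows that the two spellings read the same value; comparing the
generated assembly for both is what confirms the identical ordering.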
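#include <atomic>

extern int g;
std::atomic<int> a;

// Atomic load via implicit conversion (no explicit memory order).
int bar_noaccessor(int n, int *n2)
{
    *n2 = g;            // non-atomic load of g, non-atomic store to *n2
    return n + a;
}

// Atomic load with the memory order spelled out.
int bar_seqcst(int n, int *n2)
{
    *n2 = g;            // non-atomic load of g, non-atomic store to *n2
    return n + a.load(std::memory_order_seq_cst);
}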
On RV (rvwmo), with current gcc 12, we get 2 full fences around the load,
following the RISC-V ISA manual (Volume I, Unprivileged), Appendix A,
Table A.6 (Mappings from C/C++ primitives to RISC-V primitives).
_Z10bar_seqcstiPi:
.LFB382:
        .cfi_startproc
        lui     a5,%hi(g)
        lw      a5,%lo(g)(a5)
        sw      a5,0(a1)
        fence   iorw,iorw       # <-- full fence before the atomic load
        lui     a5,%hi(a)
        lw      a5,%lo(a)(a5)
        fence   iorw,iorw       # <-- full fence after the atomic load
        addw    a0,a5,a0
        ret
OTOH, for x86 (same default toggles) there are no barriers at all.
_Z10bar_seqcstiPi:
        endbr64
        movl    g(%rip), %eax
        movl    %eax, (%rsi)
        movl    a(%rip), %eax
        addl    %edi, %eax
        ret
My naive intuition was that x86 TSO would require a fence before the
load(seq_cst) because of the prior store, even though that store is
non-atomic, to ensure the load doesn't get reordered ahead of the store.
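For comparison, the pattern where TSO's store-to-load reordering matters
is the classic store-buffering litmus test. Below is a minimal sketch of
it with both accesses atomic (unlike bar_seqcst, where the store is to a
plain int). The names x, y, sb_both_zero_once and count_forbidden are
made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> x{0}, y{0};

// Runs one round of the store-buffering (SB) litmus test and reports
// whether both threads read 0, an outcome seq_cst forbids.
bool sb_both_zero_once() {
    x.store(0, std::memory_order_seq_cst);
    y.store(0, std::memory_order_seq_cst);
    int r1 = -1, r2 = -1;   // written by the threads, read after join()

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the litmus test and counts forbidden outcomes (expected: none).
int count_forbidden(int rounds) {
    int forbidden = 0;
    for (int i = 0; i < rounds; ++i) {
        if (sb_both_zero_once()) ++forbidden;
    }
    return forbidden;
}
```

Compilers commonly implement this on x86 by making the seq_cst store
the expensive side (xchg, or mov followed by mfence) and leaving the
seq_cst load as a plain mov, which would be consistent with the
fence-free load in the x86 output above.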
Perhaps this raises the more general question of mixing non-atomic
accesses with atomics, and whether that is undefined behavior or some
such. I skimmed through the Atomic operations library chapter of the
C++14 specification, but nothing jumped out on the topic.
Or is it something deeper, related to the as-if rule?
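On the mixing question, one well-known case may help frame it: mixing is
not undefined by itself, and the C++ memory model makes a program
undefined only when conflicting non-atomic accesses from different
threads are not ordered by happens-before (a data race). A minimal sketch
of the usual message-passing pattern, where an atomic flag orders plain
accesses, follows. The names payload, ready and message_passing_once, and
the value 123, are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                   // plain, non-atomic data
std::atomic<bool> ready{false};    // atomic flag that publishes payload

// Runs producer and consumer once; returns what the consumer read.
int message_passing_once() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;                 // read only after join()

    std::thread producer([] {
        payload = 123;                                  // non-atomic store
        ready.store(true, std::memory_order_release);   // publish it
    });
    std::thread consumer([&] {
        // The acquire load that reads true synchronizes with the
        // release store, so the plain read below is race-free.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;                                 // non-atomic load
    });
    producer.join();
    consumer.join();
    return seen;
}
```

In bar_seqcst there is no such pairing visible, so whether another
thread may legally observe *n2 depends on code outside the snippet.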
Thx,
-Vineet