On Wed, 28 Feb 2024 09:36:38 PST (-0800), Patrick O'Neill wrote:
On 2/28/24 07:02, Palmer Dabbelt wrote:
On Wed, 28 Feb 2024 06:57:53 PST (-0800), jeffreya...@gmail.com wrote:
On 2/28/24 05:23, Kito Cheng wrote:
atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic
operation on RV64. However, lr.w sign extends to DI, and the compare
instructions only have DI mode on RV64, so the expected value should be
sign extended before the compare as well, so that we get the right
compare result.
gcc/ChangeLog:

	PR target/114130
	* config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
	extend the expected value if needed.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr114130.c: New.
Nearly rejected this, as the description was a bit ambiguous and I
thought you were extending the result of the lr.w. But it's actually
the other value you're ensuring gets properly extended.
I had the same response, but after reading it I'm not quite sure how
to say it better.
Maybe something like
atomic_compare_and_swapsi will use lr.w to obtain the original value,
which sign extends to DI. RV64 only has DI comparisons, so we also need
to sign extend the expected value to DI, as otherwise the comparison
will fail when the expected value has bit 31 set.
would do it? Either way
Reviewed-by: Palmer Dabbelt <pal...@rivosinc.com>
as I've managed to convince myself it's correct. We should probably
backport this one; the bug has likely been around for a while.
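
To make the failure mode concrete, here's a sketch of a reproducer (my
own sketch, not the pr114130.c testcase from the patch):

#include <stdint.h>

/* Sketch only.  If the expected value reaches the comparison with its
   upper 32 bits not sign extended (e.g. 0x0000000080000000), while
   lr.w yields the sign-extended 0xffffffff80000000, the DI-mode
   compare never matches and the CAS fails even though
   *p == expected.  */
int
cas_high_bit (uint32_t *p)
{
  uint32_t expected = 0x80000000u;
  uint32_t desired = 0;
  return __atomic_compare_exchange_n (p, &expected, desired,
				      0 /* weak */, __ATOMIC_RELAXED,
				      __ATOMIC_RELAXED);
}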
OK.
I was looking at the code to try to figure out whether we have the same
bug in the short inline CAS routines, but I've got to run to some
meetings...
I don't think subword AMO CAS is impacted.
As part of the CAS we mask both the expected value [2] and the retrieved
value [1] before comparing.
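
In other words (a rough C model of that compare path, with names of my
own invention rather than anything from sync.md):

#include <stdint.h>

/* Both sides of the equality test are ANDed with the lane mask, so any
   sign-extension bits above the 16-bit lane drop out before the
   comparison -- which is why the subword path is safe.  */
static int
lane_matches (int64_t loaded,   /* sign-extended lr.w result */
	      int64_t expected, /* expected value shifted into the lane */
	      int64_t mask)     /* lane mask, also sign extended */
{
  return (loaded & mask) == (expected & mask);
}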
I'm always a bit lost when it comes to bit arithmetic, but I think it's
OK. It smells like it's being a little loose with the
extensions/comparisons, but just looking at some generated code for this
simple case:
#include <stdint.h>

void foo(uint16_t *p, uint16_t *e, uint16_t *d) {
    __atomic_compare_exchange(p, e, d, 0, __ATOMIC_RELAXED,
                              __ATOMIC_RELAXED);
}
foo:
        lhu     a3,0(a2)
        lhu     a2,0(a1)
        andi    a4,a0,3
        li      a5,65536
        slliw   a4,a4,3
        addiw   a5,a5,-1
        sllw    a5,a5,a4
        sllw    a3,a3,a4
        sllw    a7,a2,a4
        andi    a0,a0,-4
        and     a3,a3,a5
        not     t1,a5
        and     a7,a7,a5
1:
        lr.w    a6, 0(a0)
        and     t3, a6, a5      // Both a6 (from the lr.w) and a5
                                // (from the sllw) are sign extended,
                                // so the result in t3 is sign extended.
        bne     t3, a7, 1f      // a7 is also sign extended (before
                                // and after the masking above), so
                                // it's safe for comparison.
        and     t3, a6, t1
        or      t3, t3, a3
        sc.w    t3, t3, 0(a0)   // The top bits of t3 end up polluted
                                // with sign extension, but it doesn't
                                // matter because of the sc.w.
        bnez    t3, 1b
1:
        sraw    a6,a6,a4
        slliw   a2,a2,16
        slliw   a5,a6,16
        sraiw   a2,a2,16
        sraiw   a5,a5,16
        subw    a5,a5,a2
        beq     a5,zero,.L1
        sh      a6,0(a1)
.L1:
        ret
So I think we're OK -- that masking of a7 looks redundant here, but I
don't think we could get away with just
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 54bb0a66518..15956940032 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -456,7 +456,6 @@ (define_expand "atomic_cas_value_strong<mode>"
   riscv_lshift_subword (<MODE>mode, o, shift, &shifted_o);
   riscv_lshift_subword (<MODE>mode, n, shift, &shifted_n);

-  emit_move_insn (shifted_o, gen_rtx_AND (SImode, shifted_o, mask));
   emit_move_insn (shifted_n, gen_rtx_AND (SImode, shifted_n, mask));

   enum memmodel model_success = (enum memmodel) INTVAL (operands[4]);
because we'd need the masking for cases where we don't know the high
bits are safe pre-shift. So maybe some sort of simplify call could help
out there, but I bet it's not really worth bothering -- the bookkeeping
doesn't generally matter that much around AMOs.
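
To spell out why the AND has to stay in general (again a rough model
with made-up names, not the real RTL):

#include <stdint.h>

/* If the incoming 16-bit expected value has garbage above bit 15
   (which a subreg is allowed to have), the shift just moves that
   garbage further up within the word; only the AND with the lane mask
   removes it before the comparison.  */
static uint32_t
shifted_expected (uint32_t o, unsigned shift, uint32_t mask)
{
  uint32_t shifted_o = o << shift; /* garbage shifts along */
  return shifted_o & mask;         /* the masking we can't drop */
}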
- Patrick
[1]:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l495
[2]:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l459
Jeff