Victor Do Nascimento <victor.donascime...@arm.com> writes: > The armv9.4-a architectural revision adds three new atomic operations > associated with the LSE128 feature: > > * LDCLRP - Atomic AND NOT (bitclear) of a location with 128-bit > value held in a pair of registers, with original data loaded into > the same 2 registers. > * LDSETP - Atomic OR (bitset) of a location with 128-bit value held > in a pair of registers, with original data loaded into the same 2 > registers. > * SWPP - Atomic swap of one 128-bit value with 128-bit value held > in a pair of registers. > > This patch adds the logic required to make use of these when the > architectural feature is present and a suitable assembler available. > > In order to do this, the following changes are made: > > 1. Add a configure-time check to check for LSE128 support in the > assembler. > 2. Edit host-config.h so that when N == 16, nifunc = 2. > 3. Where available due to LSE128, implement the second ifunc, making > use of the novel instructions. > 4. For atomic functions unable to make use of these new > instructions, define a new alias which causes the _i1 function > variant to point ahead to the corresponding _i2 implementation. > > libatomic/ChangeLog: > > * Makefile.am (AM_CPPFLAGS): add conditional setting of > -DHAVE_FEAT_LSE128. > * acinclude.m4 (LIBAT_TEST_FEAT_LSE128): New. > * config/linux/aarch64/atomic_16.S (LSE128): New macro > definition. > (libat_exchange_16): New LSE128 variant. > (libat_fetch_or_16): Likewise. > (libat_or_fetch_16): Likewise. > (libat_fetch_and_16): Likewise. > (libat_and_fetch_16): Likewise. > * config/linux/aarch64/host-config.h (IFUNC_COND_2): New. > (IFUNC_NCOND): Add operand size checking. > (has_lse2): Renamed from `ifunc1`. > (has_lse128): New. > (HAS_LSE128): Likewise. > * libatomic/configure.ac: Add call to LIBAT_TEST_FEAT_LSE128. > * configure (ac_subst_vars): Regenerated via autoreconf. > * libatomic/Makefile.in: Likewise. > * libatomic/auto-config.h.in: Likewise. > --- > libatomic/Makefile.am | 3 + > libatomic/Makefile.in | 1 + > libatomic/acinclude.m4 | 19 +++ > libatomic/auto-config.h.in | 3 + > libatomic/config/linux/aarch64/atomic_16.S | 170 ++++++++++++++++++- > libatomic/config/linux/aarch64/host-config.h | 29 +++- > libatomic/configure | 59 ++++++- > libatomic/configure.ac | 1 + > 8 files changed, 276 insertions(+), 9 deletions(-) > > [...] > diff --git a/libatomic/acinclude.m4 b/libatomic/acinclude.m4 > index f35ab5b60a5..4197db8f404 100644 > --- a/libatomic/acinclude.m4 > +++ b/libatomic/acinclude.m4 > @@ -83,6 +83,25 @@ AC_DEFUN([LIBAT_TEST_ATOMIC_BUILTIN],[ > ]) > ]) > > +dnl > +dnl Test if the host assembler supports armv9.4-a LSE128 isns. > +dnl > +AC_DEFUN([LIBAT_TEST_FEAT_LSE128],[ > + AC_CACHE_CHECK([for armv9.4-a LSE128 insn support], > + [libat_cv_have_feat_lse128],[ > + AC_LANG_CONFTEST([AC_LANG_PROGRAM([],[asm(".arch armv9-a+lse128")])]) > + if AC_TRY_EVAL(ac_link); then
ac_compile should be enough for this. The link step isn't really adding anything. > + eval libat_cv_have_feat_lse128=yes > + else > + eval libat_cv_have_feat_lse128=no > + fi > + rm -f conftest* > + ]) > + LIBAT_DEFINE_YESNO([HAVE_FEAT_LSE128], [$libat_cv_have_feat_lse128], > + [Have LSE128 support for 16 byte integers.]) > + AM_CONDITIONAL([ARCH_AARCH64_HAVE_LSE128], [test > x$libat_cv_have_feat_lse128 = xyes]) > +]) > + > dnl > dnl Test if we have __atomic_load and __atomic_store for mode $1, size $2 > dnl > [...] > @@ -206,6 +211,31 @@ ENTRY (libat_exchange_16) > END (libat_exchange_16) > > > +#if HAVE_FEAT_LSE128 > +ENTRY_FEAT (libat_exchange_16, LSE128) > + mov tmp0, x0 > + mov res0, in0 > + mov res1, in1 > + cbnz w4, 1f > + > + /* RELAXED. */ > + swpp res0, res1, [tmp0] > + ret > +1: > + cmp w4, ACQUIRE > + b.hi 2f > + > + /* ACQUIRE/CONSUME. */ > + swppa res0, res1, [tmp0] > + ret > + > + /* RELEASE/ACQ_REL/SEQ_CST. */ > +2: swppal res0, res1, [tmp0] > + ret > +END_FEAT (libat_exchange_16, LSE128) > +#endif Is there no benefit to using SWPPL for RELEASE here? Similarly for the others. Looks good otherwise. Thanks, Richard