Re: GCC libatomic ABI specification draft

Richard Henderson Wed, 18 Jan 2017 14:24:06 -0800

On 01/17/2017 09:00 AM, Torvald Riegel wrote:

I think the ABI should set a baseline for each architecture, and the
baseline decides whether something is inlinable or not.  Thus, the
x86_64 ABI would make __int128 operations not inlinable (because of the
issues with cmpxchg16b, see above).


If users want to use capabilities beyond the baseline, they can choose
to use flags that alter/extend the ABI.  For example, if they use a flag
that explicitly enables the use of cmpxchg16b for atomics, they also
need to use a libatomic implementation built in the same way (if
possible).  This then creates a new ABI(-variant), basically.
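As a rough illustration of how a baseline decides inlinability, the sketch below asks the compiler whether it promises lock-free inline atomics for a given size under the current target flags. It uses the GCC builtin __atomic_always_lock_free; the helper names are hypothetical and are not part of the ABI draft. The answer for __int128 depends on the target, the compiler version, and flags such as -mcx16, so no particular result is assumed here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper macro: nonzero if the compiler, under the current
   target flags, always emits lock-free inline code for objects of the
   given size with typical alignment (second argument 0). */
#define ATOMICS_INLINABLE(size) __atomic_always_lock_free((size), 0)

/* Pointer-sized atomics are part of the baseline on mainstream targets. */
static bool word_atomics_inlinable(void)
{
    return ATOMICS_INLINABLE(sizeof(void *));
}

/* 16-byte atomics: the result depends on the baseline chosen for the
   target and on any flags that extend it. */
static bool int128_atomics_inlinable(void)
{
#ifdef __SIZEOF_INT128__
    return ATOMICS_INLINABLE(sizeof(__int128));
#else
    return false;   /* no __int128 on this target */
#endif
}
```

Compiling the same file with and without a flag that extends the baseline is one way to see whether the two builds would disagree about what is inlined.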


Yes.  Other examples here are power7/power8 and armv6/armv7.

In both cases, the architecture added double-word load(-locked) and store(-conditional) instructions. In order for us to use these new instructions inline, libatomic must be updated to use them as well.

The general principle, in my opinion, is that extensions to the ISA should require that libatomic either be rebuilt or perform runtime detection in order to select the internal algorithm used.
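A minimal sketch of the runtime-detection option, using x86 and cmpxchg16b purely as an illustration (the same shape would apply to the power8 or armv7 double-word instructions, with a different detection mechanism). The enum and function names are hypothetical and are not libatomic's actual internals; __get_cpuid and bit_CMPXCHG16B come from the compiler's <cpuid.h>.

```c
#include <stdbool.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical names for the internal algorithms a library might choose
   between for 16-byte objects. */
typedef enum { ALGO_LOCKED, ALGO_CMPXCHG16B } atomic16_algo;

/* Ask the CPU whether cmpxchg16b is available (CPUID leaf 1, ECX). */
static bool cpu_has_cmpxchg16b(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;               /* leaf 1 not supported */
    return (ecx & bit_CMPXCHG16B) != 0;
#else
    return false;                   /* not an x86 target */
#endif
}

/* Select the algorithm once; a real library would cache the result. */
static atomic16_algo select_atomic16_algo(void)
{
    return cpu_has_cmpxchg16b() ? ALGO_CMPXCHG16B : ALGO_LOCKED;
}
```

The point of the selection is that the library and any inlined code must agree: if the compiler inlines the new instruction, a library that still takes a lock for the same object would not be atomic with respect to it.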

In the case of arm, distributions normally either (1) build for a specific cpu revision, (2) build for old-arm + soft-fpu, or (3) build for armv7 + hard-fpu. So most distributions would not actually require a runtime check for arm.

In the case of power, I assume it's possible to run ppc64 on power8, but every power8 system to which I have access has ppc64le deployed. Certainly ppc64le would not need a runtime check, but it would seem prudent for ppc64 to gain a runtime check for the power8 insns.

I ran a few tests on my x86_64 machine a few weeks ago, and I didn't
see cmpxchg16b being used.  IIRC, I also looked at libatomic and didn't
see it (but I don't remember for sure).  Either way, if I should have
been wrong, and we are using cmpxchg16b for loads, this should be fixed.
Ideally, this should be fixed before the stage 3 deadline this Friday.
Such a fix might potentially break existing uses, but the earlier we fix
this, the better.

You needed to use -mcx16, or any other option (such as -march=native on a host that has cmpxchg16b) that implies that. And you will find that expand_atomic_load does have a larger-than-word-size fallback path that does use expand_atomic_compare_and_swap.


So, yes, there's something here that needs adjustment.
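To make the concern concrete, here is a sketch of what a load built on compare-and-swap looks like at the source level. It uses a 64-bit object so that it runs without -mcx16; the function name is hypothetical and this is not the code expand_atomic_load generates. One commonly cited problem with this shape (an assumption here, since the quoted text only says "see above") is that the operation needs write access to the object even though the caller only asked to read it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read *p atomically using only compare-and-swap.
   Note that p cannot be a pointer to const: the CAS may store. */
static uint64_t load_via_cas(uint64_t *p)
{
    assert(p != NULL);
    uint64_t expected = 0;
    /* If *p == 0 the CAS succeeds and stores 0 back (value unchanged).
       Otherwise it fails and copies the current value into expected.
       Either way, expected ends up holding the value of *p. */
    __atomic_compare_exchange_n(p, &expected, 0, false,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}
```

The value observed is correct in both branches, which is why the fallback works as a load; the cost is that it behaves like a store as far as memory protection and cache-line ownership are concerned.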

Section 3 Rationale, alternative 1: I'm wondering if the example is
correct.  For a 4-byte-aligned type of size 3, the implementation cannot
simply use 4-byte hardware-backed atomics because this will inevitably
touch the 4th byte I think, and the implementation can't know whether
this is padding or not.  Or do we expect that things like packed structs
are disallowed?


If we atomically store an unchanged value into the 4th byte, can we tell?
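The question can be made concrete with a sketch of a 3-byte store implemented on a 4-byte container. The type and function names are hypothetical, and this is only one possible way to do it: the 4th byte is read and written back with the value just observed, and the compare-and-swap retries if anything in the word changed in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 3-byte payload living in the first three bytes (in memory
   order) of a 4-byte-aligned word. */
typedef struct { unsigned char b[3]; } three_bytes;

static void store3_via_word(uint32_t *container, three_bytes val)
{
    assert(container != NULL);
    uint32_t old_word = __atomic_load_n(container, __ATOMIC_RELAXED);
    uint32_t new_word;
    do {
        new_word = old_word;
        /* Overwrite the first three bytes; the 4th keeps the value that
           was just read, whatever it belongs to. */
        memcpy(&new_word, val.b, sizeof val.b);
        /* On failure old_word is refreshed with the current contents. */
    } while (!__atomic_compare_exchange_n(container, &old_word, new_word,
                                          true, __ATOMIC_SEQ_CST,
                                          __ATOMIC_RELAXED));
}
```

An observer that accesses the 4th byte atomically never sees it change. What this does not cover is a concurrent non-atomic write to that byte by other code, or a 4th byte that sits in memory the implementation may not write to.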

N3.1:  Why do you assume that 8-byte HW atomics are available on i386?
Because cmpxchg8b is available for CPUs that are the lowest i?86 we
still intend to support?

For various definitions of "we", I suppose. Red Hat certainly does not support anything lower than i686, which does have cmpxchg8b.

I suspect that the GNU project still supports i486. I do know that glibc has dropped support for i386.

I should note that supporting 64-bit atomics on i686 *is* possible, without the CAS problem that you describe for cmpxchg16b, because we *are* guaranteed that the FPU supports a 64-bit atomic load/store. And we do already handle this; see the atomic_loaddi_fpu and atomic_storedi_fpu patterns.

I'll also note that, as per above, this implies that if we build for i586-*, libatomic should provide runtime paths that detect and use i686 insns, so that the library is compatible with what the compiler will generate inline given appropriate command-line options.

r~
