On Mon, Feb 26, 2018 at 10:45 PM, Ruslan Nikolaev via gcc
<gcc@gcc.gnu.org> wrote:
> Thanks, everyone, for the input; it is very useful. I am just proposing to
> consider the change unless there are clear roadblocks. (Either design choice 
> is probably OK with respect to the standard formally speaking, but there are 
> some clear advantages also.) I wrote a summary of pros & cons (which, of 
> course, is slightly biased towards the change :) )
> I also opened Bug 84563 with the rationale.
>
>
> Pros of the proposed approach:
> 1. Ability to use guaranteed lock-free double-width atomics (when mcx16 is 
> specified for x86-64, and always for arm64) in a more or less portable manner
> across different supported architectures (without resorting to non-standard 
> extensions or writing separate assembly code for each architecture). 
> Hopefully, the behavior may also be made more or less consistent across 
> different compilers over time. It is already the case for clang/llvm. As 
> mentioned, double-width lock-free atomics have real practical use (ABA tags 
> for pointers).
>
> 2. More likely to find a bug immediately if a programmer tries to do
> something that is not guaranteed by the standard (e.g., getting a segfault on
> read-only memory when using double-width atomic_load). This is true even if
> mcx16 is not used, as most CPUs have cmpxchg16b, and libatomic will use it. On
> the other hand, atomic_load implemented through locks may have hard-to-find
> and hard-to-debug issues in signal handlers, interrupt contexts, etc., when a
> programmer erroneously assumes that atomic_load is non-blocking.
>
> 3. For arm64 the corresponding instructions are always available, so there is
> no need for the mcx16 flag or redirection to libatomic at all (libatomic may
> still keep the old implementation for backward compatibility).

That is going to create an ABI break on AArch64. Think about binaries
produced by old releases of GCC, which use locks in libatomic, and those
produced by a new GCC. The way to fix this on AArch64, if there is a
guarantee from the standard that there are no problems with read-only
locations, is to implement the change in libatomic. You cannot have the
same region of memory protected by locks in older binaries and accessed
with the appropriate load / store instructions in new binaries.

Ramana
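
For readers unfamiliar with the ABA-tag use case mentioned in item 1 of the
quoted list, here is a minimal sketch. It is deliberately scaled down: a
32-bit index and a 32-bit version tag are packed into one 64-bit word, so a
single ordinary compare-and-swap covers both and no libatomic call is
needed. With real 64-bit pointers the pair grows to 16 bytes, which is
exactly the double-width case under discussion. The names tagged_ref, pack,
unpack and tagged_cas are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tagged reference: a slot index plus a version tag. */
typedef struct {
    uint32_t index;
    uint32_t tag;
} tagged_ref;

/* Pack the pair into one 64-bit word (tag in the high half). */
static inline uint64_t pack(tagged_ref r)
{
    return ((uint64_t)r.tag << 32) | r.index;
}

/* Inverse of pack(). */
static inline tagged_ref unpack(uint64_t word)
{
    tagged_ref r = { (uint32_t)word, (uint32_t)(word >> 32) };
    return r;
}

/* Install new_index only if the whole (index, tag) pair still equals
   `expected`. The tag is bumped on every successful update, so a stale
   reader fails even if the index has cycled back to the value it saw. */
static inline bool tagged_cas(_Atomic uint64_t *slot, tagged_ref expected,
                              uint32_t new_index)
{
    assert(slot != NULL);
    uint64_t old = pack(expected);
    tagged_ref next = { new_index, expected.tag + 1u };
    return atomic_compare_exchange_strong(slot, &old, pack(next));
}
```

A compare-and-swap on the index alone would accept the stale value after an
A -> B -> A sequence; comparing the tag as well is what rejects it.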


> 4. Faster and easier-to-analyze code when mcx16 is specified.
>
> 5. Ability to tell for sure if the implementation is lock-free by checking 
> corresponding C11 flag when mcx16 is specified. When unspecified, the flag 
> will be false to accommodate the worst-case scenario.
>
> 6. Consistent behavior everywhere on all platforms regardless of IFUNC,
> mcx16 flag, etc. If cmpxchg16b is available, it is always used (platforms
> that do not support IFUNC will use function pointers for redirection). The
> only thing the mcx16 flag changes is removing indirection to libatomic and
> giving a guaranteed lock_free flag for corresponding types. (BTW, in practice,
> if you use the flag, you should know what you are doing already.)
>
> 7. Ability to finally deprecate the old __sync builtins and use the newer,
> more advanced __atomic builtins everywhere.
>
>
> Cons of the proposed approach:
>
> 1. Compiler may place const atomic objects in .rodata. (Avoided by making
> sure _Atomic objects with size > 8 bytes are not placed in .rodata, and by
> clarifying that casting random .rodata objects for double-width atomics is
> undefined and is not allowed.)
>
> 2. Backward compatibility concerns if used outside glibc/IFUNC. Most likely,
> even in this case, not an issue since all calls there are already redirected
> to libatomic anyway, and statically-linked binaries will not interact with
> new binaries directly.
>
> 3. Read-only memory for atomic_load will not be supported for double-width
> types. But that is actually better than sweeping the problem under the carpet
> (current behavior is actually even worse because it is inconsistent across
> different platforms, i.e. different for x86-64 on Linux and arm64). Anyway,
> it is better to use a lock-based approach explicitly if for whatever reason
> it is preferable (read-only memory, performance (?), etc).
> -- Ruslan
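
As an illustration of item 5 in the pros list, the following sketch asks the
compiler at compile time whether a double-width type is guaranteed
lock-free. It relies on __atomic_always_lock_free, a GCC/Clang builtin that
is not part of C11. C11 itself provides ATOMIC_*_LOCK_FREE macros only for
the standard integer and pointer types, plus the run-time query
atomic_is_lock_free(). The type dw_pair and both helper functions are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical double-width payload: a pointer plus an ABA tag. */
typedef struct {
    void *ptr;
    uintptr_t tag;
} dw_pair;

static_assert(sizeof(dw_pair) == 2 * sizeof(void *),
              "dw_pair is expected to be exactly two pointers wide");

/* Folds to a compile-time constant: true only if objects of this size
   are lock-free on every CPU the translation unit is compiled for.
   The answer depends on the compiler, its version and the flags used. */
static inline bool dw_always_lock_free(void)
{
    return __atomic_always_lock_free(sizeof(dw_pair), 0);
}

/* Standard C11 counterpart for single-width pointers: the macro is 2
   when atomic pointer operations are always lock-free. */
static inline bool ptr_always_lock_free(void)
{
    return ATOMIC_POINTER_LOCK_FREE == 2;
}
```

Code that needs a non-blocking guarantee (signal handlers, for instance)
could test dw_always_lock_free() and select a different algorithm when it
returns false, rather than silently falling back to a lock.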
