Hi,
I have a revised version of the libatomic ABI draft which tries to
accommodate Richard's comments. The new version is attached. The diff is
also appended.
Thanks,
- Bin
diff ABI.txt ABI-1.1.txt
28a29,30
> - The versioning of the library external symbols
>
47a50,57
> Note
>
> Some 64-bit x86 ISAs do not support the cmpxchg16b instruction, for
> example, some early AMD64 processors and the later Intel Xeon Phi co-
> processor. Whether cmpxchg16b is supported may affect the ABI
> specification for certain atomic types. We discuss the details
> where they have an impact.
>
101c111,112
< _Atomic __int128                  16      16      N      not applicable
---
> _Atomic __int128 (with at16)      16      16      Y      not applicable
> _Atomic __int128 (w/o at16)       16      16      N      not applicable
105c116,117
< _Atomic long double               16      16      N      12      4      N
---
> _Atomic long double (with at16)   16      16      Y      12      4      N
> _Atomic long double (w/o at16)    16      16      N      12      4      N
106a119,120
> _Atomic double _Complex           16      16(8)   Y      16      16(8)  N
>   (with at16)
107a122
> (w/o at16)
110a126,127
> _Atomic long double _Imaginary    16      16      Y      12      4      N
>   (with at16)
111a129
> (w/o at16)
146a165,167
> "with at16" means the ISA supports the cmpxchg16b instruction;
> "w/o at16" means it does not.
>
191a213,214
> _Atomic struct {char a[16];}      16      16(1)   Y      16      16(1)  N
>   (with at16)
192a216
> (w/o at16)
208a233,235
> "with at16" means the ISA supports the cmpxchg16b instruction;
> "w/o at16" means it does not.
>
246a274,276
> On 64-bit x86 platforms that support the cmpxchg16b instruction,
> 16-byte atomic types whose alignment matches their size are inlineable.
>
303,306c333,338
< CMPXCHG16B is not always available on 64-bit x86 platforms, so 16-byte
< naturally aligned atomics are not inlineable. The support functions for
< such atomics are free to use lock-free implementation if the instruction
< is available on specific platforms.
---
> "Inlineability" is a compile-time property, which in most cases depends
> only on the type. In a few cases it also depends on whether the target
> ISA supports the cmpxchg16b instruction. A compiler may obtain the ISA
> information either from compilation flags or by querying the hardware
> capabilities. When the hardware capability information is not available,
> the compiler should assume the cmpxchg16b instruction is not supported.
665a698,705
> The function takes the size of an object and an address, which
> is one of the following three cases:
> - the address of the object
> - a faked address that solely indicates the alignment of the
>   object's address
> - NULL, which means that the alignment of the object matches the size
> It returns whether the object is lock-free.
>
711c751
< 5. Libatomic Assumption on Non-blocking Memory Instructions
---
> 5. Libatomic symbol versioning
712a753,868
> Here is the mapfile for symbol versioning of the libatomic library
> specified by this ABI specification:
>
> LIBATOMIC_1.0 {
> global:
> __atomic_load;
> __atomic_store;
> __atomic_exchange;
> __atomic_compare_exchange;
> __atomic_is_lock_free;
>
> __atomic_add_fetch_1;
> __atomic_add_fetch_2;
> __atomic_add_fetch_4;
> __atomic_add_fetch_8;
> __atomic_add_fetch_16;
> __atomic_and_fetch_1;
> __atomic_and_fetch_2;
> __atomic_and_fetch_4;
> __atomic_and_fetch_8;
> __atomic_and_fetch_16;
> __atomic_compare_exchange_1;
> __atomic_compare_exchange_2;
> __atomic_compare_exchange_4;
> __atomic_compare_exchange_8;
> __atomic_compare_exchange_16;
> __atomic_exchange_1;
> __atomic_exchange_2;
> __atomic_exchange_4;
> __atomic_exchange_8;
> __atomic_exchange_16;
> __atomic_fetch_add_1;
> __atomic_fetch_add_2;
> __atomic_fetch_add_4;
> __atomic_fetch_add_8;
> __atomic_fetch_add_16;
> __atomic_fetch_and_1;
> __atomic_fetch_and_2;
> __atomic_fetch_and_4;
> __atomic_fetch_and_8;
> __atomic_fetch_and_16;
> __atomic_fetch_nand_1;
> __atomic_fetch_nand_2;
> __atomic_fetch_nand_4;
> __atomic_fetch_nand_8;
> __atomic_fetch_nand_16;
> __atomic_fetch_or_1;
> __atomic_fetch_or_2;
> __atomic_fetch_or_4;
> __atomic_fetch_or_8;
> __atomic_fetch_or_16;
> __atomic_fetch_sub_1;
> __atomic_fetch_sub_2;
> __atomic_fetch_sub_4;
> __atomic_fetch_sub_8;
> __atomic_fetch_sub_16;
> __atomic_fetch_xor_1;
> __atomic_fetch_xor_2;
> __atomic_fetch_xor_4;
> __atomic_fetch_xor_8;
> __atomic_fetch_xor_16;
> __atomic_load_1;
> __atomic_load_2;
> __atomic_load_4;
> __atomic_load_8;
> __atomic_load_16;
> __atomic_nand_fetch_1;
> __atomic_nand_fetch_2;
> __atomic_nand_fetch_4;
> __atomic_nand_fetch_8;
> __atomic_nand_fetch_16;
> __atomic_or_fetch_1;
> __atomic_or_fetch_2;
> __atomic_or_fetch_4;
> __atomic_or_fetch_8;
> __atomic_or_fetch_16;
> __atomic_store_1;
> __atomic_store_2;
> __atomic_store_4;
> __atomic_store_8;
> __atomic_store_16;
> __atomic_sub_fetch_1;
> __atomic_sub_fetch_2;
> __atomic_sub_fetch_4;
> __atomic_sub_fetch_8;
> __atomic_sub_fetch_16;
> __atomic_test_and_set_1;
> __atomic_test_and_set_2;
> __atomic_test_and_set_4;
> __atomic_test_and_set_8;
> __atomic_test_and_set_16;
> __atomic_xor_fetch_1;
> __atomic_xor_fetch_2;
> __atomic_xor_fetch_4;
> __atomic_xor_fetch_8;
> __atomic_xor_fetch_16;
>
> local:
> *;
> };
> LIBATOMIC_1.1 {
> global:
> __atomic_feraiseexcept;
> } LIBATOMIC_1.0;
> LIBATOMIC_1.2 {
> global:
> atomic_thread_fence;
> atomic_signal_fence;
> atomic_flag_test_and_set;
> atomic_flag_test_and_set_explicit;
> atomic_flag_clear;
> atomic_flag_clear_explicit;
> } LIBATOMIC_1.1;
>
> 6. Libatomic Assumption on Non-blocking Memory Instructions
>
752,753c908,910
< So such compiler change must be accompanied by a library change, and
< the ABI must be updated as well.
---
> In such a case, the libatomic library and the compiler should be
> upgraded in lock-step, and the inlineable property for certain atomic
> types will change from false to true.
On 7/6/2016 12:41 PM, Richard Henderson wrote:
CMPXCHG16B is not always available on 64-bit x86 platforms, so 16-byte
naturally aligned atomics are not inlineable. The support functions for
such atomics are free to use lock-free implementation if the instruction
is available on specific platforms.
Except that it is available on almost all 64-bit x86 platforms. As far
as I know, only 2004 era AMD processors didn't have it; all Intel
64-bit cpus have supported it.
Further, gcc will most certainly make use of it when one specifies any
command-line option that enables it, such as -march=native.
Therefore we must specify that for x86_64, 16-byte objects are
non-locking on cpus that support cmpxchg16b.
However, if a compiler inlines an atomic operation on an _Atomic long
double object and uses the new lock-free instructions, it could break
the compatibility if the library implementation is still non-lock-free.
So such compiler change must be accompanied by a library change, and
the ABI must be updated as well.
The tie between gcc version and libgcc.so version is tight; I see no
reason that the libatomic.so version should not also be tight with the
compiler version.
It is sufficient that libatomic use atomic instructions when they are
available. If a new processor comes out with new capabilities, the
compiler and runtime are upgraded in lock-step.
How that is selected is beyond the ABI but possible solutions are
(1) ld.so search path, based on processor capabilities,
(2) ifunc (or workalike) where the function is selected at startup,
(3) explicit runtime test within the relevant functions.
All solutions expose the same function interface so the function call
ABI is not affected.
_Bool __atomic_is_lock_free (size_t size, void *object);
Returns whether the object pointed to by object is lock-free.
The function assumes that the size of the object is size. If object
is NULL then the function assumes that object is aligned on an
size-byte address.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65033
The actual code change is completely within libstdc++, but it affects
the description of the libatomic function.
C++ requires that is_lock_free return the same result for all objects
of a given type. Whereas __atomic_is_lock_free, with a non-null
object, determines
if we will implement lock free for a *specific* object, using the
specific object's alignment.
Rather than break the ABI and add a different function that passes the
type alignment, the solution we hit upon was to pass a "fake",
minimally aligned pointer as the object parameter: (void
*)(uintptr_t)-__alignof(type).
The final component of the ABI that you've forgotten to specify, if
you want full compatibility of linked binaries, is symbol versioning.
We have had two ABI additions since the original release. See
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libatomic/libatomic.map;h=39e7c2c6b9a70121b5f4031da346a27ae6c1be98;hb=HEAD
r~
1. Overview
1.1. Why we need an ABI for atomics
The C11 standard allows atomic types to differ in size, representation
and alignment from the corresponding non-atomic types [1].
The size, representation and alignment of atomic types need to be
specified in the ABI specification.
A runtime support library, libatomic, already exists on Solaris
and Linux. The interface of this library needs to be standardized
as part of the ABI specification, so that
- On a system that supplies libatomic, all compilers in compliance
with the ABI can generate compatible binaries linking this library.
- Binaries remain backward compatible across different versions of
the system, as long as those versions support the same ABI.
1.2. What does the atomics ABI specify
The ABI specifies the following
- Data representation of the atomic types.
- The names and behaviors of the implementation-specific support
functions.
- The versioning of the library external symbols
- The atomic types for which the compiler may generate inlined code.
- Lock-free property of the inlined atomic operations.
Note that the names and behaviors of the libatomic functions specified
in the C standard need not be part of this ABI, because they are
already required to meet the specification in the standard.
1.3. Affected platforms
The following platforms are affected by this ABI specification.
SPARC (32-bit and 64-bit)
x86 (32-bit and 64-bit)
Sections 1.1 and 1.2, and the Rationale, Notes and Appendix sections
in the rest of the document, are for explanatory purposes only; they
are not considered part of the formal ABI specification.
Note
Some 64-bit x86 ISAs do not support the cmpxchg16b instruction, for
example, some early AMD64 processors and the later Intel Xeon Phi co-
processor. Whether cmpxchg16b is supported may affect the ABI
specification for certain atomic types. We discuss the details
where they have an impact.
2. Data Representation
2.1. General Rules
The general rules for the size, representation and alignment of
atomic types are the following:
1) Atomic types have the same size as the corresponding non-atomic
types.
2) Atomic types have the same representation as the corresponding
non-atomic types.
3) Atomic types have the same alignment as the corresponding
non-atomic types, with the following exceptions:
On 32- and 64-bit x86 platforms and on 64-bit SPARC platforms,
atomic types of size 1, 2, 4, 8 or 16 bytes have an alignment
that matches their size.
On 32-bit SPARC platforms, atomic types of size 1, 2, 4 or 8 bytes
have an alignment that matches their size. If the alignment of a
16-byte non-atomic type is less than 8 bytes, the alignment of the
corresponding atomic type is increased to 8 bytes.
Note
The above rules apply to both scalar types and aggregate types.
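As an illustrative, non-normative check of rules 1) and 3), the sizes and alignments can be inspected with sizeof and _Alignof. This sketch assumes a compiler implementing this ABI on one of the covered platforms; the struct name and helper functions are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

struct s2 { char a[2]; };   /* non-atomic alignment is 1 */

/* Rule 1: the atomic type has the same size as the non-atomic one. */
bool same_size(void)
{
    return sizeof(_Atomic struct s2) == sizeof(struct s2)
        && sizeof(_Atomic int) == sizeof(int);
}

/* Rule 3: on the covered platforms a 2-byte atomic type is aligned
   to its size, even though the plain struct only needs alignment 1. */
size_t atomic_align(void) { return _Alignof(_Atomic struct s2); }
size_t plain_align(void)  { return _Alignof(struct s2); }
```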
2.2. Atomic scalar types
x86

                                      LP64 (AMD64)                 ILP32 (i386)
C Type                                sizeof Alignment Inlineable  sizeof Alignment Inlineable

atomic_flag                              1       1         Y          1       1         Y
_Atomic _Bool                            1       1         Y          1       1         Y
_Atomic char                             1       1         Y          1       1         Y
_Atomic signed char                      1       1         Y          1       1         Y
_Atomic unsigned char                    1       1         Y          1       1         Y
_Atomic short                            2       2         Y          2       2         Y
_Atomic signed short                     2       2         Y          2       2         Y
_Atomic unsigned short                   2       2         Y          2       2         Y
_Atomic int                              4       4         Y          4       4         Y
_Atomic signed int                       4       4         Y          4       4         Y
_Atomic enum                             4       4         Y          4       4         Y
_Atomic unsigned int                     4       4         Y          4       4         Y
_Atomic long                             8       8         Y          4       4         Y
_Atomic signed long                      8       8         Y          4       4         Y
_Atomic unsigned long                    8       8         Y          4       4         Y
_Atomic long long                        8       8         Y          8       8         Y
_Atomic signed long long                 8       8         Y          8       8         Y
_Atomic unsigned long long               8       8         Y          8       8         Y
_Atomic __int128 (with at16)            16      16         Y          not applicable
_Atomic __int128 (w/o at16)             16      16         N          not applicable
any-type _Atomic *                       8       8         Y          4       4         Y
_Atomic float                            4       4         Y          4       4         Y
_Atomic double                           8       8         Y          8       8         Y
_Atomic long double (with at16)         16      16         Y         12       4         N
_Atomic long double (w/o at16)          16      16         N         12       4         N
_Atomic float _Complex                   8       8(4)      Y          8       8(4)      Y
_Atomic double _Complex (with at16)     16      16(8)      Y         16      16(8)      N
_Atomic double _Complex (w/o at16)      16      16(8)      N         16      16(8)      N
_Atomic long double _Complex            32      16         N         24       4         N
_Atomic float _Imaginary                 4       4         Y          4       4         Y
_Atomic double _Imaginary                8       8         Y          8       8         Y
_Atomic long double _Imaginary          16      16         Y         12       4         N
  (with at16)
_Atomic long double _Imaginary          16      16         N         12       4         N
  (w/o at16)
SPARC

                                      LP64 (v9)                    ILP32 (sparc)
C Type                                sizeof Alignment Inlineable  sizeof Alignment Inlineable

atomic_flag                              1       1         Y          1       1         Y
_Atomic _Bool                            1       1         Y          1       1         Y
_Atomic char                             1       1         Y          1       1         Y
_Atomic signed char                      1       1         Y          1       1         Y
_Atomic unsigned char                    1       1         Y          1       1         Y
_Atomic short                            2       2         Y          2       2         Y
_Atomic signed short                     2       2         Y          2       2         Y
_Atomic unsigned short                   2       2         Y          2       2         Y
_Atomic int                              4       4         Y          4       4         Y
_Atomic signed int                       4       4         Y          4       4         Y
_Atomic enum                             4       4         Y          4       4         Y
_Atomic unsigned int                     4       4         Y          4       4         Y
_Atomic long                             8       8         Y          4       4         Y
_Atomic signed long                      8       8         Y          4       4         Y
_Atomic unsigned long                    8       8         Y          4       4         Y
_Atomic long long                        8       8         Y          8       8         Y
_Atomic signed long long                 8       8         Y          8       8         Y
_Atomic unsigned long long               8       8         Y          8       8         Y
_Atomic __int128                        16      16         N          not applicable
any-type _Atomic *                       8       8         Y          4       4         Y
_Atomic float                            4       4         Y          4       4         Y
_Atomic double                           8       8         Y          8       8         Y
_Atomic long double                     16      16         N         16       8         N
_Atomic float _Complex                   8       8(4)      Y          8       8(4)      Y
_Atomic double _Complex                 16      16(8)      N         16       8         N
_Atomic long double _Complex            32      16         N         32       8         N
_Atomic float _Imaginary                 4       4         Y          4       4         Y
_Atomic double _Imaginary                8       8         Y          8       8         Y
_Atomic long double _Imaginary          16      16         N         16       8         N
"with at16" means the ISA supports the cmpxchg16b instruction;
"w/o at16" means it does not.
Notes:
The C standard also specifies some atomic integer types. They are not
listed in the above table because they have the same representation
and alignment requirements as the corresponding direct types [2].
We will discuss the inlineable column and __int128 type in section 3.
The value in () shows the alignment of the corresponding non-atomic
type, if it is different from the alignment of the atomic type.
Because the _Atomic specifier cannot be used on a function type [7]
and the _Atomic qualifier cannot modify a function type [8], no
atomic function type is listed in the above table.
On 32-bit x86 platforms, long double is 12 bytes in size with 4-byte
alignment. This ABI specification does not increase the alignment of
the _Atomic long double type because it would not be lock-free even
if it were 16-byte aligned, since there is no 12-byte or 16-byte
lock-free instruction on 32-bit x86 platforms.
2.3 Atomic Aggregates and Unions
Atomic structures or unions may have a different alignment from the
corresponding non-atomic types, subject to rule 3) in section 2.1.
The alignment change only affects the boundary at which the entire
structure or union is aligned. The offset of each member, the internal
padding and the size of the structure or union are not affected.
The following table shows selected examples of the size and alignment
of atomic structure types.
x86

                                      LP64 (AMD64)                 ILP32 (i386)
C Type                                sizeof Alignment Inlineable  sizeof Alignment Inlineable

_Atomic struct {char a[2];}              2       2(1)      Y          2       2(1)      Y
_Atomic struct {char a[3];}              3       1         N          3       1         N
_Atomic struct {short a[2];}             4       4(2)      Y          4       4(2)      Y
_Atomic struct {int a[2];}               8       8(4)      Y          8       8(4)      Y
_Atomic struct {char c;
                int i;}                  8       8(4)      Y          8       8(4)      Y
_Atomic struct {char c[2];
                short s;
                int i;}                  8       8(4)      Y          8       8(4)      Y
_Atomic struct {char a[16];}            16      16(1)      Y         16      16(1)      N
  (with at16)
_Atomic struct {char a[16];}            16      16(1)      N         16      16(1)      N
  (w/o at16)
SPARC

                                      LP64 (v9)                    ILP32 (sparc)
C Type                                sizeof Alignment Inlineable  sizeof Alignment Inlineable

_Atomic struct {char a[2];}              2       2(1)      Y          2       2(1)      Y
_Atomic struct {char a[3];}              3       1         N          3       1         N
_Atomic struct {short a[2];}             4       4(2)      Y          4       4(2)      Y
_Atomic struct {int a[2];}               8       8(4)      Y          8       8(4)      Y
_Atomic struct {char c;
                int i;}                  8       8(4)      Y          8       8(4)      Y
_Atomic struct {char c[2];
                short s;
                int i;}                  8       8(4)      Y          8       8(4)      Y
_Atomic struct {char a[16];}            16      16(1)      N         16       8(1)      N
"with at16" means the ISA supports the cmpxchg16b instruction;
"w/o at16" means it does not.
Notes
The value in () shows the alignment of the corresponding non-atomic
type, if it is different from the alignment of the atomic type.
Because the padding of structure types is not affected by the _Atomic
modifier, the contents of any padding in an atomic structure object
are still undefined; therefore, atomic compare-and-exchange operations
on such objects may fail due to differences in the padding.
The increased alignment of 16-byte atomic struct types might be
useful to
- Reduce lock sharing with other atomics.
- Allow more efficient implementations of the runtime support
functions for atomic operations on such types.
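The padding caveat above is why callers often zero entire values before storing or comparing structure-typed atomics. A minimal, non-normative sketch using the C11 <stdatomic.h> generic functions; the struct layout, names and helpers here are hypothetical, and the sketch assumes a platform where this struct is 8 bytes with interior padding:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

struct padded { char c; int i; };      /* padding bytes between c and i */

static _Atomic struct padded obj;

/* Build a value with the padding bytes cleared, so a byte-wise
   compare-and-exchange cannot fail because of padding garbage. */
static struct padded make(char c, int i)
{
    struct padded p;
    memset(&p, 0, sizeof p);
    p.c = c;
    p.i = i;
    return p;
}

/* Store an initial value, then try two exchanges: one with a
   non-matching expected value (must fail) and one with a matching
   expected value (must succeed, since padding was zeroed). */
bool demo(void)
{
    atomic_store(&obj, make('a', 1));
    struct padded wrong = make('x', 9), right = make('a', 1);
    bool miss = atomic_compare_exchange_strong(&obj, &wrong, make('b', 2));
    bool hit  = atomic_compare_exchange_strong(&obj, &right, make('b', 2));
    return !miss && hit;
}
```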
2.4. Bit-fields
Whether atomic bit-field types are permitted is implementation-defined
in the C standard [3]. In this ABI specification, the representation
of atomic bit-fields is unspecified.
3. Lock-free and Inlineable Property
The implementation of atomic operations may map directly to hardware
atomic instructions. Such an implementation is lock-free. Lock-free
atomic operations do not require runtime support functions.
The compiler may generate inlined code for efficiency. This ABI
specification defines a few inlineable atomic types. An atomic type
being inlineable means that the compiler may generate an inlined
instruction sequence for atomic operations on objects of that type.
The implementation of the support functions for the inlineable atomic
types must also be lock-free.
On all affected platforms, atomic types whose size is 1, 2, 4 or 8
bytes and whose alignment matches the size are inlineable.
On 64-bit x86 platforms that support the cmpxchg16b instruction,
16-byte atomic types whose alignment matches their size are inlineable.
If an atomic type is not inlineable, the compiler shall always generate
a support function call for atomic operations on objects of such a type.
The implementation of the support functions for non-inlineable atomic
types may be lock-free.
Rationale
It is assumed that there is no way for an atomic object to be accessed
by both lock-free operations and non-lock-free operations while still
satisfying the atomic semantics.
If the compiler always generates runtime support function calls for
all atomics, the lock-free property would be hidden inside the library
implementation. However, the compiler may inline the atomic operations,
and we want to allow such inlining optimizations.
Compiler inlining raises the issue of mix-and-matched accesses to the
same atomic object from compiler-generated code and from the runtime
library functions. The two must be consistent on the lock-free property.
One possible solution to achieve lock-free consistency is to specify
the lock-free property on a per-type basis. The C and C++ standards
seem to back this approach: the C++ standard provides a query that
returns a per-type result about whether the type is lock-free [4].
The C standard does not guarantee that the query result is per-type
[5], but that is the direction in which it is heading [6]. However,
the query result does not necessarily reflect the implementation of
the atomic operation on the queried type. The implementation may use
lock-free instructions for a specific object that meets certain
criteria. So specifying the lock-free property on a per-type basis
is unnecessarily conservative.
It is possible to specify the lock-free property on a per-object
basis. But it is simpler to disallow the compiler from inlining the
atomic operations for "may be lock-free" types, in order to hide the
lock-free optimization inside the library implementation.
So the ABI achieves lock-free consistency by specifying which types
may be inlined, and by specifying that those types must be lock-free.
Thus for the inlineable atomic types, if there are mix-and-matched
accesses, both must be lock-free; and for the non-inlineable atomic
types, the compiler never inlines, so mix-and-match never happens.
Notes:
Here are a few examples of small types that do not qualify as
inlineable types:
_Atomic struct {char a[3];} /* size = 3, alignment = 1 */
_Atomic long double /* (on 32-bit x86) size = 12, alignment = 4 */
A smart compiler may know that such an object is located at an address
that fits within an 8-byte aligned window, but the ABI-compliant
behavior is to not generate a lock-free inlined code sequence, since
a lazy compiler may generate a runtime support function call that may
not be implemented lock-free.
"Inlineability" is a compile-time property, which in most cases depends
only on the type. In a few cases it also depends on whether the target
ISA supports the cmpxchg16b instruction. A compiler may obtain the ISA
information either from compilation flags or by querying the hardware
capabilities. When the hardware capability information is not available,
the compiler should assume the cmpxchg16b instruction is not supported.
4. libatomic library functions
4.1. Data Definitions
This section contains examples of the system header files that provide
the data definitions needed by the libatomic functions.
<stdatomic.h>
typedef enum
{
memory_order_relaxed = 0,
memory_order_consume = 1,
memory_order_acquire = 2,
memory_order_release = 3,
memory_order_acq_rel = 4,
memory_order_seq_cst = 5
} memory_order;
typedef _Atomic struct
{
unsigned char __flag;
} atomic_flag;
Refer to the C standard for the meaning of each enumeration constant
of the memory_order type.
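As a non-normative illustration, the atomic_flag type defined above supports the classic test-and-set idiom from <stdatomic.h>; the function and variable names here are hypothetical:

```c
#include <stdatomic.h>

static atomic_flag guard = ATOMIC_FLAG_INIT;
static int winners = 0;

/* test-and-set returns the previous value of the flag: only the
   caller that flips it from clear to set (sees false) claims the
   one-time action. */
void claim_once(void)
{
    if (!atomic_flag_test_and_set_explicit(&guard, memory_order_acquire))
        winners++;
}
```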
<fenv.h>
SPARC
#define FE_INEXACT 0x01
#define FE_DIVBYZERO 0x02
#define FE_UNDERFLOW 0x04
#define FE_OVERFLOW 0x08
#define FE_INVALID 0x10
x86
#define FE_INVALID 0x01
#define FE_DIVBYZERO 0x04
#define FE_OVERFLOW 0x08
#define FE_UNDERFLOW 0x10
#define FE_INEXACT 0x20
4.2. Support Functions
The following kinds of atomic operations are supported by the runtime
library: load, store, exchange, compare-and-exchange and arithmetic
read-modify-write operations. For the arithmetic read-modify-write
operations, the following kinds of modification operations are
supported: addition, subtraction, bitwise inclusive or, bitwise
exclusive or, bitwise and, bitwise nand. There are also classic
test-and-set functions.
For each kind of atomic operation, libatomic provides a generic
version that accepts a pointer to any atomic type, and a set of
size-specific functions that accept a pointer to an atomic type of
size 1, 2, 4 or 8 bytes on all platforms, and 16 bytes on 64-bit
platforms.
Note: Section 2.1 mentions the alignment adjustment for atomic types
of sizes 1, 2, 4, 8 and 16 bytes. For load, store, exchange and
compare-and-exchange operations, it is safe to convert a pointer to
any atomic type of those sizes to a pointer to the corresponding
atomic integer type of the same size.
Note: The size-specific versions accept and return data by value; the
generic version uses memory pointers to pass and return the data objects.
Most of the functions listed in this section can be mapped to the generic
functions with the same semantics in the C standard. Refer to the C
standard for the description of the generic functions and how each memory
order works.
The following functions are available on all platforms.
void __atomic_load (size_t size, void *object, void *loaded, memory_order
order);
Atomically load the value pointed to by object. Assign the loaded
value to the memory pointed to by loaded. The size of memory
affected by the load is designated by size.
int8_t __atomic_load_1 (int8_t *object, memory_order order);
int16_t __atomic_load_2 (int16_t *object, memory_order order);
int32_t __atomic_load_4 (int32_t *object, memory_order order);
int64_t __atomic_load_8 (int64_t *object, memory_order order);
Atomically load the value pointed to by object. The loaded value is
returned. The size of memory affected by the load is designated by
the type of the object. If object is not aligned properly according
to the type of object, the behavior is undefined.
Memory is affected according to the value of order. If order is either
memory_order_release or memory_order_acq_rel, the behavior of the
function is undefined.
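The memory-order restrictions above mirror how these entry points are typically used. A non-normative sketch of a release/acquire publish pattern via the equivalent GCC/Clang built-ins (a toolchain assumption; the names below are hypothetical):

```c
#include <stdint.h>

static int32_t ready = 0;
static int64_t payload = 0;

/* Writer: the release store of `ready` orders the payload write
   before it, so a reader that observes ready == 1 sees the payload. */
void publish(int64_t v)
{
    __atomic_store_n(&payload, v, __ATOMIC_RELAXED);
    __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);
}

/* Reader: an acquire load of `ready`; using release or acq_rel on a
   load would be undefined, per the restriction above. */
int64_t consume(void)
{
    if (__atomic_load_n(&ready, __ATOMIC_ACQUIRE))
        return __atomic_load_n(&payload, __ATOMIC_RELAXED);
    return -1;
}
```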
void __atomic_store (size_t size, void *object, void *desired, memory_order
order);
Atomically replace the value pointed to by object with the value
pointed to by desired. The size of memory affected by the store
is designated by size.
void __atomic_store_1 (int8_t *object, int8_t desired, memory_order order);
void __atomic_store_2 (int16_t *object, int16_t desired, memory_order order);
void __atomic_store_4 (int32_t *object, int32_t desired, memory_order order);
void __atomic_store_8 (int64_t *object, int64_t desired, memory_order order);
Atomically replace the value pointed to by object with desired.
The size of memory affected by the store is designated by the
type of the object. If object is not aligned properly according
to the type of object, the behavior is undefined.
Memory is affected according to the value of order. If order is one of
memory_order_acquire, memory_order_consume or memory_order_acq_rel, the
behavior of the function is undefined.
void __atomic_exchange (size_t size, void *object, void *desired, void *loaded,
memory_order order);
Atomically replace the value pointed to by object with the value
pointed to by desired, and assign to the memory pointed to by loaded
the value pointed to by object immediately before the effect. The
size of memory affected by the exchange is designated by size.
int8_t __atomic_exchange_1 (int8_t *object, int8_t desired, memory_order order);
int16_t __atomic_exchange_2 (int16_t *object, int16_t desired, memory_order
order);
int32_t __atomic_exchange_4 (int32_t *object, int32_t desired, memory_order
order);
int64_t __atomic_exchange_8 (int64_t *object, int64_t desired, memory_order
order);
Atomically, replace the value pointed to by object with desired
and return the value pointed to by object immediately before the
effect. The size of memory affected by the exchange is designated
by the type of object. If object is not aligned properly according
to the type of object, the behavior is undefined.
Memory is affected according to the value of order.
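Exchange returning the value held immediately before the effect is exactly what a test-and-set spinlock needs. A minimal, non-normative sketch with the GCC/Clang built-ins (a toolchain assumption; all names are hypothetical):

```c
static int lock_word = 0;
static int shared_counter = 0;

/* Acquire: spin until the previous value was 0, meaning this caller
   performed the 0 -> 1 transition. */
static void spin_lock(void)
{
    while (__atomic_exchange_n(&lock_word, 1, __ATOMIC_ACQUIRE))
        ;
}

static void spin_unlock(void)
{
    __atomic_store_n(&lock_word, 0, __ATOMIC_RELEASE);
}

/* Increment the shared counter under the lock; returns the new value. */
int locked_increment(void)
{
    spin_lock();
    int v = ++shared_counter;
    spin_unlock();
    return v;
}
```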
_Bool __atomic_compare_exchange (size_t size, void *object, void *expected,
void *desired, memory_order success_order, memory_order failure_order);
Atomically, compares the memory pointed to by object for equality with
the memory pointed to by expected, and if true, replaces the memory
pointed to by object with the memory pointed to by desired, and if false,
updates the memory pointed to by expected with the memory pointed to by
object. The result of the comparison is returned. The size of memory
affected by the compare and exchange is designated by size.
The compare and exchange never fails spuriously, i.e. if the comparison
for equality returns false, the two values in the comparison were not
equal. [Note: this is to specify that on SPARC and x86, compare and
exchange is always implemented with "strong" semantics. The weak
flavors in the C standard are translated to strong.]
_Bool __atomic_compare_exchange_1 (int8_t *object, int8_t *expected, int8_t
desired, memory_order success_order, memory_order failure_order);
_Bool __atomic_compare_exchange_2 (int16_t *object, int16_t *expected, int16_t
desired, memory_order success_order, memory_order failure_order);
_Bool __atomic_compare_exchange_4 (int32_t *object, int32_t *expected, int32_t
desired, memory_order success_order, memory_order failure_order);
_Bool __atomic_compare_exchange_8 (int64_t *object, int64_t *expected, int64_t
desired, memory_order success_order, memory_order failure_order);
Atomically, compares the memory pointed to by object for equality with
the memory pointed to by expected, and if true, replaces the memory
pointed to by object with desired, and if false, updates the memory
pointed to by expected with the memory pointed to by object. The
result of the comparison is returned.
The size of memory affected by the compare and exchange is designated
by the type of object. If object is not aligned properly according
to the type of object, the behavior is undefined.
The compare and exchange never fails spuriously, i.e. if the comparison
for equality returns false, the two values in the comparison were not
equal.
If the comparison is true, memory is affected according to the
value of success_order, and if the comparison is false, memory is
affected according to the value of failure_order.
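Because a failed compare and exchange writes the current value back through expected, retry loops need no separate reload. A non-normative sketch of a bounded counter built on the equivalent GCC/Clang built-in (a toolchain assumption; the names are hypothetical):

```c
#include <stdint.h>

static int32_t value = 0;

/* Add 1 unless the counter already reached `limit`; returns the
   value observed immediately before the attempt. */
int32_t bounded_inc(int32_t limit)
{
    int32_t old = __atomic_load_n(&value, __ATOMIC_RELAXED);
    while (old < limit &&
           !__atomic_compare_exchange_n(&value, &old, old + 1,
                                        0 /* strong, as this ABI specifies */,
                                        __ATOMIC_ACQ_REL, __ATOMIC_RELAXED))
        ;  /* on failure, old now holds the current value; retry */
    return old;
}
```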
int8_t __atomic_add_fetch_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_add_fetch_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_add_fetch_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_add_fetch_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
the value pointed to by object plus operand and returns the value
pointed to by object immediately after the effects. If object is
not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
int8_t __atomic_fetch_add_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_fetch_add_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_fetch_add_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_fetch_add_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
the value pointed to by object plus operand and returns the value
pointed to by object immediately before the effects. If object is
not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
Memory is affected according to the value of order.
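The only difference between the add_fetch and fetch_add families is which value is returned: after versus before the addition. A non-normative sketch via the GCC/Clang built-ins (a toolchain assumption; the names are hypothetical):

```c
#include <stdint.h>

static int32_t counter = 0;

/* fetch_add returns the value immediately *before* the addition. */
int32_t add5_returning_old(void)
{
    return __atomic_fetch_add(&counter, 5, __ATOMIC_SEQ_CST);
}

/* add_fetch returns the value immediately *after* the addition. */
int32_t add5_returning_new(void)
{
    return __atomic_add_fetch(&counter, 5, __ATOMIC_SEQ_CST);
}
```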
int8_t __atomic_sub_fetch_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_sub_fetch_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_sub_fetch_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_sub_fetch_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
the value pointed to by object minus operand and returns the value
pointed to by object immediately after the effects. If object is not
aligned properly according to the type of object, the behavior is
undefined. The size of memory affected by the effects is designated
by the type of object.
int8_t __atomic_fetch_sub_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_fetch_sub_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_fetch_sub_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_fetch_sub_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
the value pointed to by object minus operand and returns the value
pointed to by object immediately before the effects. If object is
not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is
designated by the type of object.
Memory is affected according to the value of order.
int8_t __atomic_and_fetch_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_and_fetch_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_and_fetch_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_and_fetch_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically, replaces the value pointed to by object with the result of
bitwise and of the value pointed to by object and operand and returns
the value pointed to by object immediately after the effects. If object
is not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
int8_t __atomic_fetch_and_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_fetch_and_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_fetch_and_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_fetch_and_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
bitwise and of the value pointed to by object and operand and returns
the value pointed to by object immediately before the effects. If object
is not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
Memory is affected according to the value of order.
int8_t __atomic_or_fetch_1 (int8_t *object, int8_t operand, memory_order order);
int16_t __atomic_or_fetch_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_or_fetch_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_or_fetch_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
bitwise or of the value pointed to by object and operand and returns
the value pointed to by object immediately after the effects. If object
is not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
int8_t __atomic_fetch_or_1 (int8_t *object, int8_t operand, memory_order order);
int16_t __atomic_fetch_or_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_fetch_or_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_fetch_or_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
bitwise or of the value pointed to by object and operand and returns
the value pointed to by object immediately before the effects. If object
is not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
Memory is affected according to the value of order.
int8_t __atomic_xor_fetch_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_xor_fetch_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_xor_fetch_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_xor_fetch_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
bitwise xor of the value pointed to by object and operand and returns
the value pointed to by object immediately after the effects. If object
is not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
int8_t __atomic_fetch_xor_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_fetch_xor_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_fetch_xor_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_fetch_xor_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
bitwise xor of the value pointed to by object and operand and returns
the value pointed to by object immediately before the effects. If object
is not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
Memory is affected according to the value of order.
int8_t __atomic_nand_fetch_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_nand_fetch_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_nand_fetch_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_nand_fetch_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
bitwise nand of the value pointed to by object and operand and returns
the value pointed to by object immediately after the effects. If object
is not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
The bitwise operator nand is defined as follows using ANSI C
operators: a nand b is equivalent to ~(a & b).
int8_t __atomic_fetch_nand_1 (int8_t *object, int8_t operand, memory_order
order);
int16_t __atomic_fetch_nand_2 (int16_t *object, int16_t operand, memory_order
order);
int32_t __atomic_fetch_nand_4 (int32_t *object, int32_t operand, memory_order
order);
int64_t __atomic_fetch_nand_8 (int64_t *object, int64_t operand, memory_order
order);
Atomically replaces the value pointed to by object with the result of
bitwise nand of the value pointed to by object and operand and returns
the value pointed to by object immediately before the effects. If object
is not aligned properly according to the type of object, the behavior
is undefined. The size of memory affected by the effects is designated
by the type of object.
The bitwise operator nand is defined as follows using ANSI C
operators: a nand b is equivalent to ~(a & b).
Memory is affected according to the value of order.
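Because the nand definition is the least obvious of the bitwise operations, here is a small sketch checking it against the ~(a & b) definition, again using the GCC-specific __atomic builtins as a stand-in for the support functions:

```c
#include <assert.h>
#include <stdint.h>

uint8_t nand_demo(void)
{
    uint8_t v = 0xF0;

    /* fetch_nand returns the old value and stores ~(old & operand). */
    uint8_t before = __atomic_fetch_nand(&v, 0x3C, __ATOMIC_SEQ_CST);
    assert(before == 0xF0);

    /* 0xF0 & 0x3C == 0x30, so v is now (uint8_t)~0x30 == 0xCF. */
    return v;
}
```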
_Bool __atomic_test_and_set_1 (int8_t *object, memory_order order);
_Bool __atomic_test_and_set_2 (int16_t *object, memory_order order);
_Bool __atomic_test_and_set_4 (int32_t *object, memory_order order);
_Bool __atomic_test_and_set_8 (int64_t *object, memory_order order);
Atomically checks the value pointed to by object: if it is in the
clear state, sets it to the set state and returns false; if it is
already in the set state, leaves it unchanged and returns true. In
other words, the return value reflects the state of the object
immediately before the effects.
The size of memory affected by the effects is always one byte.
Memory is affected according to the value of order.
The set and clear states are the same as specified for
atomic_flag_test_and_set.
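A minimal sketch of the test-and-set behavior, using the GCC __atomic_test_and_set builtin (which operates on a single byte, matching the one-byte effect described above; the builtin returns the previous state of the flag):

```c
#include <assert.h>
#include <stdbool.h>

bool tas_demo(void)
{
    char flag = 0;   /* clear state */

    /* First call: flag was clear, so it becomes set and false
       (the prior state) is returned. */
    bool first = __atomic_test_and_set(&flag, __ATOMIC_SEQ_CST);

    /* Second call: flag is already set, so true is returned. */
    bool second = __atomic_test_and_set(&flag, __ATOMIC_SEQ_CST);

    return !first && second;
}
```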
_Bool __atomic_is_lock_free (size_t size, void *object);
Returns whether an object of size size at the address object is
lock-free.
The function takes the size of an object and an address, where the
address is one of the following three cases:
- the actual address of the object,
- a faked address that solely indicates the alignment of the
object's address, or
- NULL, which means that the alignment of the object matches size,
i.e. the function assumes the object is aligned on a size-byte
address.
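The two main calling conventions can be sketched with the GCC __atomic_is_lock_free builtin, which has the same (size, address) signature:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

bool query_demo(void)
{
    long x = 0;

    /* NULL address: asks about any object of this size that is
       aligned on a size-byte boundary. */
    bool aligned_any = __atomic_is_lock_free(sizeof x, NULL);

    /* Real address: the answer may take the actual alignment of
       this particular object into account. */
    bool this_object = __atomic_is_lock_free(sizeof x, &x);

    /* On mainstream 64-bit targets both queries are expected to be
       true for a naturally aligned 8-byte integer. */
    return aligned_any && this_object;
}
```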
void __atomic_feraiseexcept (int exception);
Raises the floating-point exception(s) specified by exception.
The int argument exception represents a subset of the
floating-point exceptions, and can be zero or the bitwise
OR of one or more of the floating-point exception macros
defined in fenv.h (see section 4.1).
4.3. 64-bit Specific Interfaces
4.3.1. Data Representation of __int128 type
On x86 platforms, the __int128 type is defined in the 64-bit ABI.
On SPARC platforms, the size and alignment of the __int128 type are
specified as follows:
sizeof Alignment
__int128 16 16
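The representation in the table above can be checked directly; on the x86-64 psABI, sizeof and _Alignof of __int128 are both 16 as well:

```c
#include <assert.h>
#include <stdalign.h>

int int128_repr(void)
{
    /* Both the size and the alignment are 16 bytes on the
       64-bit targets this section covers. */
    return sizeof(__int128) == 16 && alignof(__int128) == 16;
}
```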
4.3.2. Support Functions
The following functions are available only on 64-bit platforms.
__int128 __atomic_load_16 (__int128 *object, memory_order order);
void __atomic_store_16 (__int128 *object, __int128 desired, memory_order order);
__int128 __atomic_exchange_16 (__int128 * object, __int128 desired,
memory_order order);
_Bool __atomic_compare_exchange_16 (__int128 *object, __int128 *expected,
__int128 desired, memory_order success_order, memory_order failure_order);
__int128 __atomic_add_fetch_16 (__int128 *object, __int128 operand,
memory_order order);
__int128 __atomic_fetch_add_16 (__int128 *object, __int128 operand,
memory_order order);
__int128 __atomic_sub_fetch_16 (__int128 *object, __int128 operand,
memory_order order);
__int128 __atomic_fetch_sub_16 (__int128 *object, __int128 operand,
memory_order order);
__int128 __atomic_and_fetch_16 (__int128 *object, __int128 operand,
memory_order order);
__int128 __atomic_fetch_and_16 (__int128 *object, __int128 operand,
memory_order order);
__int128 __atomic_or_fetch_16 (__int128 *object, __int128 operand, memory_order
order);
__int128 __atomic_fetch_or_16 (__int128 *object, __int128 operand, memory_order
order);
__int128 __atomic_xor_fetch_16 (__int128 *object, __int128 operand,
memory_order order);
__int128 __atomic_fetch_xor_16 (__int128 *object, __int128 operand,
memory_order order);
__int128 __atomic_nand_fetch_16 (__int128 *object, __int128 operand,
memory_order order);
__int128 __atomic_fetch_nand_16 (__int128 *object, __int128 operand,
memory_order order);
_Bool __atomic_test_and_set_16 (__int128 *object, memory_order order);
The description of each function is the same as that of the
corresponding function specified in section 4.2.
5. Libatomic symbol versioning
Here is the mapfile for symbol versioning of the libatomic library
as specified by this ABI specification:
LIBATOMIC_1.0 {
global:
__atomic_load;
__atomic_store;
__atomic_exchange;
__atomic_compare_exchange;
__atomic_is_lock_free;
__atomic_add_fetch_1;
__atomic_add_fetch_2;
__atomic_add_fetch_4;
__atomic_add_fetch_8;
__atomic_add_fetch_16;
__atomic_and_fetch_1;
__atomic_and_fetch_2;
__atomic_and_fetch_4;
__atomic_and_fetch_8;
__atomic_and_fetch_16;
__atomic_compare_exchange_1;
__atomic_compare_exchange_2;
__atomic_compare_exchange_4;
__atomic_compare_exchange_8;
__atomic_compare_exchange_16;
__atomic_exchange_1;
__atomic_exchange_2;
__atomic_exchange_4;
__atomic_exchange_8;
__atomic_exchange_16;
__atomic_fetch_add_1;
__atomic_fetch_add_2;
__atomic_fetch_add_4;
__atomic_fetch_add_8;
__atomic_fetch_add_16;
__atomic_fetch_and_1;
__atomic_fetch_and_2;
__atomic_fetch_and_4;
__atomic_fetch_and_8;
__atomic_fetch_and_16;
__atomic_fetch_nand_1;
__atomic_fetch_nand_2;
__atomic_fetch_nand_4;
__atomic_fetch_nand_8;
__atomic_fetch_nand_16;
__atomic_fetch_or_1;
__atomic_fetch_or_2;
__atomic_fetch_or_4;
__atomic_fetch_or_8;
__atomic_fetch_or_16;
__atomic_fetch_sub_1;
__atomic_fetch_sub_2;
__atomic_fetch_sub_4;
__atomic_fetch_sub_8;
__atomic_fetch_sub_16;
__atomic_fetch_xor_1;
__atomic_fetch_xor_2;
__atomic_fetch_xor_4;
__atomic_fetch_xor_8;
__atomic_fetch_xor_16;
__atomic_load_1;
__atomic_load_2;
__atomic_load_4;
__atomic_load_8;
__atomic_load_16;
__atomic_nand_fetch_1;
__atomic_nand_fetch_2;
__atomic_nand_fetch_4;
__atomic_nand_fetch_8;
__atomic_nand_fetch_16;
__atomic_or_fetch_1;
__atomic_or_fetch_2;
__atomic_or_fetch_4;
__atomic_or_fetch_8;
__atomic_or_fetch_16;
__atomic_store_1;
__atomic_store_2;
__atomic_store_4;
__atomic_store_8;
__atomic_store_16;
__atomic_sub_fetch_1;
__atomic_sub_fetch_2;
__atomic_sub_fetch_4;
__atomic_sub_fetch_8;
__atomic_sub_fetch_16;
__atomic_test_and_set_1;
__atomic_test_and_set_2;
__atomic_test_and_set_4;
__atomic_test_and_set_8;
__atomic_test_and_set_16;
__atomic_xor_fetch_1;
__atomic_xor_fetch_2;
__atomic_xor_fetch_4;
__atomic_xor_fetch_8;
__atomic_xor_fetch_16;
local:
*;
};
LIBATOMIC_1.1 {
global:
__atomic_feraiseexcept;
} LIBATOMIC_1.0;
LIBATOMIC_1.2 {
global:
atomic_thread_fence;
atomic_signal_fence;
atomic_flag_test_and_set;
atomic_flag_test_and_set_explicit;
atomic_flag_clear;
atomic_flag_clear_explicit;
} LIBATOMIC_1.1;
6. Libatomic Assumption on Non-blocking Memory Instructions
libatomic assumes that programmers or compilers properly insert
SFENCE/MFENCE barriers for the following cases:
1) writes executed with the CLFLUSH instruction;
2) streaming loads/stores: (V)MOVNTx, MASKMOVDQU, MASKMOVQ;
3) any other operations that reference the Write Combining memory type.
Rationale
x86 has a strong memory model. Memory reads are not reordered with
other reads, writes are not reordered with reads and other writes.
The three cases mentioned above are the exceptions: such
weakly-ordered writes may be reordered with respect to other writes.
The ABI specifies that code using these non-blocking writes must
contain the proper fences, so that libatomic support functions do not
need additional fences to synchronize with those instructions.
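A minimal sketch of the rule, assuming an x86-64 target with SSE2 (the intrinsics below are x86-specific): the code issuing the streaming store supplies its own SFENCE; libatomic does not add one on its behalf.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2: _mm_stream_si32; _mm_sfence via SSE */

static int shared;

int publish(void)
{
    /* Non-temporal (streaming) store: weakly ordered, may be
       reordered with respect to other writes. */
    _mm_stream_si32(&shared, 42);

    /* Caller-supplied store fence, as the ABI requires; without it,
       libatomic's support functions do not synchronize with the
       streaming store above. */
    _mm_sfence();

    return __atomic_load_n(&shared, __ATOMIC_ACQUIRE);
}
```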
Appendix
A.1. Compatibility Notes
On 64-bit SPARC platforms, _Atomic long double is a 16-byte naturally
aligned atomic type. There is no lock-free instruction for such type
in the 64-bit SPARC ISA, and the type is not inlineable in this ABI
specification, so the libatomic implementation has to use a
non-lock-free implementation for atomic operations on such a type.
If, in the future, lock-free instructions for 16-byte naturally aligned
objects become available in a new SPARC ISA, then libatomic could
leverage them to implement lock-free atomic operations for _Atomic
long double. This would be a backward compatible libatomic change:
because the type is not inlineable, all atomic operations on objects
of the type must go through libatomic function calls, so the formerly
non-lock-free operations would simply become lock-free inside those
libatomic functions.
However, if a compiler inlines an atomic operation on an _Atomic long
double object and uses the new lock-free instructions, it could break
compatibility while the library implementation is still non-lock-free.
In such a case, the libatomic library and the compiler must be upgraded
in lock-step, and the inlineable property for the affected atomic types
changes from false to true.
If a compiler changes the data representation of atomic types, the
change will produce incompatible binaries, and it would be hard to
detect when such incompatible binaries are linked together.
References
[1] C11 Standard, 6.2.5p27
The size, representation, and alignment of an atomic type need not be
the same as those of the corresponding unqualified type.
[2] C11 Standard, 7.17.6p1
For each line in the following table,257) the atomic type name is
declared as a type that has the same representation and alignment
requirements as the corresponding direct type.258)
Footnote 258
258) The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values from
functions, and members of unions.
[3] C11 Standard, 6.7.2.1p5
A bit-field shall have a type that is a qualified or unqualified
version of _Bool, signed int, unsigned int, or some other
implementation-defined type. It is implementation-defined whether
atomic types are permitted.
[4] C++11 Standard, 29.4p2
The function atomic_is_lock_free (29.6) indicates whether the object
is lock-free. In any given program execution, the result of the
lock-free query shall be consistent for all pointers of the same type.
[5] C11 Standard, 7.17.5.1p3
The atomic_is_lock_free generic function returns nonzero (true) if
and only if the object's operations are lock-free. The result of a
lock-free query on one object cannot be inferred from the result of
a lock-free query on another object.
[6] http://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_465
[7] C11 Standard, 6.7.2.4p3
The type name in an atomic type specifier shall not refer to an array
type, a function type, an atomic type, or a qualified type.
[8] C11 Standard, 6.7.3p3
The type modified by the _Atomic qualifier shall not be an array type
or a function type.