On 11/03/2011 08:04 PM, Joseph S. Myers wrote:
On Thu, 3 Nov 2011, Andrew MacLeod wrote:

Index: doc/extend.texi
Generally watch the line lengths in this patch - you should rewrap the
paragraphs to a width of about 70 characters (no more than 80) before
putting them on trunk.  @item, @deftypefn etc. lines that can't be wrapped
may be longer - but paragraphs of text should have line lengths no more
than 80 characters.

How very 80s! To be honest, I didn't realize I had full-length paragraphs. I would have sworn I had taken care of the 80-column issues...

Done.

+ @section Built-in functions for memory model aware atomic operations.
No "." at the end of a section name.  There should be a corresponding
@node as well (normally have a one-to-one correspondence between nodes and
the structure of the printed manual).  Don't try to put a section inside a
table (you can probably use @heading, but I'd still recommend keeping this
outside the section about __sync functions - or using @subsection to have
subsections for the __sync and __atomic cases).

A section inside a table?  Yoinks, that wasn't obvious to me at all.  Oops.

I make no pretence of understanding .texi stuff :-P but that was a clear oversight.
+ The following builtins approximately match the requirements for
+ C++11 memory model. Many are similar to the ``__sync'' prefixed builtins, but
+ all also have a memory model parameter.  These are all identified by being
+ prefixed with ``__atomic'', and most are overloaded such that they work
+ with multiple types.
@samp{__sync}, @samp{__atomic}, generally use @samp or @code for anything
that is quoting source code text (this includes type names such as size_t,
bits of function names such as compare_exchange, etc.).  Not listed
separately below.

OK, I think I covered those.

+ @item @var{type} __atomic_load_n (@var{type} *ptr, int memmodel)
+ @findex __atomic_load_n
As noted I think putting this inside the existing table is a mistake.  The
preferred approach for new documentation is definitely to use @deftypefn
for functions - although converting the __sync_* documentation would be a
separate matter for anyone wishing to do so.

I clearly missed something here... I didn't even realize I was in a table :-P  Odd, though, because the output looked decent enough that I never noticed I was doing something wacky.  Converted to @deftypefn.

+ This builtin implements the generic version of __atomic_compare_exchange. The 
function is virtually identical to  __atomic_compare_exchange_n, except the 
desired value is also a pointer.
The noun used in documentation is "built-in function" not "builtin".  See
codingconventions.html.  Likewise in several other places in this patch.

Done.


So here is the updated patch.  Given my vast and deep knowledge of texinfo, I expect a few iterations :-)  It seems OK when I look at it; of course, it did before too :-P

Oh, and changes to md.texi as well.
Andrew






        * extend.texi: Document __atomic built-in functions.
        * invoke.texi: Document data race parameters.
        * md.texi: Document atomic patterns.


Index: extend.texi
===================================================================
*** extend.texi (revision 180839)
--- extend.texi (working copy)
*************** extensions, accepted by GCC in C90 mode 
*** 79,85 ****
  * Return Address::      Getting the return or frame address of a function.
  * Vector Extensions::   Using vector instructions through built-in functions.
  * Offsetof::            Special syntax for implementing @code{offsetof}.
! * Atomic Builtins::     Built-in functions for atomic memory access.
  * Object Size Checking:: Built-in functions for limited buffer overflow
                          checking.
  * Other Builtins::      Other built-in functions.
--- 79,86 ----
  * Return Address::      Getting the return or frame address of a function.
  * Vector Extensions::   Using vector instructions through built-in functions.
  * Offsetof::            Special syntax for implementing @code{offsetof}.
! * __sync Builtins::     Legacy built-in functions for atomic memory access.
! * __atomic Builtins::   Atomic built-in functions with memory model.
  * Object Size Checking:: Built-in functions for limited buffer overflow
                          checking.
  * Other Builtins::      Other built-in functions.
*************** is a suitable definition of the @code{of
*** 6682,6689 ****
  may be dependent.  In either case, @var{member} may consist of a single
  identifier, or a sequence of member accesses and array references.
  
! @node Atomic Builtins
! @section Built-in functions for atomic memory access
  
  The following builtins are intended to be compatible with those described
  in the @cite{Intel Itanium Processor-specific Application Binary Interface},
--- 6683,6690 ----
  may be dependent.  In either case, @var{member} may consist of a single
  identifier, or a sequence of member accesses and array references.
  
! @node __sync Builtins
! @section Legacy __sync built-in functions for atomic memory access
  
  The following builtins are intended to be compatible with those described
  in the @cite{Intel Itanium Processor-specific Application Binary Interface},
*************** previous memory loads have been satisfie
*** 6815,6820 ****
--- 6816,7051 ----
  are not prevented from being speculated to before the barrier.
  @end table
  
+ @node __atomic Builtins
+ @section Built-in functions for memory model aware atomic operations
+ 
+ The following built-in functions approximately match the requirements
+ for the C++11 memory model.  Many are similar to the @samp{__sync}
+ prefixed built-in functions, but all also have a memory model parameter.
+ These are all identified by being prefixed with @samp{__atomic}, and most
+ are overloaded such that they work with multiple types.
+ 
+ GCC will allow any integral scalar or pointer type that is 1, 2, 4, or 8
+ bytes in length.  16-byte integral types are also allowed if
+ @samp{__int128_t} is supported by the architecture.
+ 
+ Target architectures are encouraged to provide their own patterns for
+ each of these built-in functions.  If no target pattern is provided, the
+ original non-memory model set of @samp{__sync} atomic built-in functions
+ will be utilized, along with any required synchronization fences
+ surrounding them, in order to achieve the proper behaviour.  Execution in
+ this case is subject to the same restrictions as those built-in functions.
+ 
+ If there is no pattern or mechanism to provide a lock-free instruction
+ sequence, a call is made to an external routine with the same parameters,
+ to be resolved at run time.
+ 
+ The four non-arithmetic functions (load, store, exchange, and
+ @code{compare_exchange}) all have a generic version as well.  This
+ generic version will work on any data type.  If the data type size maps
+ to one of the integral sizes which may have lock-free support, the
+ generic version will utilize the lock-free built-in function.  Otherwise
+ an external call is left to be resolved at run time.  This external call
+ has the same format, with the addition of a @samp{size_t} parameter
+ inserted as the first parameter, indicating the size of the object being
+ pointed to.  All objects must be the same size.
+ 
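+ For example, a structure whose size matches none of the integral sizes
+ is handled with such an external call (a sketch; the 12-byte structure
+ here is purely illustrative):
+ 
+ @smallexample
+ struct blob @{ char data[12]; @};   /* No 12-byte integral mapping.  */
+ struct blob obj, val, ret;
+ 
+ /* Left as an external call, resolved at run time roughly as
+    __atomic_exchange (sizeof (obj), &obj, &val, &ret, model).  */
+ __atomic_exchange (&obj, &val, &ret, __ATOMIC_SEQ_CST);
+ @end smallexample
+ 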
+ There are six different memory models which can be specified.  These map
+ to the same names in the C++11 standard.  Refer there, or to the GCC wiki
+ on atomics, for more detailed definitions.  These memory models integrate
+ both barriers to code motion and synchronization requirements with
+ other threads.  They are listed here in approximately ascending order of
+ strength.
+ 
+ @table  @code
+ @item __ATOMIC_RELAXED
+ No barriers or synchronization.
+ @item __ATOMIC_CONSUME
+ Data dependency only for both barrier and synchronization with another
+ thread.
+ @item __ATOMIC_ACQUIRE
+ Barrier to hoisting of code and synchronizes with release (or stronger)
+ semantic stores from another thread.
+ @item __ATOMIC_RELEASE
+ Barrier to sinking of code and synchronizes with acquire (or stronger)
+ semantic loads from another thread.
+ @item __ATOMIC_ACQ_REL
+ Full barrier in both directions and synchronizes with acquire loads and
+ release stores in another thread.
+ @item __ATOMIC_SEQ_CST
+ Full barrier in both directions and synchronizes with acquire loads and
+ release stores in all threads.
+ @end table
+ 
+ When implementing patterns for these built-in functions, the memory model
+ parameter can be ignored as long as the pattern implements the most
+ restrictive model, @code{__ATOMIC_SEQ_CST}.  Any of the other memory
+ models will execute correctly with this memory model, but they may not
+ execute as efficiently as they could with a more appropriate
+ implementation of the relaxed requirements.
+ 
+ Note that the C++11 standard allows for the memory model parameter to be
+ determined at run time rather than at compile time.  These built-in
+ functions will map any run-time value to @code{__ATOMIC_SEQ_CST} rather
+ than invoke a runtime library call or inline a switch statement.  This is
+ standard-compliant, safe, and the simplest approach for now.
+ 
+ @deftypefn {Built-in Function} @var{type} __atomic_load_n (@var{type} *ptr, int memmodel)
+ This built-in function implements an atomic load operation.  It returns the
+ contents of @code{*@var{ptr}}.
+ 
+ The valid memory model variants are
+ @code{__ATOMIC_RELAXED}, @code{__ATOMIC_SEQ_CST}, @code{__ATOMIC_ACQUIRE},
+ and @code{__ATOMIC_CONSUME}.
+ 
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} void __atomic_load (@var{type} *ptr, @var{type} *ret, int memmodel)
+ This is the generic version of an atomic load.  It will return the
+ contents of @code{*@var{ptr}} in @code{*@var{ret}}.
+ 
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} void __atomic_store_n (@var{type} *ptr, @var{type} val, int memmodel)
+ This built-in function implements an atomic store operation.  It writes
+ @var{val} into @code{*@var{ptr}}.  On targets which are limited,
+ 0 may be the only valid value.  This mimics the behaviour of
+ @code{__sync_lock_release} on such hardware.
+ 
+ The valid memory model variants are
+ @code{__ATOMIC_RELAXED}, @code{__ATOMIC_SEQ_CST}, and
+ @code{__ATOMIC_RELEASE}.
+ 
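+ For example, a release store can publish data to another thread that
+ polls the flag with an acquire load (a minimal sketch, not a complete
+ program):
+ 
+ @smallexample
+ int data;
+ int flag;
+ 
+ /* Producer thread.  */
+ data = 42;
+ __atomic_store_n (&flag, 1, __ATOMIC_RELEASE);
+ 
+ /* Consumer thread: once the acquire load observes 1,
+    the store to data is guaranteed to be visible.  */
+ while (!__atomic_load_n (&flag, __ATOMIC_ACQUIRE))
+   ;
+ @end smallexample
+ 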
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} void __atomic_store (@var{type} *ptr, @var{type} *val, int memmodel)
+ This is the generic version of an atomic store.  It will store the value
+ of @code{*@var{val}} into @code{*@var{ptr}}.
+ 
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} @var{type} __atomic_exchange_n (@var{type} *ptr, @var{type} val, int memmodel)
+ This built-in function implements an atomic exchange operation.  It writes
+ @var{val} into @code{*@var{ptr}}, and returns the previous contents of
+ @code{*@var{ptr}}.
+ 
+ On targets which are limited, a value of 1 may be the only valid value
+ written.  This mimics the behaviour of @code{__sync_lock_test_and_set} on
+ such hardware.
+ 
+ The valid memory model variants are
+ @code{__ATOMIC_RELAXED}, @code{__ATOMIC_SEQ_CST}, @code{__ATOMIC_ACQUIRE},
+ @code{__ATOMIC_RELEASE}, and @code{__ATOMIC_ACQ_REL}.
+ 
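+ For example, an exchange with acquire semantics yields a simple
+ spinlock (a sketch; writing only the value 1 also suits targets that
+ are limited to test-and-set):
+ 
+ @smallexample
+ static int lock;
+ 
+ /* Spin until the previous value was 0 (unlocked).  */
+ while (__atomic_exchange_n (&lock, 1, __ATOMIC_ACQUIRE) != 0)
+   ;
+ /* ... critical section ...  */
+ __atomic_store_n (&lock, 0, __ATOMIC_RELEASE);
+ @end smallexample
+ 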
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} void __atomic_exchange (@var{type} *ptr, @var{type} *val, @var{type} *ret, int memmodel)
+ This is the generic version of an atomic exchange.  It will store the
+ contents of @code{*@var{val}} into @code{*@var{ptr}}. The original value
+ of @code{*@var{ptr}} will be copied into @code{*@var{ret}}.
+ 
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} bool __atomic_compare_exchange_n (@var{type} *ptr, @var{type} *expected, @var{type} desired, bool weak, int success_memmodel, int failure_memmodel)
+ This built-in function implements an atomic @code{compare_exchange}
+ operation.  This compares the contents of @code{*@var{ptr}} with the
+ contents of @code{*@var{expected}}, and if equal, writes @var{desired}
+ into @code{*@var{ptr}}.  If they are not equal, the current contents of
+ @code{*@var{ptr}} are written into @code{*@var{expected}}.
+ 
+ True is returned if @var{desired} is written into
+ @code{*@var{ptr}}, and the execution is considered to conform to the
+ memory model specified by @var{success_memmodel}.  There are no
+ restrictions on what memory model can be used here.
+ 
+ False is returned otherwise, and the execution is considered to conform
+ to @var{failure_memmodel}.  This memory model cannot be
+ @code{__ATOMIC_RELEASE} nor @code{__ATOMIC_ACQ_REL}.  It also cannot be
+ a stronger model than that specified by @var{success_memmodel}.
+ 
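+ For example, a typical update loop relies on the failure case
+ refreshing @code{*@var{expected}} with the current value (a sketch):
+ 
+ @smallexample
+ int v;        /* Shared variable.  */
+ 
+ int old = __atomic_load_n (&v, __ATOMIC_RELAXED);
+ /* On failure, old is updated to the current contents of v,
+    so the loop simply retries with the new value.  */
+ while (!__atomic_compare_exchange_n (&v, &old, old * 2, 0,
+                                      __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
+   ;
+ @end smallexample
+ 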
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} bool __atomic_compare_exchange (@var{type} *ptr, @var{type} *expected, @var{type} *desired, bool weak, int success_memmodel, int failure_memmodel)
+ This built-in function implements the generic version of
+ @code{__atomic_compare_exchange}.  The function is virtually identical to
+ @code{__atomic_compare_exchange_n}, except the desired value is also a
+ pointer.
+ 
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} @var{type} __atomic_add_fetch (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_sub_fetch (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_and_fetch (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_xor_fetch (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_or_fetch (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_nand_fetch (@var{type} *ptr, @var{type} val, int memmodel)
+ These built-in functions perform the operation suggested by the name, and
+ return the result of the operation. That is,
+ 
+ @smallexample
+ @{ *ptr @var{op}= val; return *ptr; @}
+ @end smallexample
+ 
+ All memory models are valid.
+ 
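+ For example, a shared counter can be incremented and the new value
+ observed without an explicit loop (a sketch):
+ 
+ @smallexample
+ static int counter;
+ 
+ /* Atomically increments counter and returns the new value.  */
+ int now = __atomic_add_fetch (&counter, 1, __ATOMIC_SEQ_CST);
+ @end smallexample
+ 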
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} @var{type} __atomic_fetch_add (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_fetch_sub (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_fetch_and (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_fetch_xor (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_fetch_or (@var{type} *ptr, @var{type} val, int memmodel)
+ @deftypefnx {Built-in Function} @var{type} __atomic_fetch_nand (@var{type} *ptr, @var{type} val, int memmodel)
+ These built-in functions perform the operation suggested by the name, and
+ return the value that had previously been in @code{*@var{ptr}}.  That is,
+ 
+ @smallexample
+ @{ tmp = *ptr; *ptr @var{op}= val; return tmp; @}
+ @end smallexample
+ 
+ All memory models are valid.
+ 
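+ For example, @code{__atomic_fetch_or} can set a flag bit and report
+ whether it was already set (a sketch; @code{already_set} is a
+ hypothetical handler):
+ 
+ @smallexample
+ static unsigned int flags;
+ 
+ unsigned int old = __atomic_fetch_or (&flags, 0x1, __ATOMIC_SEQ_CST);
+ if (old & 0x1)
+   already_set ();   /* The bit was set before this call.  */
+ @end smallexample
+ 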
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} void __atomic_thread_fence (int memmodel)
+ 
+ This built-in function acts as a synchronization fence between threads
+ based on the specified memory model.
+ 
+ All memory orders are valid.
+ 
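+ For example, a release fence allows a group of plain stores to be
+ published through a subsequent relaxed atomic store (a sketch):
+ 
+ @smallexample
+ int data1, data2;
+ int flag;
+ 
+ data1 = 1;
+ data2 = 2;
+ /* No earlier store may be sunk below this fence.  */
+ __atomic_thread_fence (__ATOMIC_RELEASE);
+ __atomic_store_n (&flag, 1, __ATOMIC_RELAXED);
+ @end smallexample
+ 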
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} void __atomic_signal_fence (int memmodel)
+ 
+ This built-in function acts as a synchronization fence between a thread
+ and signal handlers based in the same thread.
+ 
+ All memory orders are valid.
+ 
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} bool __atomic_always_lock_free (size_t size)
+ 
+ This built-in function returns true if objects of @var{size} bytes will
+ always generate lock-free atomic instructions for the target
+ architecture.  Otherwise false is returned.
+ 
+ @var{size} must resolve to a compile-time constant.
+ 
+ @smallexample
+ if (__atomic_always_lock_free (sizeof (long long)))
+ @end smallexample
+ 
+ @end deftypefn
+ 
+ @deftypefn {Built-in Function} bool __atomic_is_lock_free (size_t size)
+ 
+ This built-in function returns true if objects of @var{size} bytes will
+ always generate lock-free atomic instructions for the target
+ architecture.  If it is not known to be lock-free, a call is made to a
+ runtime routine named @code{__atomic_is_lock_free}.
+ 
+ @end deftypefn
+ 
  @node Object Size Checking
  @section Object Size Checking Builtins
  @findex __builtin_object_size
Index: invoke.texi
===================================================================
*** invoke.texi (revision 180839)
--- invoke.texi (working copy)
*************** The maximum number of conditional stores
*** 9155,9165 ****
--- 9155,9180 ----
  if either vectorization (@option{-ftree-vectorize}) or if-conversion
  (@option{-ftree-loop-if-convert}) is disabled.  The default is 2.
  
+ @item allow-load-data-races
+ Allow optimizers to introduce new data races on loads.
+ Set to 1 to allow, otherwise to 0.  This option is enabled by default
+ unless implicitly set by the @option{-fmemory-model=} option.
+ 
  @item allow-store-data-races
  Allow optimizers to introduce new data races on stores.
  Set to 1 to allow, otherwise to 0.  This option is enabled by default
  unless implicitly set by the @option{-fmemory-model=} option.
  
+ @item allow-packed-load-data-races
+ Allow optimizers to introduce new data races on packed data loads.
+ Set to 1 to allow, otherwise to 0.  This option is enabled by default
+ unless implicitly set by the @option{-fmemory-model=} option.
+ 
+ @item allow-packed-store-data-races
+ Allow optimizers to introduce new data races on packed data stores.
+ Set to 1 to allow, otherwise to 0.  This option is enabled by default
+ unless implicitly set by the @option{-fmemory-model=} option.
+ 
  @item case-values-threshold
  The smallest number of different values for which it is best to use a
  jump-table instead of a tree of conditional branches.  If the value is
*************** This option will enable GCC to use CMPXC
*** 13016,13022 ****
  CMPXCHG16B allows for atomic operations on 128-bit double quadword (or oword)
  data types.  This is useful for high resolution counters that could be updated
  by multiple processors (or cores).  This instruction is generated as part of
! atomic built-in functions: see @ref{Atomic Builtins} for details.
  
  @item -msahf
  @opindex msahf
--- 13031,13038 ----
  CMPXCHG16B allows for atomic operations on 128-bit double quadword (or oword)
  data types.  This is useful for high resolution counters that could be updated
  by multiple processors (or cores).  This instruction is generated as part of
! atomic built-in functions: see @ref{__sync Builtins} or
! @ref{__atomic Builtins} for details.
  
  @item -msahf
  @opindex msahf
Index: md.texi
===================================================================
*** md.texi     (revision 180839)
--- md.texi     (working copy)
*************** released only after all previous memory 
*** 5628,5633 ****
--- 5628,5779 ----
  If this pattern is not defined, then a @code{memory_barrier} pattern
  will be emitted, followed by a store of the value to the memory operand.
  
+ @cindex @code{atomic_compare_and_swap@var{mode}} instruction pattern
+ @item @samp{atomic_compare_and_swap@var{mode}} 
+ This pattern, if defined, emits code for an atomic compare-and-swap
+ operation with memory model semantics.  Operand 2 is the memory on which
+ the atomic operation is performed.  Operand 0 is an output operand which
+ is set to true or false based on whether the operation succeeded.  Operand
+ 1 is an output operand which is set to the contents of the memory before
+ the operation was attempted.  Operand 3 is the value that is expected to
+ be in memory.  Operand 4 is the value to put in memory if the expected
+ value is found there.  Operand 5 is set to 1 if this compare and swap is to
+ be treated as a weak operation.  Operand 6 is the memory model to be used
+ if the operation is a success.  Operand 7 is the memory model to be used
+ if the operation fails.
+ 
+ If the memory referred to by operand 2 contains the value in operand 3,
+ then operand 4 is stored in the memory pointed to by operand 2, and
+ fencing based on the memory model in operand 6 is issued.
+ 
+ If the memory referred to by operand 2 does not contain the value in
+ operand 3, then fencing based on the memory model in operand 7 is issued.
+ 
+ If a target does not support weak compare-and-swap operations, or the
+ port elects not to implement weak operations, the argument in operand 5
+ can be ignored.  Note that a strong implementation must be provided.
+ 
+ If this pattern is not provided, the @code{__atomic_compare_exchange}
+ built-in functions will utilize the legacy @code{sync_compare_and_swap}
+ pattern with a seq-cst memory model.
+ 
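+ Expressed as C-like pseudocode (an illustration of the required
+ semantics only, not the pattern's RTL; @code{fence} stands for whatever
+ fencing the given memory model requires), the operation is:
+ 
+ @smallexample
+ /* The compare and store are performed as one atomic operation.  */
+ oldval = *mem;                  /* mem is operand 2; oldval, operand 1.  */
+ if (oldval == expected)         /* expected is operand 3.  */
+   @{
+     *mem = desired;             /* desired is operand 4.  */
+     fence (success_model);      /* success_model is operand 6.  */
+     result = true;              /* result is operand 0.  */
+   @}
+ else
+   @{
+     fence (failure_model);      /* failure_model is operand 7.  */
+     result = false;
+   @}
+ @end smallexample
+ 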
+ @cindex @code{atomic_load@var{mode}} instruction pattern
+ @item @samp{atomic_load@var{mode}}
+ This pattern implements an atomic load operation with memory model
+ semantics.  Operand 1 is the memory address being loaded from.  Operand 0
+ is the result of the load.  Operand 2 is the memory model to be used for
+ the load operation.
+ 
+ If not present, the @code{__atomic_load} built-in function will either
+ resort to a normal load with memory barriers, or a compare-and-swap
+ operation if a normal load would not be atomic.
+ 
+ @cindex @code{atomic_store@var{mode}} instruction pattern
+ @item @samp{atomic_store@var{mode}}
+ This pattern implements an atomic store operation with memory model
+ semantics.  Operand 0 is the memory address being stored to.  Operand 1
+ is the value to be written.  Operand 2 is the memory model to be used for
+ the operation.
+ 
+ If not present, the @code{__atomic_store} built-in function will attempt
+ to perform a normal store and surround it with any required memory
+ fences.  If the store would not be atomic, then an
+ @code{__atomic_exchange} is attempted, with the result being ignored.
+ 
+ @cindex @code{atomic_exchange@var{mode}} instruction pattern
+ @item @samp{atomic_exchange@var{mode}}
+ This pattern implements an atomic exchange operation with memory model
+ semantics.  Operand 1 is the memory location the operation is performed on.
+ Operand 0 is an output operand which is set to the original value contained
+ in the memory pointed to by operand 1.  Operand 2 is the value to be
+ stored.  Operand 3 is the memory model to be used.
+ 
+ If this pattern is not present, the built-in function
+ @code{__atomic_exchange} will attempt to perform the operation with a
+ compare-and-swap loop.
+ 
+ @cindex @code{atomic_add@var{mode}} instruction pattern
+ @cindex @code{atomic_sub@var{mode}} instruction pattern
+ @cindex @code{atomic_or@var{mode}} instruction pattern
+ @cindex @code{atomic_and@var{mode}} instruction pattern
+ @cindex @code{atomic_xor@var{mode}} instruction pattern
+ @cindex @code{atomic_nand@var{mode}} instruction pattern
+ @item @samp{atomic_add@var{mode}}, @samp{atomic_sub@var{mode}}
+ @itemx @samp{atomic_or@var{mode}}, @samp{atomic_and@var{mode}}
+ @itemx @samp{atomic_xor@var{mode}}, @samp{atomic_nand@var{mode}}
+ 
+ These patterns emit code for an atomic operation on memory with memory
+ model semantics. Operand 0 is the memory on which the atomic operation is
+ performed.  Operand 1 is the second operand to the binary operator.
+ Operand 2 is the memory model to be used by the operation.
+ 
+ If these patterns are not defined, attempts will be made to use legacy
+ @code{sync_@var{op}} patterns, or equivalent patterns which return a
+ result.  If none of these are available, a compare-and-swap loop will be
+ used.
+ 
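+ Expressed as C-like pseudocode (illustration only; @code{cas} stands
+ for whatever compare-and-swap sequence the target provides), the
+ fallback loop has this shape:
+ 
+ @smallexample
+ old = *mem;
+ do
+   /* cas refreshes old with the current memory contents on failure.  */
+   newval = old @var{op} val;
+ while (!cas (mem, &old, newval));
+ @end smallexample
+ 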
+ @cindex @code{atomic_fetch_add@var{mode}} instruction pattern
+ @cindex @code{atomic_fetch_sub@var{mode}} instruction pattern
+ @cindex @code{atomic_fetch_or@var{mode}} instruction pattern
+ @cindex @code{atomic_fetch_and@var{mode}} instruction pattern
+ @cindex @code{atomic_fetch_xor@var{mode}} instruction pattern
+ @cindex @code{atomic_fetch_nand@var{mode}} instruction pattern
+ @item @samp{atomic_fetch_add@var{mode}}, @samp{atomic_fetch_sub@var{mode}}
+ @itemx @samp{atomic_fetch_or@var{mode}}, @samp{atomic_fetch_and@var{mode}}
+ @itemx @samp{atomic_fetch_xor@var{mode}}, @samp{atomic_fetch_nand@var{mode}}
+ 
+ These patterns emit code for an atomic operation on memory with memory
+ model semantics, and return the original value. Operand 0 is an output 
+ operand which contains the value of the memory location before the 
+ operation was performed.  Operand 1 is the memory on which the atomic 
+ operation is performed.  Operand 2 is the second operand to the binary
+ operator.  Operand 3 is the memory model to be used by the operation.
+ 
+ If these patterns are not defined, attempts will be made to use legacy
+ @code{sync_@var{op}} patterns.  If none of these are available, a
+ compare-and-swap loop will be used.
+ 
+ @cindex @code{atomic_add_fetch@var{mode}} instruction pattern
+ @cindex @code{atomic_sub_fetch@var{mode}} instruction pattern
+ @cindex @code{atomic_or_fetch@var{mode}} instruction pattern
+ @cindex @code{atomic_and_fetch@var{mode}} instruction pattern
+ @cindex @code{atomic_xor_fetch@var{mode}} instruction pattern
+ @cindex @code{atomic_nand_fetch@var{mode}} instruction pattern
+ @item @samp{atomic_add_fetch@var{mode}}, @samp{atomic_sub_fetch@var{mode}}
+ @itemx @samp{atomic_or_fetch@var{mode}}, @samp{atomic_and_fetch@var{mode}}
+ @itemx @samp{atomic_xor_fetch@var{mode}}, @samp{atomic_nand_fetch@var{mode}}
+ 
+ These patterns emit code for an atomic operation on memory with memory
+ model semantics and return the result after the operation is performed.
+ Operand 0 is an output operand which contains the value after the
+ operation.  Operand 1 is the memory on which the atomic operation is
+ performed.  Operand 2 is the second operand to the binary operator.
+ Operand 3 is the memory model to be used by the operation.
+ 
+ If these patterns are not defined, attempts will be made to use legacy
+ @code{sync_@var{op}} patterns, or equivalent patterns which return the
+ result before the operation, followed by the arithmetic operation
+ required to produce the result.  If none of these are available, a
+ compare-and-swap loop will be used.
+ 
+ @cindex @code{mem_thread_fence@var{mode}} instruction pattern
+ @item @samp{mem_thread_fence@var{mode}}
+ This pattern emits code required to implement a thread fence with
+ memory model semantics.  Operand 0 is the memory model to be used.
+ 
+ If this pattern is not specified, all memory models except
+ @code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize}
+ barrier pattern.
+ 
+ @cindex @code{mem_signal_fence@var{mode}} instruction pattern
+ @item @samp{mem_signal_fence@var{mode}}
+ This pattern emits code required to implement a signal fence with
+ memory model semantics.  Operand 0 is the memory model to be used.
+ 
+ This pattern should impact the compiler optimizers the same way that
+ @code{mem_thread_fence} does, but it does not need to issue any barrier
+ instructions.
+ 
+ If this pattern is not specified, all memory models except
+ @code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize}
+ barrier pattern.
+ 
  @cindex @code{stack_protect_set} instruction pattern
  @item @samp{stack_protect_set}
  
