From: Alfie Richards <[email protected]>

Hi All,

This updates the GCC documentation around FMV, which was pretty outdated.

It also updates the information around the ACLE (which is where ARM specifies
FMV behaviour) which was out of date as well.

I checked and this builds cleanly.

Thanks,
Alfie

-- >8 --

This updates the FMV documentation to the current state of things, including
the addition of "target_version" based FMV.

Leaves as much of the x86 target-based FMV documentation unchanged as
possible, since the behaviour there has not changed, but highlights some of
the differences between it and target_version FMV to try to avoid confusion.

Additionally, updates the URL for the ACLE, as it was out of date, and removes
some out-of-date documentation there.

        PR c/122202

gcc/ChangeLog:

        * doc/extend.texi (target function attribute): Update to describe FMV
        behaviour.
        (target_version function attribute): New section.
        (target_clones attribute): Update to describe new behaviour with
        target_version.
        (Function Multiversioning): Update to discuss both target_version and
        target based FMV.
        (ARM C Language Extensions (ACLE)): Update URL and show
---
 gcc/doc/extend.texi | 152 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 114 insertions(+), 38 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 384211f8b6d..bf033e49149 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3430,17 +3430,6 @@ default.  @samp{#pragma GCC target} can be used to specify target-specific
 options for more than one function.  @xref{Function Specific Option Pragmas},
 for details about the pragma.
 
-For instance, on an x86, you could declare one function with the
-@code{target("sse4.1,arch=core2")} attribute and another with
-@code{target("sse4a,arch=amdfam10")}.  This is equivalent to
-compiling the first function with @option{-msse4.1} and
-@option{-march=core2} options, and the second function with
-@option{-msse4a} and @option{-march=amdfam10} options.  It is up to you
-to make sure that a function is only invoked on a machine that
-supports the particular ISA it is compiled for (for example by using
-@code{cpuid} on x86 to determine what feature bits and architecture
-family are used).
-
 @smallexample
 int core2_func (void) __attribute__ ((__target__ ("arch=core2")));
 int sse3_func (void) __attribute__ ((__target__ ("sse3")));
@@ -3456,12 +3445,38 @@ Function Attributes}, @ref{PowerPC Function Attributes},
 @ref{ARM Function Attributes}, @ref{AArch64 Function Attributes},
 and @ref{S/390 Function Attributes} for details.
 
+On targets supporting @code{target} function multiversioning (x86), when using
+C++ you can declare multiple functions with the same signatures and different
+@code{target} attribute values, and the correct version will be chosen at
+load time.  For instance, you could define one function with the
+@code{target("sse4.1,arch=core2")} attribute and another with
+@code{target("sse4a,arch=amdfam10")}.  This is equivalent to
+compiling the first function with @option{-msse4.1} and
+@option{-march=core2} options, and the second function with
+@option{-msse4a} and @option{-march=amdfam10} options.
+
+Functions annotated with @code{target} cannot be used in combination with
+functions annotated with @code{target_clones} to define a function set.
+
+@xref{Function Multiversioning}, for more details.
+
+@cindex @code{target_version} function attribute
+@item target_version (@var{option})
+On targets with @code{target_version} function multiversioning (AArch64 and
+RISC-V), in C or C++, you can declare multiple functions with
+@code{target_version} or @code{target_clones} attributes to define a function
+version set.
+
+@xref{Function Multiversioning}, for more details.
+
 @cindex @code{target_clones} function attribute
 @item target_clones (@var{options})
 The @code{target_clones} attribute is used to specify that a function
 be cloned into multiple versions compiled with different target options
-than specified on the command line.  The supported options and restrictions
-are the same as for @code{target} attribute.
+than specified on the command line.
+
+For the x86 and Power targets the supported options and restrictions
+are the same as for the @code{target} attribute.
 
 For instance, on an x86, you could compile a function with
 @code{target_clones("sse4.1,avx")}.  GCC creates two function clones,
@@ -3473,16 +3488,20 @@ function clones, one compiled with @option{-mcpu=power9} and another
 with the default options.  GCC must be configured to use GLIBC 2.23 or
 newer in order to use the @code{target_clones} attribute.
 
-It also creates a resolver function (see
-the @code{ifunc} attribute above) that dynamically selects a clone
-suitable for current architecture.  The resolver is created only if there
-is a usage of a function with @code{target_clones} attribute.
+@code{target_clones} works similarly for targets which support the
+@code{target_version} attribute (AArch64 and RISC-V).  The attribute takes
+multiple arguments, and generates a versioned clone for each.  A function
+annotated with @code{target_clones} is equivalent to the same function
+duplicated for each valid version string in the argument, where each
+version is instead annotated with @code{target_version}.  This means that a
+@code{target_clones} annotated function definition can be used in combination
+with @code{target_version} annotated function definitions and other
+@code{target_clones} annotated function definitions.
+
+For these targets the supported options and restrictions are the same as for
+the @code{target_version} attribute.
 
-Note that any subsequent call of a function without @code{target_clone}
-from a @code{target_clone} caller will not lead to copying
-(target clone) of the called function.
-If you want to enforce such behavior,
-we recommend declaring the calling function with the @code{flatten} attribute?
+@xref{Function Multiversioning}, for more details.
 
 @cindex @code{unavailable} function attribute
 @item unavailable
@@ -18809,21 +18828,14 @@ _v4hi __builtin_arc_vsubadd4h (__v4hi, __v4hi);
 
 GCC implements extensions for C as described in the ARM C Language
 Extensions (ACLE) specification, which can be found at
-@uref{https://developer.arm.com/documentation/ihi0053/latest/}.
+@uref{https://arm-software.github.io/acle/main/}.
 
 As a part of ACLE, GCC implements extensions for Advanced SIMD as described in
 the ARM C Language Extensions Specification.  The complete list of Advanced SIMD
 intrinsics can be found at
-@uref{https://developer.arm.com/documentation/ihi0073/latest/}.
-The built-in intrinsics for the Advanced SIMD extension are available when
-NEON is enabled.
-
-Currently, ARM and AArch64 back ends do not support ACLE 2.0 fully.  Both
-back ends support CRC32 intrinsics and the ARM back end supports the
-Coprocessor intrinsics, all from @file{arm_acle.h}.  The ARM back end's 16-bit
-floating-point Advanced SIMD intrinsics currently comply to ACLE v1.1.
-AArch64's back end does not have support for 16-bit floating point Advanced SIMD
-intrinsics yet.
+@uref{https://arm-software.github.io/acle/main/}.
+The built-in intrinsics for the ARM vector extensions are available when
+the respective extensions are enabled.
 
 See @ref{ARM Options} and @ref{AArch64 Options} for more information on the
 availability of extensions.
@@ -30593,11 +30605,75 @@ For the effects of the @code{hot} attribute on functions, see
 @section Function Multiversioning
 @cindex function versions
 
-With the GNU C++ front end, for x86 targets, you may specify multiple
-versions of a function, where each function is specialized for a
-specific target feature.  At runtime, the appropriate version of the
-function is automatically executed depending on the characteristics of
-the execution platform.  Here is an example.
+Function Multiversioning is a mechanism that enables compiling multiple
+versions of a function, each specialized for different combinations of
+architecture extensions.  At load time, the appropriate version is chosen.
+
+Function Multiversioning relies on the indirect function extension to the ELF
+standard, and therefore is only available when indirect functions are supported.
+
+There are two forms of function multiversioning supported by GCC.
+
+For targets supporting the @code{target_version} attribute (AArch64 and RISC-V),
+when compiling for C or C++, a function version set can be defined by a
+combination of function definitions with @code{target_version} and
+@code{target_clones} attributes, across translation units.
+
+For example:
+
+@smallexample
+// fmv.h:
+int foo ();
+int foo [[gnu::target_clones("sve", "sve2")]] ();
+int foo [[gnu::target_version("dotprod;priority=1")]] ();
+
+// fmv1.cc:
+#include "fmv.h"
+
+int foo ()
+@{
+  // The default version of foo.
+  return 0;
+@}
+
+// fmv2.cc:
+#include "fmv.h"
+
+int foo [[gnu::target_clones("sve", "sve2")]] ()
+@{
+  // foo versions for sve and sve2
+  return 1;
+@}
+
+int foo [[gnu::target_version("dotprod")]] ()
+@{
+  // foo version for dotprod extension
+  return 2;
+@}
+
+// main.cc:
+#include "fmv.h"
+
+int main ()
+@{
+  int (*p)() = &foo;
+  assert ((*p) () == foo ());
+  return 0;
+@}
+@end smallexample
+
+This example results in four versions of @code{foo} being generated, and
+a resolver to choose the correct version at load time.
+
+For the AArch64 target, GCC implements function multiversioning with the semantics
+and version strings specified in the @ref{ARM C Language Extensions (ACLE)}.
+
+For targets that support multi-versioning with the @code{target} attribute
+(x86), FMV function sets can be defined either with multiple function
+definitions with the @code{target} attribute (in C++) within a translation unit,
+or with a single definition with the @code{target_clones} attribute.
+
+Here is an example.
 
 @smallexample
 __attribute__ ((target ("default")))
-- 
2.34.1
