On Tue, Apr 30, 2024 at 05:10:45PM +0100, Andrew Carlotti wrote:
> Add target_version attribute to Common Function Attributes and update
> target and target_clones documentation. Move shared detail and examples
> to the Function Multiversioning page. Add target-specific details to
> target-specific pages.
>
> ---
>
> Changes since v1:
> - Various typo fixes.
> - Reordered content in 'Function multiversioning' section to put implementation
>   details at the end (as suggested in review).
> - Dropped links to outdated wiki page, and a couple of other unhelpful
> sentences that the previous version preserved.
>
> I've built and rechecked the info output. Ok for master? And is this ok for
> the GCC-14 branch too?
>
> gcc/ChangeLog:
>
> * doc/extend.texi (Common Function Attributes): Update target
> and target_clones documentation, and add target_version.
> (AArch64 Function Attributes): Add ACLE reference and list
> supported features.
> (PowerPC Function Attributes): List supported features.
> (x86 Function Attributes): Mention function multiversioning.
> (Function Multiversioning): Update, and move shared detail here.
>
>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e290265d68d33f86a7e7ee9882cc0fd6bed00143..fefac70b5fffc350bf23db74a8fc88fa3bb99bd5 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -4178,17 +4178,16 @@ and @option{-Wanalyzer-tainted-size}.
> Multiple target back ends implement the @code{target} attribute
> to specify that a function is to
> be compiled with different target options than specified on the
> -command line. The original target command-line options are ignored.
> -One or more strings can be provided as arguments.
> -Each string consists of one or more comma-separated suffixes to
> -the @code{-m} prefix jointly forming the name of a machine-dependent
> -option. @xref{Submodel Options,,Machine-Dependent Options}.
> -
> +command line. One or more strings can be provided as arguments.
> The @code{target} attribute can be used for instance to have a function
> compiled with a different ISA (instruction set architecture) than the
> -default. @samp{#pragma GCC target} can be used to specify target-specific
> -options for more than one function. @xref{Function Specific Option Pragmas},
> -for details about the pragma.
> +default.
> +
> +The options supported by the @code{target} attribute are specific to each
> +target; refer to @ref{x86 Function Attributes}, @ref{PowerPC Function
> +Attributes}, @ref{ARM Function Attributes}, @ref{AArch64 Function Attributes},
> +@ref{Nios II Function Attributes}, and @ref{S/390 Function Attributes}
> +for details.
>
> For instance, on an x86, you could declare one function with the
> @code{target("sse4.1,arch=core2")} attribute and another with
> @@ -4211,39 +4210,26 @@ multiple options is equivalent to separating the option suffixes with
> a comma (@samp{,}) within a single string. Spaces are not permitted
> within the strings.
>
> -The options supported are specific to each target; refer to @ref{x86
> -Function Attributes}, @ref{PowerPC Function Attributes},
> -@ref{ARM Function Attributes}, @ref{AArch64 Function Attributes},
> -@ref{Nios II Function Attributes}, and @ref{S/390 Function Attributes}
> -for details.
> +@samp{#pragma GCC target} can be used to specify target-specific
> +options for more than one function. @xref{Function Specific Option Pragmas},
> +for details about the pragma.
> +
> +On x86, the @code{target} attribute can also be used to create multiple
> +versions of a function, compiled with different target-specific options.
> +@xref{Function Multiversioning}, for more details.
>
> @cindex @code{target_clones} function attribute
> @item target_clones (@var{options})
> The @code{target_clones} attribute is used to specify that a function
> -be cloned into multiple versions compiled with different target options
> -than specified on the command line. The supported options and restrictions
> -are the same as for @code{target} attribute.
> -
> -For instance, on an x86, you could compile a function with
> -@code{target_clones("sse4.1,avx")}. GCC creates two function clones,
> -one compiled with @option{-msse4.1} and another with @option{-mavx}.
> -
> -On a PowerPC, you can compile a function with
> -@code{target_clones("cpu=power9,default")}. GCC will create two
> -function clones, one compiled with @option{-mcpu=power9} and another
> -with the default options. GCC must be configured to use GLIBC 2.23 or
> -newer in order to use the @code{target_clones} attribute.
> -
> -It also creates a resolver function (see
> -the @code{ifunc} attribute above) that dynamically selects a clone
> -suitable for current architecture. The resolver is created only if there
> -is a usage of a function with @code{target_clones} attribute.
> -
> -Note that any subsequent call of a function without @code{target_clone}
> -from a @code{target_clone} caller will not lead to copying
> -(target clone) of the called function.
> -If you want to enforce such behaviour,
> -we recommend declaring the calling function with the @code{flatten} attribute?
> +should be cloned into multiple versions compiled with different target options
> +than specified on the command line. @xref{Function Multiversioning}, for more
> +details.
> +
> +@cindex @code{target_version} function attribute
> +@item target_version (@var{options})
> +The @code{target_version} attribute is used on AArch64 to create multiple
> +versions of a function, compiled with different target-specific options.
> +@xref{Function Multiversioning}, for more details.
>
> @cindex @code{unavailable} function attribute
> @item unavailable
> @@ -4734,6 +4720,26 @@ Note that CPU tuning options and attributes such as the @option{-mcpu=},
> @option{-mcpu=} option or the @code{cpu=} attribute conflicts with the
> architectural feature rules specified above.
>
> +@subsubsection Function multiversioning
> +The @code{target_version} and @code{target_clones} attributes can be used to
> +specify multiple versions of a function. Each version enables the specified
> +set of architecture extensions, in addition to any extensions that were already
> +enabled on the command line or using @code{target} attributes. For general
> +details, see @ref{Function Multiversioning}. There are further AArch64-specific
> +details available in the
> +@uref{https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning,
> +Arm C Language Extensions (ACLE) specification}.
> +
> +Some aspects of the ACLE specification are not yet supported. In particular,
> +the currently supported feature names are @code{rng}, @code{flagm}, @code{lse},
> +@code{fp}, @code{simd}, @code{dotprod}, @code{sm4}, @code{rdma}, @code{rdm}
> +(alias of @code{rdma}), @code{crc}, @code{sha2}, @code{sha3}, @code{aes},
> +@code{fp16}, @code{fp16fml}, @code{rcpc}, @code{rcpc3}, @code{i8mm},
> +@code{bf16}, @code{sve}, @code{f32mm}, @code{f64mm}, @code{sve2},
> +@code{sve2-aes}, @code{sve2-bitperm}, @code{sve2-sha3}, @code{sve2-sm4},
> +@code{sme}, @code{sb}, @code{predres}, @code{sme-f64f64}, @code{sme-i16i64} and
> +@code{sme2}.
> +
> @node AMD GCN Function Attributes
> @subsection AMD GCN Function Attributes
>
> @@ -6278,6 +6284,15 @@ default tuning specified on the command line.
> On the PowerPC, the inliner does not inline a
> function that has different target options than the caller, unless the
> callee has a subset of the target options of the caller.
> +
> +@cindex @code{target_clones} function attribute
> +@item target_clones (@var{options})
> +The @code{target_clones} attribute can be used to create multiple versions of a
> +function for different supported architectures, with one version for each
> +specifier in the options list. One of these version specifiers must be the
> +@code{default} version. The other supported target specifiers are
> +@code{cpu=power6}, @code{cpu=power7}, @code{cpu=power8}, @code{cpu=power9} and
> +@code{cpu=power10}. For more details, see @ref{Function Multiversioning}.
> @end table
>
> @node RISC-V Function Attributes
> @@ -6872,7 +6887,9 @@ will crash if the wrong kind of handler is used.
> @cindex @code{target} function attribute
> @item target (@var{options})
> As discussed in @ref{Common Function Attributes}, this attribute
> -allows specification of target-specific compilation options.
> +allows specification of target-specific compilation options. It can also be
> +used to create multiple versions of a single function
> +(@pxref{Function Multiversioning}).
>
> On the x86, the following options are allowed:
> @table @samp
> @@ -29431,11 +29448,58 @@ For the effects of the @code{hot} attribute on functions, see
> @section Function Multiversioning
> @cindex function versions
>
> -With the GNU C++ front end, for x86 targets, you may specify multiple
> -versions of a function, where each function is specialized for a
> -specific target feature. At runtime, the appropriate version of the
> -function is automatically executed depending on the characteristics of
> -the execution platform. Here is an example.
> +On some targets it is possible to specify multiple versions of a function,
> +where each version of the function is specialized for a different set of target
> +features. At runtime, the most appropriate version of the function is chosen
> +for execution, depending on the architecture features available on the
> +execution platform. One of the versions is a "default" version, which is
> +chosen if none of the criteria for the other versions are met.
> +
> +Function multiversioning is enabled by annotating the function versions with one
> +of three function attributes.
> +
> +The @code{target} attribute can be used on x86 targets. Multiversioning with
> +the @code{target} attribute is supported only in the C++ frontend. One version
> +must be explicitly labelled as the "default" version. The @code{target}
> +attributes will normally need to be included in any header file declarations,
> +to ensure that the correct function version is called from all translation
> +units. In translation units that don't include declarations with
> +the @code{target} attributes, any callers will always call the "default"
> +version directly.
> +
> +The @code{target_version} attribute can be used on AArch64 targets.
> +Multiversioning with the @code{target_version} attribute is supported only in
> +the C++ frontend. This attribute behaves similarly to the @code{target}
> +attribute. However, callers in any translation unit will always use the
> +versioned function, even if the @code{target_version} attributes aren't
> +included in that translation unit. This means that header files don't need to
> +include the @code{target_version} attributes, and the function can use a single
> +header file declaration (as if it were a normal unversioned function).
> +
> +The @code{target_clones} attribute can be used on AArch64, PowerPC and x86
> +targets. It behaves similarly to the @code{target_version} attribute, except
> +that only one copy of the function is included in the source file. The
> +attribute takes a list of version specifiers and produces one copy of the
> +function for each specifier. This is useful in cases where the compiler is
> +capable of generating optimized code (with autovectorization, for example)
> +using architecture features enabled only in the more specialized function
> +versions. For example, on PowerPC, compiling a function with
> +@code{target_clones("default,cpu=power9")} will create two function clones:
> +one compiled with @option{-mcpu=power9}, and another with the default options.
> +The @code{target_clones} attribute is available in the C, C++, D and Ada
> +frontends.
> +
> +Function multiversioning attributes do not propagate from a versioned
> +function to its callees, although a callee can still be optimized using the
> +caller's extra target features if it is inlined directly into the caller.
> +
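A minimal sketch of the single-definition case the target_clones paragraph describes, in case it helps readers of this thread. It assumes an x86-64 Linux system with glibc (so that ifunc is available); elsewhere the hypothetical CLONES macro expands to nothing and the function is simply built once. The names sum_array and CLONES are illustrative only and are not part of the patch.

```c
#include <stddef.h>
#include <stdlib.h>   /* any libc header makes __GLIBC__ visible on glibc */

/* Assumption: request clones only where ifunc support is expected
   (x86-64 Linux with glibc).  On other targets CLONES is empty.  */
#if defined(__GNUC__) && defined(__x86_64__) \
    && defined(__linux__) && defined(__GLIBC__)
# define CLONES __attribute__ ((target_clones ("default", "avx2")))
#else
# define CLONES
#endif

/* One definition in the source.  With the attribute active, the compiler
   emits one copy per specifier plus a resolver that picks between them;
   the AVX2 copy may be autovectorized with wider registers.  */
CLONES
long
sum_array (const int *v, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i];
  return total;
}
```

Callers use sum_array like any ordinary function; no attribute is needed at the call site or on a header declaration.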
> +For details of the target options supported on each target, refer to
> +@ref{AArch64 Function Attributes}, @ref{PowerPC Function Attributes},
> +and @ref{x86 Function Attributes}.
> +
> +Here is an example of function multiversioning on x86 using the @code{target}
> +attribute.
>
> @smallexample
> __attribute__ ((target ("default")))
> @@ -29474,16 +29538,28 @@ int main ()
> @}
> @end smallexample
>
> -In the above example, four versions of function foo are created. The
> -first version of foo with the target attribute "default" is the default
> -version. This version gets executed when no other target specific
> -version qualifies for execution on a particular platform. A new version
> -of foo is created by using the same function signature but with a
> -different target string. Function foo is called or a pointer to it is
> -taken just like a regular function. GCC takes care of doing the
> -dispatching to call the right version at runtime. Refer to the
> -@uref{https://gcc.gnu.org/wiki/FunctionMultiVersioning, GCC wiki on
> -Function Multiversioning} for more details.
> +In the above example, four versions of function @samp{foo} are created. The
> +first version of @samp{foo}, with the target attribute @samp{"default"}, is the
> +default version. This version gets executed when no other target-specific
> +version qualifies for execution on a particular platform. Other versions of
> +@samp{foo} are created by using the same function signature but with a
> +different target string. The function @samp{foo} can be called, or a pointer to
> +it can be taken, just as for a regular function. GCC takes care of
> +dispatching to the right version at runtime.
> +
> +Function multiversioning is implemented using the @code{STT_GNU_IFUNC} symbol
> +type extension to the ELF standard. This is the same mechanism used by the
> +@code{ifunc} attribute (@pxref{Common Function Attributes}). However, in this
> +case the compiler automatically generates a resolver function that checks which
> +features are available at runtime. This resolver uses GLIBC's hardware
> +capability bits, and therefore requires GCC to be configured to use GLIBC 2.23
> +or newer. The resolver is run once at startup, and the resulting function
> +pointer is then stored in the dynamic symbol table. When using the
> +@code{target_version} or @code{target_clones} attributes, the resolved ifunc
> +symbol uses the normal symbol name. When using the @code{target} attribute,
> +the normal symbol name is instead used by the default function version (which
> +is why @code{target} attributes for multiversioning need to be included in any
> +header file declarations as well).
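To make the mechanism concrete, here is a rough hand-written equivalent using the ifunc attribute directly. This is only a sketch of the idea, not what GCC actually generates: the names add, add_default, add_avx2 and resolve_add are made up, the feature check uses the x86 built-ins __builtin_cpu_init and __builtin_cpu_supports, and it assumes x86-64 Linux with glibc (other targets get a plain function).

```c
#include <stdlib.h>   /* any libc header makes __GLIBC__ visible on glibc */

#if defined(__GNUC__) && defined(__x86_64__) \
    && defined(__linux__) && defined(__GLIBC__)

typedef int add_fn (int, int);

/* Baseline version, built with the command-line target options.  */
static int
add_default (int a, int b)
{
  return a + b;
}

/* Specialized version, allowed to use AVX2 instructions.  */
__attribute__ ((target ("avx2")))
static int
add_avx2 (int a, int b)
{
  return a + b;
}

/* Hand-written resolver (hypothetical name).  It runs during dynamic
   symbol resolution, before constructors, so the CPU model data must
   be initialized explicitly before it is queried.  */
static add_fn *
resolve_add (void)
{
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("avx2") ? add_avx2 : add_default;
}

/* The public symbol is an indirect function bound by the resolver.  */
int add (int, int) __attribute__ ((ifunc ("resolve_add")));

#else

/* Fallback for targets where ifunc is not assumed to be available.  */
int
add (int a, int b)
{
  return a + b;
}

#endif
```

With the multiversioning attributes, the compiler writes the equivalent of resolve_add itself, which is the main difference from using ifunc by hand.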
>
> @node Type Traits
> @section Type Traits