Andrew Carlotti <andrew.carlo...@arm.com> writes:
> This patch skips redirect_to_specific clone for aarch64 and riscv,
> because the optimisation has two flaws:
>
> 1. It checks the value of the "target" attribute, even on targets that
> don't use this attribute for multiversioning.
>
> 2. The algorithm used is too aggressive, and will eliminate the
> indirection in some cases where the runtime choice of callee version
> can't be determined statically at compile time.  A correct would need to
> verify that:
>  - if the current caller version were selected at runtime, then the
>    chosen callee version would be eligible for selection.
>  - if any higher priority callee version were selected at runtime, then
>    a higher priority caller version would have been eligble for
>    selection (and hence the current caller version wouldn't have been
>    selected).
>
> The current checks only verify a more restrictive version of the first
> condition, and don't check the second condition at all.
>
> Fixing the optimisation properly would require implementing target hooks
> to check for implications between version attributes, which is too
> complicated for this stage.  However, I would like to see this hook
> implemented in the future, since it could also help deduplicate other
> multiversioning code.

Yeah, agreed on both counts.  And I think we should try to design the
hooks to minimise cut-&-paste between targets (possibly for target_version
only).

> Since this behaviour has existed for x86 and powerpc for a while, I
> think it's best to preserve the existing behaviour on those targets,
> unless any maintainer for those targets disagrees.
>
> ----
>
> There was a previous patch by Yangyu Chen [1] which addressed flaw 1 but
> overlooked flaw 2.  I pointed out flaw 2 at the time; as far as I can tell,
> there haven't been any further patches proposed for this issue (apologies if I
> missed something; I've not checked the mailing lists particularly thoroughly).
>
> I'm currently bootstrap and regression testing this on aarch64. Assuming thes
> tests pass, is this ok for master and backport to gcc-14?
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667527.html
>
> gcc/ChangeLog:
>
>       * multiple_target.cc
>       (redirect_to_specific_clone): Assert that "target" attribute is
>       used for FMV before checking it.
>       (ipa_target_clone): Skip redirect_to_specific_clone on some
>       targets.
>
> gcc/testsuite/ChangeLog:
>
>       * g++.target/aarch64/mv-pragma.C: New test.

The patch LGTM apart from nits:

> diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
> index 
> 552b9626aa7173546d3b027a2b1a3d227ad25375..d8becf4d9a9626a4ab3688ab30431487a6db770f
>  100644
> --- a/gcc/multiple_target.cc
> +++ b/gcc/multiple_target.cc
> @@ -442,7 +442,14 @@ expand_target_clones (struct cgraph_node *node, bool 
> definition)
>  
>  /* When NODE is a target clone, consider all callees and redirect
>     to a clone with equal target attributes.  That prevents multiple
> -   multi-versioning dispatches and a call-chain can be optimized.  */
> +   multi-versioning dispatches and a call-chain can be optimized.
> +
> +   This optimisation might pick the wrong version in some cases, since 
> knowing
> +   that we meet the target requirements for a matching callee version does 
> not
> +   tell us that we won't also meet the target requirements for a higher
> +   priority callee version at runtime.  Since this is longstanding behaviour

"behavior"

> +   for x86 and powerpc, we preserve it for those targets, but skip the 
> optimisation
> +   for targets that use the "target_version" attribute for multi-versioning. 
>  */

two long lines

Please give the RISC-V folks a day or so to comment, but otherwise this
is OK for trunk and GCC 14 with the comments above fixed.

Thanks,
Richard

>  
>  static void
>  redirect_to_specific_clone (cgraph_node *node)
> @@ -451,6 +458,7 @@ redirect_to_specific_clone (cgraph_node *node)
>    if (fv == NULL)
>      return;
>  
> +  gcc_assert (TARGET_HAS_FMV_TARGET_ATTRIBUTE);
>    tree attr_target = lookup_attribute ("target", DECL_ATTRIBUTES 
> (node->decl));
>    if (attr_target == NULL_TREE)
>      return;
> @@ -503,8 +511,9 @@ ipa_target_clone (void)
>    for (unsigned i = 0; i < to_dispatch.length (); i++)
>      create_dispatcher_calls (to_dispatch[i]);
>  
> -  FOR_EACH_FUNCTION (node)
> -    redirect_to_specific_clone (node);
> +  if (TARGET_HAS_FMV_TARGET_ATTRIBUTE)
> +    FOR_EACH_FUNCTION (node)
> +      redirect_to_specific_clone (node);
>  
>    return 0;
>  }
> diff --git a/gcc/testsuite/g++.target/aarch64/mv-pragma.C 
> b/gcc/testsuite/g++.target/aarch64/mv-pragma.C
> new file mode 100644
> index 
> 0000000000000000000000000000000000000000..545d0735438d4774a2cb909f28584ddbf4f54d2a
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/aarch64/mv-pragma.C
> @@ -0,0 +1,31 @@
> +/* { dg-do compile } */
> +/* { dg-require-ifunc "" } */
> +/* { dg-options "-O0" } */
> +
> +#pragma GCC target ("+sve")
> +
> +__attribute__((target_version("default")))
> +int foo ()
> +{
> +  return 1;
> +}
> +
> +__attribute__((target_version("sve2")))
> +int foo ()
> +{
> +  return 2;
> +}
> +
> +__attribute__((target_version("default")))
> +int bar ()
> +{
> +  return foo();
> +}
> +
> +__attribute__((target_version("sha3")))
> +int bar ()
> +{
> +  return foo() + 5;
> +}
> +
> +/* { dg-final { scan-assembler-times "\n\tbl\t_Z3foov\n" 2 } } */

Reply via email to