Andrew Carlotti <andrew.carlo...@arm.com> writes: > This patch skips redirect_to_specific clone for aarch64 and riscv, > because the optimisation has two flaws: > > 1. It checks the value of the "target" attribute, even on targets that > don't use this attribute for multiversioning. > > 2. The algorithm used is too aggressive, and will eliminate the > indirection in some cases where the runtime choice of callee version > can't be determined statically at compile time. A correct would need to > verify that: > - if the current caller version were selected at runtime, then the > chosen callee version would be eligible for selection. > - if any higher priority callee version were selected at runtime, then > a higher priority caller version would have been eligble for > selection (and hence the current caller version wouldn't have been > selected). > > The current checks only verify a more restrictive version of the first > condition, and don't check the second condition at all. > > Fixing the optimisation properly would require implementing target hooks > to check for implications between version attributes, which is too > complicated for this stage. However, I would like to see this hook > implemented in the future, since it could also help deduplicate other > multiversioning code.
Yeah, agreed on both counts. And I think we should try to design the hooks to minimise cut-&-paste between targets (possibly for target_version only). > Since this behaviour has existed for x86 and powerpc for a while, I > think it's best to preserve the existing behaviour on those targets, > unless any maintainer for those targets disagrees. > > ---- > > There was a previous patch by Yangyu Chen [1] which addressed flaw 1 but > overlooked flaw 2. I pointed out flaw 2 at the time; as far as I can tell, > there haven't been any further patches proposed for this issue (apologies if I > missed something; I've not checked the mailing lists particularly thoroughly). > > I'm currently bootstrap and regression testing this on aarch64. Assuming thes > tests pass, is this ok for master and backport to gcc-14? > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667527.html > > gcc/ChangeLog: > > * multiple_target.cc > (redirect_to_specific_clone): Assert that "target" attribute is > used for FMV before checking it. > (ipa_target_clone): Skip redirect_to_specific_clone on some > targets. > > gcc/testsuite/ChangeLog: > > * g++.target/aarch64/mv-pragma.C: New test. The patch LGTM apart from nits: > diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc > index > 552b9626aa7173546d3b027a2b1a3d227ad25375..d8becf4d9a9626a4ab3688ab30431487a6db770f > 100644 > --- a/gcc/multiple_target.cc > +++ b/gcc/multiple_target.cc > @@ -442,7 +442,14 @@ expand_target_clones (struct cgraph_node *node, bool > definition) > > /* When NODE is a target clone, consider all callees and redirect > to a clone with equal target attributes. That prevents multiple > - multi-versioning dispatches and a call-chain can be optimized. */ > + multi-versioning dispatches and a call-chain can be optimized. > + > + This optimisation might pick the wrong version in some cases, since > knowing > + that we meet the target requirements for a matching callee version does > not > + tell us that we won't also meet the target requirements for a higher > + priority callee version at runtime. Since this is longstanding behaviour "behavior" > + for x86 and powerpc, we preserve it for those targets, but skip the > optimisation > + for targets that use the "target_version" attribute for multi-versioning. > */ two long lines Please give the RISC-V folks a day or so to comment, but otherwise this is OK for trunk and GCC 14 with the comments above fixed. Thanks, Richard > > static void > redirect_to_specific_clone (cgraph_node *node) > @@ -451,6 +458,7 @@ redirect_to_specific_clone (cgraph_node *node) > if (fv == NULL) > return; > > + gcc_assert (TARGET_HAS_FMV_TARGET_ATTRIBUTE); > tree attr_target = lookup_attribute ("target", DECL_ATTRIBUTES > (node->decl)); > if (attr_target == NULL_TREE) > return; > @@ -503,8 +511,9 @@ ipa_target_clone (void) > for (unsigned i = 0; i < to_dispatch.length (); i++) > create_dispatcher_calls (to_dispatch[i]); > > - FOR_EACH_FUNCTION (node) > - redirect_to_specific_clone (node); > + if (TARGET_HAS_FMV_TARGET_ATTRIBUTE) > + FOR_EACH_FUNCTION (node) > + redirect_to_specific_clone (node); > > return 0; > } > diff --git a/gcc/testsuite/g++.target/aarch64/mv-pragma.C > b/gcc/testsuite/g++.target/aarch64/mv-pragma.C > new file mode 100644 > index > 0000000000000000000000000000000000000000..545d0735438d4774a2cb909f28584ddbf4f54d2a > --- /dev/null > +++ b/gcc/testsuite/g++.target/aarch64/mv-pragma.C > @@ -0,0 +1,31 @@ > +/* { dg-do compile } */ > +/* { dg-require-ifunc "" } */ > +/* { dg-options "-O0" } */ > + > +#pragma GCC target ("+sve") > + > +__attribute__((target_version("default"))) > +int foo () > +{ > + return 1; > +} > + > +__attribute__((target_version("sve2"))) > +int foo () > +{ > + return 2; > +} > + > +__attribute__((target_version("default"))) > +int bar () > +{ > + return foo(); > +} > + > +__attribute__((target_version("sha3"))) > +int bar () > +{ > + return foo() + 5; > +} > + > +/* { dg-final { scan-assembler-times "\n\tbl\t_Z3foov\n" 2 } } */