On Sun, Oct 27, 2024 at 04:00:43PM +0000, Yangyu Chen wrote: > Following the implementation of commit b8ce8129a5 ("Redirect call > within specific target attribute among MV clones (PR ipa/82625)"), > we can now optimize calls by invoking a versioned function callee > from a caller that shares the same target attribute. However, on > targets that define TARGET_HAS_FMV_TARGET_ATTRIBUTE to zero, meaning > they use the "target_versions" attribute instead of "target", this > optimization is not feasible. Currently, the only target affected > by this limitation is AArch64.
The existing optimization can pick the wrong version in some cases, and fixing this properly requires better comparisons than just a simple string comparison. I'd prefer to just disable this optimization for aarch64 and riscv for now (and backport that fix to gcc-14), and add the necessary target hooks to be able to implement it properly at a later date. (The existing bug applies if you specify both target and target_version/target_clones attributes on the same function, which is an unlikely combination but one that we deliberately chose to support in aarch64). To give a specific example, suppose we have target features featv3, featv2 and featv1, with featv3 implying featv2 implying featv1. Suppose we have the following function versions: Caller: featv2, featv1, default Callee: featv3, featv2, featv1, default In the featv1 and default versions of the caller, we know that we would always select the corresponding version of the callee function, so the redirection is valid there. However, in the featv2 version of the caller, we don't know whether we would select the featv2 or the featv3 versions of the callee at runtime, so we cannot eliminate the runtime indirection. Implementing this correctly in full generality would require the addition of target hooks that indicate whether one target version string is always implied by another (e.g. in the above example, whenever featv3 support is detected we know that we would also detect featv2 support). > This commit resolves the issue by not directly using "target" with > lookup_attribute. Instead, it checks the TARGET_HAS_FMV_TARGET_ATTRIBUTE > macro to decide between using the "target" or "target_version" > attribute. > > Fixes: 79891c4cb5 ("Add support for target_version attribute") > > gcc/ChangeLog: > > * multiple_target.cc (redirect_to_specific_clone): Fix the redirection > does not work on target without TARGET_HAS_FMV_TARGET_ATTRIBUTE. > > gcc/testsuite/ChangeLog: > > * g++.target/aarch64/mvc-redirect.C: New test. > --- > gcc/multiple_target.cc | 8 +++--- > .../g++.target/aarch64/mvc-redirect.C | 25 +++++++++++++++++++ > 2 files changed, 30 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/g++.target/aarch64/mvc-redirect.C > > diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc > index d2c9671fc1b..a1c18f4a3a7 100644 > --- a/gcc/multiple_target.cc > +++ b/gcc/multiple_target.cc > @@ -446,8 +446,10 @@ redirect_to_specific_clone (cgraph_node *node) > cgraph_function_version_info *fv = node->function_version (); > if (fv == NULL) > return; > + const char *fmv_attr = (TARGET_HAS_FMV_TARGET_ATTRIBUTE > + ? "target" : "target_version"); > > - tree attr_target = lookup_attribute ("target", DECL_ATTRIBUTES > (node->decl)); > + tree attr_target = lookup_attribute (fmv_attr, DECL_ATTRIBUTES > (node->decl)); > if (attr_target == NULL_TREE) > return; > > @@ -458,7 +460,7 @@ redirect_to_specific_clone (cgraph_node *node) > if (!fv2) > continue; > > - tree attr_target2 = lookup_attribute ("target", > + tree attr_target2 = lookup_attribute (fmv_attr, > DECL_ATTRIBUTES (e->callee->decl)); > > /* Function is not calling proper target clone. */ > @@ -472,7 +474,7 @@ redirect_to_specific_clone (cgraph_node *node) > for (; fv2 != NULL; fv2 = fv2->next) > { > cgraph_node *callee = fv2->this_node; > - attr_target2 = lookup_attribute ("target", > + attr_target2 = lookup_attribute (fmv_attr, > DECL_ATTRIBUTES (callee->decl)); > if (attr_target2 != NULL_TREE > && attribute_value_equal (attr_target, attr_target2)) > diff --git a/gcc/testsuite/g++.target/aarch64/mvc-redirect.C > b/gcc/testsuite/g++.target/aarch64/mvc-redirect.C > new file mode 100644 > index 00000000000..f29cc3745a3 > --- /dev/null > +++ b/gcc/testsuite/g++.target/aarch64/mvc-redirect.C > @@ -0,0 +1,25 @@ > +/* { dg-do compile } */ > +/* { dg-require-ifunc "" } */ > +/* { dg-options "-O0" } */ > + > +__attribute__((target_clones("default", "dotprod", "sve+sve2"))) > +int foo () > +{ > + return 1; > +} > + > +__attribute__((target_clones("default", "dotprod", "sve+sve2"))) > +int bar() > +{ > + return foo (); > +} > + > +/* { dg-final { scan-assembler-times "\n_Z3foov\.default:\n" 1 } } */ > +/* { dg-final { scan-assembler-times "\n_Z3foov\._Mdotprod:\n" 1 } } */ > +/* { dg-final { scan-assembler-times "\n_Z3foov\._MsveMsve2:\n" 1 } } */ > +/* { dg-final { scan-assembler-times "\n_Z3foov\.resolver:\n" 1 } } */ > +/* { dg-final { scan-assembler-times "\n\tbl\t_Z3foov.default\n" 1 } } */ > +/* { dg-final { scan-assembler-times "\n\tbl\t_Z3foov._Mdotprod\n" 1 } } */ > +/* { dg-final { scan-assembler-times "\n\tbl\t_Z3foov._MsveMsve2\n" 1 } } */ > +/* { dg-final { scan-assembler-times "\n\t\.type\t_Z3foov, > %gnu_indirect_function\n" 1 } } */ > +/* { dg-final { scan-assembler-times > "\n\t\.set\t_Z3foov,_Z3foov\.resolver\n" 1 } } */ > -- > 2.47.0 >