The 08/04/2025 11:29, Richard Sandiford wrote: > Alfie Richards <alfie.richa...@arm.com> writes: > > Adds an optimisation in FMV to redirect to a specific target if possible. > > > > A call is redirected to a specific target if both: > > - the caller can always call the callee version > > - and, it is possible to rule out all higher priority versions of the callee > > fmv set. That is estabilished either by the callee being the highest > > priority > > version, or each higher priority version of the callee implying that, > > were it > > resolved, a higher priority version of the caller would have been > > selected. > > > > For this logic, introduces the new TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B > > hook. Adds a full implementation for Aarch64, and a weaker default version > > for other targets. > > > > This allows the target to replace the previous optimisation as the new one > > is > > able to cover the same case where two function sets implement the same > > versions. > > > > gcc/ChangeLog: > > > > * config/aarch64/aarch64.cc (aarch64_version_a_implies_version_b): New > > function. > > (TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B): New define. > > * doc/tm.texi: Regenerate. > > * doc/tm.texi.in: Add documentation for version_a_implies_version_b. > > * multiple_target.cc (redirect_to_specific_clone): Add new optimisation > > logic. > > (ipa_target_clone): Add > > * target.def: Remove TARGET_HAS_FMV_TARGET_ATTRIBUTE check. > > * attribs.cc: (version_a_implies_version_b) New function. > > * attribs.h: (version_a_implies_version_b) New function. > > > > gcc/testsuite/ChangeLog: > > > > * g++.target/aarch64/fmv-selection1.C: New test. > > * g++.target/aarch64/fmv-selection2.C: New test. > > * g++.target/aarch64/fmv-selection3.C: New test. > > * g++.target/aarch64/fmv-selection4.C: New test. > > * g++.target/aarch64/fmv-selection5.C: New test. > > * g++.target/aarch64/fmv-selection6.C: New test. > > --- > > gcc/attribs.cc | 16 ++++ > > gcc/attribs.h | 1 + > > gcc/config/aarch64/aarch64.cc | 26 +++++ > > gcc/doc/tm.texi | 4 + > > gcc/doc/tm.texi.in | 2 + > > gcc/multiple_target.cc | 96 ++++++++++++------- > > gcc/target.def | 9 ++ > > .../g++.target/aarch64/fmv-selection1.C | 40 ++++++++ > > .../g++.target/aarch64/fmv-selection2.C | 40 ++++++++ > > .../g++.target/aarch64/fmv-selection3.C | 25 +++++ > > .../g++.target/aarch64/fmv-selection4.C | 30 ++++++ > > .../g++.target/aarch64/fmv-selection5.C | 28 ++++++ > > .../g++.target/aarch64/fmv-selection6.C | 27 ++++++ > > 13 files changed, 311 insertions(+), 33 deletions(-) > > create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection1.C > > create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection2.C > > create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection3.C > > create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection4.C > > create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection5.C > > create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection6.C > > > > +bool > > +aarch64_version_a_implies_version_b (tree decl_a, tree decl_b) > > +{ > > + auto a_isa = aarch64_get_isa_flags > > + (TREE_TARGET_OPTION (aarch64_fndecl_options (decl_a))); > > + auto b_isa = aarch64_get_isa_flags > > + (TREE_TARGET_OPTION (aarch64_fndecl_options (decl_b))); > > + > > + auto a_version = get_target_version (decl_a); > > + auto b_version = get_target_version (decl_b); > > + if (a_version.is_valid ()) > > + aarch64_parse_fmv_features (a_version, &a_isa, NULL, NULL); > > + if (b_version.is_valid ()) > > + aarch64_parse_fmv_features (b_version, &b_isa, NULL, NULL); > > The name made me think that we should be able to assert that is_valid, > but I suppose the usage in the patch means that this isn't true for "a" > at least. > > But I think the hook documentation needs to be more precise about what > this means in terms of the baseline ISA and the target(_version) ISA. > It looks on the face of it that, if we had: > > a: baseline armv8-a, target_version("default") > b: target("sve"), target_version("default") > > then the default implementation would return true, on the basis that > the target_versions are the same, and that the aarch64 hook would > return false. That makes it seem like the default implementation isn't > conservatively correct (or, alternatively, that the aarch64 code is > pessimising valid cases that the default implementation wouldn't). > > It would be good to test some target+target_version cases. It looks > like the tests in the patch are for pure target_version only.
Thank you again for the reviews! Ah I hadn't thought about that at all. I will need to consider in depth. I guess the only thing to assume in those situations is that the user has used target correctly, and so we should assume those features are correctly enabled for calls from functions annotated with target. For that I will update and add tests. For functions annotated with target+target_version the sematics are a lot less clear to me. Is it clear to you? My inital thought is we should disallow target+target_version/target_clones? > > > > - /* Function is not calling proper target clone. */ > > - if (attr_target2 == NULL_TREE > > - || !attribute_value_equal (attr_target, attr_target2)) > > - { > > - while (fv2->prev != NULL) > > - fv2 = fv2->prev; > > + /* If this is not the TU that contains the definition of the default > > + version we are not guaranteed to have visibility of all versions > > + so cannot reason about them. */ > > + if (!TREE_STATIC (callee_v->next->this_node->decl)) > > + continue; > > I think the important test here is instead e->binds_to_current_def_p (). > Preemption and weak symbols mean that a locally-defined function might > not be the one chosen at link or load time. Ah thank you so much! > > Thanks, > Richard The other parts you messaged about I will fix in V8. Thank you again! -- Alfie Richards