Hi all,

This is an optimisation similar to the one discussed in [1] and posted in [2].

This is slightly stronger as it makes use of the callee version information
*and* caller information, enabling slightly more cases to be covered.

This also means it can replace most of the cases that the previous optimisation
covered, where two version sets implement the same set of versions. Some cases
will be dropped where there are genuinely higher priority versions that could
be selected, but in my opinion that's okay.
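
As a rough illustration of how caller and callee information combine, here is
a small standalone model of the rule. It is only a sketch, not the code in the
patch: feature sets are modelled as bitmasks, the callee versions are assumed
to be listed in ascending priority order (default first), and the names Mask,
RNG, FLAGM, implies and pick_redirect are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using Mask = std::uint64_t;
constexpr Mask RNG = 1u << 0;    // hypothetical feature bit
constexpr Mask FLAGM = 1u << 1;  // hypothetical feature bit

// A implies B when B requires no feature that A lacks.
constexpr bool implies (Mask a, Mask b) { return (b & ~a) == 0; }

// CALLEE holds the feature masks of the callee versions in ascending
// priority order.  CALLER is the feature mask of the calling function.
// HIGHER_CALLERS holds the masks of the caller versions with a higher
// priority than the calling one (empty if the caller is unversioned).
// Returns the index of the callee version the call can be bound to, or
// nullopt if the call has to keep going through the dispatcher.
std::optional<std::size_t>
pick_redirect (const std::vector<Mask> &callee, Mask caller,
               const std::vector<Mask> &higher_callers)
{
  // Highest priority callee version the caller can always call.
  std::optional<std::size_t> best;
  for (std::size_t i = 0; i < callee.size (); ++i)
    if (implies (caller, callee[i]))
      best = i;
  if (!best)
    return std::nullopt;

  // Every higher priority callee version must imply some higher priority
  // caller version; otherwise it might be the one selected at run time.
  for (std::size_t i = *best + 1; i < callee.size (); ++i)
    {
      bool ruled_out = false;
      for (Mask hc : higher_callers)
        if (implies (callee[i], hc))
          {
            ruled_out = true;
            break;
          }
      if (!ruled_out)
        return std::nullopt;
    }
  return best;
}
```

With callee versions {default, rng, flagm, rng+flagm} and a default caller
that also has rng and flagm versions, this model binds the call to the default
callee (index 0). If the caller only has an additional flagm version, the rng
callee cannot be ruled out and the model returns nullopt.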

This requires my FMV patch series, mostly because it relies on the function
versions being sorted in priority order, but it also uses some helper functions.

Any FMV target would benefit from implementing
TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B to enable more redirection cases,
but there is a default implementation which just checks for matching target/
target_version attribute values.
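
To make the difference between the two kinds of implementation concrete, here
is a simplified standalone model. It is a sketch under assumptions rather than
the hook itself: feature sets are bitmasks, attribute values are plain strings,
and Mask, Attr, RNG, FLAGM, implies_by_features and implies_by_attribute are
hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

using Mask = std::uint64_t;
constexpr Mask RNG = 1u << 0;    // hypothetical feature bit
constexpr Mask FLAGM = 1u << 1;  // hypothetical feature bit

// Target-specific style check: A implies B when every feature required
// by B is also enabled for A.
constexpr bool implies_by_features (Mask a, Mask b) { return (b & ~a) == 0; }

// Default style check: only identical attribute values match, and a
// missing attribute never matches.
using Attr = std::optional<std::string>;
inline bool implies_by_attribute (const Attr &a, const Attr &b)
{
  return a && b && *a == *b;
}

inline void examples ()
{
  assert (implies_by_features (RNG | FLAGM, RNG));   // superset implies subset
  assert (!implies_by_features (RNG, FLAGM));        // unrelated features
  assert (implies_by_attribute (Attr ("rng"), Attr ("rng")));
  assert (!implies_by_attribute (Attr ("rng+flagm"), Attr ("rng")));
  assert (!implies_by_attribute (std::nullopt, Attr ("rng")));
}
```

The fourth assertion shows the kind of case the default misses: "rng+flagm"
does cover "rng", but an equality check on attribute values cannot see that.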

This is regression tested on aarch64 and ix86 linux-gnu.
(Notably this includes gcc/testsuite/g++.target/i386/pr82625.C which tests
the previous optimisation).

I've made a forgejo PR here if reviewers want to try that:
https://forge.sourceware.org/alfie.richards/gcc-TEST/pulls/2

[1] https://patchwork.sourceware.org/comment/197172/
[2] https://gcc.gnu.org/pipermail/gcc-patches/2025-April/680876.html

Kind regards,
Alfie Richards

-- >8 --

Adds an optimisation in FMV to redirect to a specific target if possible.

A call is redirected to a specific target if both:
- the caller can always call the callee version
- and it is possible to rule out all higher priority versions of the callee
  FMV set. That is established either by the callee being the highest priority
  version, or by each higher priority version of the callee implying that, were
  it resolved, a higher priority version of the caller would have been selected.

For this logic, introduces the new TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B
hook. Adds a full implementation for AArch64, and a weaker default version
for other targets.

This allows the new optimisation to replace the previous one, as it covers the
same case where two function sets implement the same versions.

gcc/ChangeLog:

        * config/aarch64/aarch64.cc (aarch64_version_a_implies_version_b): New
        function.
        (TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B): New define.
        * doc/tm.texi: Regenerate.
        * doc/tm.texi.in: Add documentation for version_a_implies_version_b.
        * multiple_target.cc (redirect_to_specific_clone): Add new optimisation
        logic.
        (ipa_target_clone): Remove TARGET_HAS_FMV_TARGET_ATTRIBUTE check.
        * target.def: Add version_a_implies_version_b hook.
        * attribs.cc (version_a_implies_version_b): New function.
        * attribs.h (version_a_implies_version_b): New declaration.

gcc/testsuite/ChangeLog:

        * g++.target/aarch64/fmv-selection1.C: New test.
        * g++.target/aarch64/fmv-selection2.C: New test.
        * g++.target/aarch64/fmv-selection3.C: New test.
        * g++.target/aarch64/fmv-selection4.C: New test.
        * g++.target/aarch64/fmv-selection5.C: New test.
        * g++.target/aarch64/fmv-selection6.C: New test.
---
 gcc/attribs.cc                                | 16 ++++
 gcc/attribs.h                                 |  1 +
 gcc/config/aarch64/aarch64.cc                 | 26 +++++
 gcc/doc/tm.texi                               |  4 +
 gcc/doc/tm.texi.in                            |  2 +
 gcc/multiple_target.cc                        | 96 ++++++++++++-------
 gcc/target.def                                |  9 ++
 .../g++.target/aarch64/fmv-selection1.C       | 40 ++++++++
 .../g++.target/aarch64/fmv-selection2.C       | 40 ++++++++
 .../g++.target/aarch64/fmv-selection3.C       | 25 +++++
 .../g++.target/aarch64/fmv-selection4.C       | 30 ++++++
 .../g++.target/aarch64/fmv-selection5.C       | 28 ++++++
 .../g++.target/aarch64/fmv-selection6.C       | 27 ++++++
 13 files changed, 311 insertions(+), 33 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection2.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection3.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection4.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection5.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection6.C

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index 2ca82674f7c..66c77904404 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1095,6 +1095,22 @@ common_function_versions (string_slice fn1 ATTRIBUTE_UNUSED,
   gcc_unreachable ();
 }
 
+bool
+version_a_implies_version_b (tree fn1, tree fn2)
+{
+  const char *attr_name = TARGET_HAS_FMV_TARGET_ATTRIBUTE
+                         ? "target"
+                         : "target_version";
+
+  tree attr1 = lookup_attribute (attr_name, DECL_ATTRIBUTES (fn1));
+  tree attr2 = lookup_attribute (attr_name, DECL_ATTRIBUTES (fn2));
+
+  if (!attr1 || !attr2)
+    return false;
+
+  return attribute_value_equal (attr1, attr2);
+}
+
 /* Comparator function to be used in qsort routine to sort attribute
    specification strings to "target".  */
 
diff --git a/gcc/attribs.h b/gcc/attribs.h
index fc343c0eab5..b846ce0d3a2 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -58,6 +58,7 @@ extern bool common_function_versions (string_slice, string_slice);
 extern bool reject_target_clone_version (string_slice, location_t);
 extern tree make_dispatcher_decl (const tree);
 extern bool is_function_default_version (const tree);
+extern bool version_a_implies_version_b (tree, tree);
 extern void handle_ignored_attributes_option (vec<char *> *);
 
 /* Return a type like TTYPE except that its TYPE_ATTRIBUTES
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index d48abb8f800..833276820e3 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -20260,6 +20260,29 @@ aarch64_compare_version_priority (tree decl1, tree decl2)
   return compare_feature_masks (mask1, mask2);
 }
 
+/* Check if version a implies version b.  */
+bool
+aarch64_version_a_implies_version_b (tree decl_a, tree decl_b)
+{
+  auto a_isa = aarch64_get_isa_flags
+                (TREE_TARGET_OPTION (aarch64_fndecl_options (decl_a)));
+  auto b_isa = aarch64_get_isa_flags
+                (TREE_TARGET_OPTION (aarch64_fndecl_options (decl_b)));
+
+  auto a_version = get_target_version (decl_a);
+  auto b_version = get_target_version (decl_b);
+  if (a_version.is_valid ())
+    aarch64_parse_fmv_features (a_version, &a_isa, NULL, NULL);
+  if (b_version.is_valid ())
+    aarch64_parse_fmv_features (b_version, &b_isa, NULL, NULL);
+
+  /* Are there any bits of b that aren't in a?  */
+  if (b_isa & (~a_isa))
+    return false;
+
+  return true;
+}
+
 /* Build the struct __ifunc_arg_t type:
 
    struct __ifunc_arg_t
@@ -32059,6 +32082,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_COMPARE_VERSION_PRIORITY
 #define TARGET_COMPARE_VERSION_PRIORITY aarch64_compare_version_priority
 
+#undef TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B
+#define TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B aarch64_version_a_implies_version_b
+
 #undef TARGET_GENERATE_VERSION_DISPATCHER_BODY
 #define TARGET_GENERATE_VERSION_DISPATCHER_BODY \
   aarch64_generate_version_dispatcher_body
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index a6e9b2fcc0f..0ae02296199 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10967,6 +10967,10 @@ versions if and only if they imply different target specific attributes,
 that is, they are compiled for different target machines.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B (tree @var{v_a}, tree @var{v_b})
+This target hook returns @code{true} if the target implied by @var{v_a} (with the globally enabled extensions) is a super-set of the features required for @var{v_b}.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_CAN_INLINE_P (tree @var{caller}, tree @var{callee})
 This target hook returns @code{false} if the @var{caller} function
 cannot inline @var{callee}, based on target specific information.  By
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 5db7917e214..4d0024d142b 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7130,6 +7130,8 @@ with the target specific attributes.  The default value is @code{','}.
 
 @hook TARGET_OPTION_FUNCTION_VERSIONS
 
+@hook TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B
+
 @hook TARGET_CAN_INLINE_P
 
 @hook TARGET_UPDATE_IPA_FN_TARGET_INFO
diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
index 7784478d8e2..093fbb7e004 100644
--- a/gcc/multiple_target.cc
+++ b/gcc/multiple_target.cc
@@ -437,47 +437,78 @@ expand_target_clones (struct cgraph_node *node, bool definition)
 static void
 redirect_to_specific_clone (cgraph_node *node)
 {
-  cgraph_function_version_info *fv = node->function_version ();
-  if (fv == NULL)
-    return;
-
-  gcc_assert (TARGET_HAS_FMV_TARGET_ATTRIBUTE);
-  tree attr_target = lookup_attribute ("target", DECL_ATTRIBUTES (node->decl));
-  if (attr_target == NULL_TREE)
+  if (!targetm.compare_version_priority
+      || !targetm.target_option.version_a_implies_version_b
+      || !optimize)
     return;
 
   /* We need to remember NEXT_CALLER as it could be modified in the loop.  */
   for (cgraph_edge *e = node->callees; e ; e = e->next_callee)
     {
-      cgraph_function_version_info *fv2 = e->callee->function_version ();
-      if (!fv2)
+      /* Only if this is a call to a dispatched symbol.  */
+      if (!e->callee->dispatcher_function)
        continue;
 
-      tree attr_target2 = lookup_attribute ("target",
-                                           DECL_ATTRIBUTES (e->callee->decl));
+      cgraph_function_version_info *callee_v
+       = e->callee->function_version ();
+      cgraph_function_version_info *caller_v
+       = e->caller->function_version ();
 
-      /* Function is not calling proper target clone.  */
-      if (attr_target2 == NULL_TREE
-         || !attribute_value_equal (attr_target, attr_target2))
-       {
-         while (fv2->prev != NULL)
-           fv2 = fv2->prev;
+      /* If this is not the TU that contains the definition of the default
+        version we are not guaranteed to have visibility of all versions
+        so cannot reason about them.  */
+      if (!TREE_STATIC (callee_v->next->this_node->decl))
+       continue;
 
-         /* Try to find a clone with equal target attribute.  */
-         for (; fv2 != NULL; fv2 = fv2->next)
-           {
-             cgraph_node *callee = fv2->this_node;
-             attr_target2 = lookup_attribute ("target",
-                                              DECL_ATTRIBUTES (callee->decl));
-             if (attr_target2 != NULL_TREE
-                 && attribute_value_equal (attr_target, attr_target2))
+      cgraph_function_version_info *highest_callable_fn = NULL;
+      for (cgraph_function_version_info *ver = callee_v->next;
+          ver;
+          ver = ver->next)
+       if (targetm.target_option.version_a_implies_version_b
+             (node->decl, ver->this_node->decl))
+         highest_callable_fn = ver;
+
+      if (!highest_callable_fn)
+       continue;
+
+      /* If there are higher priority versions of callee and caller has no
+        more version information, then not callable.  */
+      if (!caller_v && highest_callable_fn->next)
+       continue;
+
+      bool inlinable = true;
+      /* If every higher priority version would imply a higher priority
+        version of caller would have been selected, then this is
+        callable.  */
+      if (caller_v)
+       for (cgraph_function_version_info *callee_ver
+              = highest_callable_fn->next;
+            callee_ver;
+            callee_ver = callee_ver->next)
+         {
+           bool not_possible = false;
+           for (cgraph_function_version_info *caller_ver = caller_v->next;
+                caller_ver;
+                caller_ver = caller_ver->next)
+             if (targetm.target_option.version_a_implies_version_b
+                   (callee_ver->this_node->decl,
+                    caller_ver->this_node->decl))
                {
-                 e->redirect_callee (callee);
-                 cgraph_edge::redirect_call_stmt_to_callee (e);
+                 not_possible = true;
                  break;
                }
-           }
-       }
+           if (!not_possible)
+             {
+               inlinable = false;
+               break;
+             }
+         }
+      if (!inlinable)
+       continue;
+
+      e->redirect_callee (highest_callable_fn->this_node);
+      cgraph_edge::redirect_call_stmt_to_callee (e);
     }
 }
 
@@ -555,9 +586,8 @@ ipa_target_clone (bool early)
   for (unsigned i = 0; i < to_dispatch.length (); i++)
     create_dispatcher_calls (to_dispatch[i]);
 
-  if (TARGET_HAS_FMV_TARGET_ATTRIBUTE)
-    FOR_EACH_FUNCTION (node)
-      redirect_to_specific_clone (node);
+  FOR_EACH_FUNCTION (node)
+    redirect_to_specific_clone (node);
 
   return 0;
 }
diff --git a/gcc/target.def b/gcc/target.def
index 28b5f95434e..07e04403d0b 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6919,6 +6919,15 @@ that is, they are compiled for different target machines.",
  bool, (string_slice fn1, string_slice fn2),
  hook_stringslice_stringslice_unreachable)
 
+/* Target hook to check whether one function version implies another.  */
+DEFHOOK
+(version_a_implies_version_b,
+ "This target hook returns @code{true} if the target implied by @var{v_a}\
+ (with the globally enabled extensions) is a super-set\
+ of the features required for @var{v_b}.",
+ bool, (tree v_a, tree v_b),
+ version_a_implies_version_b)
+
 /* Function to determine if one function can inline another function.  */
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_"
diff --git a/gcc/testsuite/g++.target/aarch64/fmv-selection1.C b/gcc/testsuite/g++.target/aarch64/fmv-selection1.C
new file mode 100644
index 00000000000..4ee54466c13
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/fmv-selection1.C
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -march=armv8-a" } */
+
+__attribute__((target_version("default")))
+__attribute__((optimize("O0")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target_version("rng")))
+__attribute__((optimize("O0")))
+int foo ()
+{
+  return 2;
+}
+
+__attribute__((target_version("flagm")))
+__attribute__((optimize("O0")))
+int foo ()
+{
+  return 3;
+}
+
+__attribute__((target_version("rng+flagm")))
+__attribute__((optimize("O0")))
+int foo ()
+{
+  return 4;
+}
+
+int bar()
+{
+  return foo ();
+}
+
+/* Cannot optimize */
+/* { dg-final { scan-assembler-times "\n\tb\t_Z3foov\n" 1 } } */
+
diff --git a/gcc/testsuite/g++.target/aarch64/fmv-selection2.C b/gcc/testsuite/g++.target/aarch64/fmv-selection2.C
new file mode 100644
index 00000000000..f580dac4458
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/fmv-selection2.C
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -march=armv8-a+rng+flagm" } */
+
+__attribute__((target_version("default")))
+__attribute__((optimize("O0")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target_version("rng")))
+__attribute__((optimize("O0")))
+int foo ()
+{
+  return 2;
+}
+
+__attribute__((target_version("flagm")))
+__attribute__((optimize("O0")))
+int foo ()
+{
+  return 3;
+}
+
+__attribute__((target_version("rng+flagm")))
+__attribute__((optimize("O0")))
+int foo ()
+{
+  return 4;
+}
+
+int bar()
+{
+  return foo ();
+}
+
+/* Can optimize to highest priority function */
+/* { dg-final { scan-assembler-times "\n\tb\t_Z3foov\._MrngMflagm\n" 1 } } */
+
diff --git a/gcc/testsuite/g++.target/aarch64/fmv-selection3.C b/gcc/testsuite/g++.target/aarch64/fmv-selection3.C
new file mode 100644
index 00000000000..6b52fd4f644
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/fmv-selection3.C
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -march=armv8-a" } */
+
+__attribute__((target_version("default")))
+__attribute__((optimize("O0")))
+int foo ()
+{ return 1; }
+
+__attribute__((target_version("rng")))
+int foo ();
+__attribute__((target_version("flagm")))
+int foo ();
+__attribute__((target_version("rng+flagm")))
+int foo ();
+
+__attribute__((target_version("rng+flagm")))
+int bar()
+{
+  return foo ();
+}
+
+/* Can optimize to highest priority function */
+/* { dg-final { scan-assembler-times "\n\tb\t_Z3foov\._MrngMflagm\n" 1 } } */
+
diff --git a/gcc/testsuite/g++.target/aarch64/fmv-selection4.C b/gcc/testsuite/g++.target/aarch64/fmv-selection4.C
new file mode 100644
index 00000000000..155145dcd88
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/fmv-selection4.C
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -march=armv8-a" } */
+
+__attribute__((target_version("default")))
+__attribute__((optimize("O0")))
+int foo ()
+{ return 1; }
+
+__attribute__((target_version("rng")))
+int foo ();
+__attribute__((target_version("flagm")))
+int foo ();
+__attribute__((target_version("rng+flagm")))
+int foo ();
+
+__attribute__((target_version("default")))
+int bar()
+{
+  return foo ();
+}
+
+__attribute__((target_version("rng")))
+int bar();
+
+__attribute__((target_version("flagm")))
+int bar();
+
+/* { dg-final { scan-assembler-times "\n\tb\t_Z3foov\.default\n" 1 } } */
+
diff --git a/gcc/testsuite/g++.target/aarch64/fmv-selection5.C b/gcc/testsuite/g++.target/aarch64/fmv-selection5.C
new file mode 100644
index 00000000000..4d6d38e3754
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/fmv-selection5.C
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -march=armv8-a" } */
+
+__attribute__((target_version("default")))
+__attribute__((optimize("O0")))
+int foo ()
+{ return 1; }
+
+__attribute__((target_version("rng")))
+int foo ();
+__attribute__((target_version("flagm")))
+int foo ();
+__attribute__((target_version("rng+flagm")))
+int foo ();
+
+__attribute__((target_version("default")))
+int bar()
+{
+  return foo ();
+}
+
+__attribute__((target_version("flagm")))
+int bar();
+
+/* { dg-final { scan-assembler-times "\n\tb\t_Z3foov\.default\n" 0 } } */
+/* { dg-final { scan-assembler-times "\n\tb\t_Z3foov\n" 1 } } */
+
diff --git a/gcc/testsuite/g++.target/aarch64/fmv-selection6.C b/gcc/testsuite/g++.target/aarch64/fmv-selection6.C
new file mode 100644
index 00000000000..db384e16c09
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/fmv-selection6.C
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -march=armv8-a+rng" } */
+
+__attribute__((target_version("default")))
+__attribute__((optimize("O0")))
+int foo ()
+{ return 1; }
+
+__attribute__((target_version("rng")))
+int foo ();
+__attribute__((target_version("flagm")))
+int foo ();
+__attribute__((target_version("rng+flagm")))
+int foo ();
+
+__attribute__((target_version("default")))
+int bar()
+{
+  return foo ();
+}
+
+__attribute__((target_version("flagm")))
+int bar();
+
+/* { dg-final { scan-assembler-times "\n\tb\t_Z3foov\._Mrng\n" 1 } } */
+
-- 
2.34.1
