I recently found that target_clones functions cannot inline even when
the caller has exactly the same target. However, if we only use target
attributes in C++ and let the compiler generate IFUNC for us, the
functions with the same target will be inlined.

For example, the following code compiled on x86-64 target with -O3 will
generate IFUNC for foo and bar and inline foo into the bar:

```cpp
__attribute__((target("default")))
int foo(int *arr) {
    int sum = 0;
    for (int i=0;i<16;i++) sum += arr[i];
    return sum;
}

__attribute__((target("avx2")))
int foo(int *arr) {
    int sum = 0;
    for (int i=0;i<16;i++) sum += arr[i];
    return sum;
}

__attribute__((target("default")))
int bar(int *arr) {
    return foo(arr);
}

__attribute__((target("avx2")))
int bar(int *arr) {
    return foo(arr);
}
```

However, if we use target_clones attribute, the target_clones functions
will not be inlined:

```cpp
__attribute__((target_clones("default","avx2")))
int foo(int *arr) {
    int sum = 0;
    for (int i=0;i<16;i++) sum += arr[i];
    return sum;
}

__attribute__((target_clones("default","avx2")))
int bar(int *arr) {
    return foo(arr);
}
```

This behavior may negatively impact performance since the target_clones
functions are not inlined. And since we didn't jump to the target_clones
functions based on PLT but used the same target as the caller's target.
I think it's better to allow the target_clones functions to be inlined.

gcc/ada/ChangeLog:

        * gcc-interface/utils.cc (handle_target_clones_attribute):
        Allow functions with target_clones attribute to be inlined.

gcc/c-family/ChangeLog:

        * c-attribs.cc (handle_target_clones_attribute):
        Allow functions with target_clones attribute to be inlined.

gcc/d/ChangeLog:

        * d-attribs.cc (d_handle_target_clones_attribute):
        Allow functions with target_clones attribute to be inlined.

Signed-off-by: Yangyu Chen <chenyan...@isrc.iscas.ac.cn>
---
 gcc/ada/gcc-interface/utils.cc | 5 +----
 gcc/c-family/c-attribs.cc      | 3 ---
 gcc/d/d-attribs.cc             | 5 -----
 3 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/gcc/ada/gcc-interface/utils.cc b/gcc/ada/gcc-interface/utils.cc
index 60f36b1e50d..d010b684177 100644
--- a/gcc/ada/gcc-interface/utils.cc
+++ b/gcc/ada/gcc-interface/utils.cc
@@ -7299,10 +7299,7 @@ handle_target_clones_attribute (tree *node, tree name, 
tree ARG_UNUSED (args),
                          int ARG_UNUSED (flags), bool *no_add_attrs)
 {
   /* Ensure we have a function type.  */
-  if (TREE_CODE (*node) == FUNCTION_DECL)
-    /* Do not inline functions with multiple clone targets.  */
-    DECL_UNINLINABLE (*node) = 1;
-  else
+  if (TREE_CODE (*node) != FUNCTION_DECL)
     {
       warning (OPT_Wattributes, "%qE attribute ignored", name);
       *no_add_attrs = true;
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 4dd2eecbea5..f8759bb1908 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -6105,9 +6105,6 @@ handle_target_clones_attribute (tree *node, tree name, 
tree ARG_UNUSED (args),
                   "single %<target_clones%> attribute is ignored");
          *no_add_attrs = true;
        }
-      else
-      /* Do not inline functions with multiple clone targets.  */
-       DECL_UNINLINABLE (*node) = 1;
     }
   else
     {
diff --git a/gcc/d/d-attribs.cc b/gcc/d/d-attribs.cc
index 0f7ca10e017..9f67415adb1 100644
--- a/gcc/d/d-attribs.cc
+++ b/gcc/d/d-attribs.cc
@@ -788,11 +788,6 @@ d_handle_target_clones_attribute (tree *node, tree name, 
tree, int,
       warning (OPT_Wattributes, "%qE attribute ignored", name);
       *no_add_attrs = true;
     }
-  else
-    {
-      /* Do not inline functions with multiple clone targets.  */
-      DECL_UNINLINABLE (*node) = 1;
-    }
 
   return NULL_TREE;
 }
-- 
2.45.2

Reply via email to