I recently found that target_clones functions cannot be inlined even when the caller has exactly the same target. However, if we use only target attributes in C++ and let the compiler generate the IFUNC for us, functions with the same target are inlined.

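To make the dispatch side concrete, here is a minimal hand-written sketch of roughly what the generated IFUNC dispatch amounts to. It assumes x86-64 and a compiler that provides `__builtin_cpu_init` and `__builtin_cpu_supports` (GCC does). The names `sum16_*`, `total_*` and `total` are hypothetical and not part of this patch, and the real resolver runs once at load time rather than on every call.

```cpp
// Hypothetical illustration only; assumes x86-64 with GCC-style builtins.

// Baseline version, compiled for the default target.
static int sum16_default(const int *arr)
{
  int sum = 0;
  for (int i = 0; i < 16; i++)
    sum += arr[i];
  return sum;
}

// AVX2 version: same source, different target.
__attribute__((target("avx2")))
static int sum16_avx2(const int *arr)
{
  int sum = 0;
  for (int i = 0; i < 16; i++)
    sum += arr[i];
  return sum;
}

// Each caller has the same target as its callee, so the call can be
// bound directly and is a candidate for inlining.
static int total_default(const int *arr)
{
  return sum16_default(arr);
}

__attribute__((target("avx2")))
static int total_avx2(const int *arr)
{
  return sum16_avx2(arr);
}

// Stand-in for the resolver: choose a version from the CPU features.
int total(const int *arr)
{
  __builtin_cpu_init();
  return __builtin_cpu_supports("avx2") ? total_avx2(arr)
                                        : total_default(arr);
}
```

Only the outermost entry point needs a runtime check here; inside each versioned caller the callee is already known at compile time.
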
For example, the following code, compiled for an x86-64 target with -O3, will generate an IFUNC for foo and bar and inline foo into bar:

```cpp
__attribute__((target("default")))
int foo(int *arr) {
    int sum = 0;
    for (int i=0;i<16;i++) sum += arr[i];
    return sum;
}

__attribute__((target("avx2")))
int foo(int *arr) {
    int sum = 0;
    for (int i=0;i<16;i++) sum += arr[i];
    return sum;
}

__attribute__((target("default")))
int bar(int *arr) {
    return foo(arr);
}

__attribute__((target("avx2")))
int bar(int *arr) {
    return foo(arr);
}
```

However, if we use the target_clones attribute, the target_clones functions will not be inlined:

```cpp
__attribute__((target_clones("default","avx2")))
int foo(int *arr) {
    int sum = 0;
    for (int i=0;i<16;i++) sum += arr[i];
    return sum;
}

__attribute__((target_clones("default","avx2")))
int bar(int *arr) {
    return foo(arr);
}
```

This behavior may negatively impact performance, since the target_clones functions are not inlined. Because a call from one target_clones function to another does not go through the PLT but is bound directly to the clone that has the same target as the caller, I think it's better to allow target_clones functions to be inlined.

gcc/ada/ChangeLog:

	* gcc-interface/utils.cc (handle_target_clones_attribute): Allow
	functions with target_clones attribute to be inlined.

gcc/c-family/ChangeLog:

	* c-attribs.cc (handle_target_clones_attribute): Allow
	functions with target_clones attribute to be inlined.

gcc/d/ChangeLog:

	* d-attribs.cc (d_handle_target_clones_attribute): Allow
	functions with target_clones attribute to be inlined.

Signed-off-by: Yangyu Chen <chenyan...@isrc.iscas.ac.cn>
---
 gcc/ada/gcc-interface/utils.cc | 5 +----
 gcc/c-family/c-attribs.cc      | 3 ---
 gcc/d/d-attribs.cc             | 5 -----
 3 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/gcc/ada/gcc-interface/utils.cc b/gcc/ada/gcc-interface/utils.cc
index 60f36b1e50d..d010b684177 100644
--- a/gcc/ada/gcc-interface/utils.cc
+++ b/gcc/ada/gcc-interface/utils.cc
@@ -7299,10 +7299,7 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
                                 int ARG_UNUSED (flags), bool *no_add_attrs)
 {
   /* Ensure we have a function type. */
-  if (TREE_CODE (*node) == FUNCTION_DECL)
-    /* Do not inline functions with multiple clone targets. */
-    DECL_UNINLINABLE (*node) = 1;
-  else
+  if (TREE_CODE (*node) != FUNCTION_DECL)
     {
       warning (OPT_Wattributes, "%qE attribute ignored", name);
       *no_add_attrs = true;
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 4dd2eecbea5..f8759bb1908 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -6105,9 +6105,6 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
                    "single %<target_clones%> attribute is ignored");
           *no_add_attrs = true;
         }
-      else
-        /* Do not inline functions with multiple clone targets. */
-        DECL_UNINLINABLE (*node) = 1;
     }
   else
     {
diff --git a/gcc/d/d-attribs.cc b/gcc/d/d-attribs.cc
index 0f7ca10e017..9f67415adb1 100644
--- a/gcc/d/d-attribs.cc
+++ b/gcc/d/d-attribs.cc
@@ -788,11 +788,6 @@ d_handle_target_clones_attribute (tree *node, tree name, tree, int,
       warning (OPT_Wattributes, "%qE attribute ignored", name);
       *no_add_attrs = true;
     }
-  else
-    {
-      /* Do not inline functions with multiple clone targets. */
-      DECL_UNINLINABLE (*node) = 1;
-    }
 
   return NULL_TREE;
 }
-- 
2.45.2