[COMMITED] MAINTAINERS: Add myself to write after approval
ChangeLog: * MAINTAINERS: Add myself to write after approval Signed-off-by: Josef Melcr --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 4a53521d8eb..f94aa9aeb79 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -657,6 +657,7 @@ Bryce McKinlay bryce Adam Megacz - Bingfeng Meimeibf Michael Meissnermeissner +Josef Melcr - Jason Merrill jason Jim Meyering- Martin Michlmayrtbm -- 2.47.0
Re: [PATCH] cgraph: remove dead if stmt in build_cgraph_edges pass
So I experimented a little and ran the testsuite a few times. While both if statements seem to be dead, the assertion gcc_checking_assert (!is_gimple_omp (stmt)) doesn't actually hold, as adding this assert breaks around 40 omp/oacc tests, so some other statements are definitely slipping through. I would need to dig deeper to see which statements those are. I am not quite sure where that leaves this patch. Best regards, Josef Melcr Dne 24. 10. 24 v 19:45 Josef Melcr napsal(a): Capital Remove The second line should be just tab indented, not tab + 2 spaces, and finished with dot. gomp_parallel rather than gomp-parallel. Sorry about the formatting issues, I didn't notice them. The if (gimple_code (stmt) == GIMPLE_OMP_TASK) case should go as well. Wonder if gcc_checking_assert (!is_gimple_omp (stmt)); wouldn't be useful as replacement at least for some time to verify it isn't needed (but if it would be needed, we miss various other GIMPLE_OMP_* statements). The GIMPLE_OMP_TASK case went under my radar, since I was primarily focused on gomp_parallel statements. I'll try these changes and retest. Best regards, Josef Melcr Dne 24. 10. 24 v 18:49 Jakub Jelinek napsal(a): On Thu, Oct 24, 2024 at 04:37:24PM +0200, Josef Melcr wrote: This patch removes a dead if statement checking for gomp-parallel gimple statements. This if is in the execute method of build_cgraph_edges pass, which is executed right after the omp_expand pass, which removes these gimple statements and replaces them with simple gcalls, making this if practically dead. Some TSan tests are failing with this patch, but I don't think this change is likely to cause these failures. Additionally, the failures are not consistent across runs, making me think these failures are a bug in TSan itself. All other tests are ok. Tested on x86_64-pc-linux-gnu. OK for master ? gcc/ChangeLog: * cgraphbuild.cc (pass_build_cgraph_edges::execute): remove if Capital Remove statement checking for gomp-parallel statements The second line should be just tab indented, not tab + 2 spaces, and finished with dot. gomp_parallel rather than gomp-parallel. --- a/gcc/cgraphbuild.cc +++ b/gcc/cgraphbuild.cc @@ -340,12 +340,6 @@ pass_build_cgraph_edges::execute (function *fun) bb->count); } node->record_stmt_references (stmt); - if (gomp_parallel *omp_par_stmt = dyn_cast (stmt)) - { - tree fn = gimple_omp_parallel_child_fn (omp_par_stmt); - node->create_reference (cgraph_node::get_create (fn), - IPA_REF_ADDR, stmt); - } if (gimple_code (stmt) == GIMPLE_OMP_TASK) { tree fn = gimple_omp_task_child_fn (stmt); The if (gimple_code (stmt) == GIMPLE_OMP_TASK) case should go as well. Wonder if gcc_checking_assert (!is_gimple_omp (stmt)); wouldn't be useful as replacement at least for some time to verify it isn't needed (but if it would be needed, we miss various other GIMPLE_OMP_* statements). Jakub smime.p7s Description: Elektronicky podpis S/MIME
Re: [PATCH] cgraph: remove dead if stmt in build_cgraph_edges pass
Capital Remove The second line should be just tab indented, not tab + 2 spaces, and finished with dot. gomp_parallel rather than gomp-parallel. Sorry about the formatting issues, I didn't notice them. The if (gimple_code (stmt) == GIMPLE_OMP_TASK) case should go as well. Wonder if gcc_checking_assert (!is_gimple_omp (stmt)); wouldn't be useful as replacement at least for some time to verify it isn't needed (but if it would be needed, we miss various other GIMPLE_OMP_* statements). The GIMPLE_OMP_TASK case went under my radar, since I was primarily focused on gomp_parallel statements. I'll try these changes and retest. Best regards, Josef Melcr Dne 24. 10. 24 v 18:49 Jakub Jelinek napsal(a): On Thu, Oct 24, 2024 at 04:37:24PM +0200, Josef Melcr wrote: This patch removes a dead if statement checking for gomp-parallel gimple statements. This if is in the execute method of build_cgraph_edges pass, which is executed right after the omp_expand pass, which removes these gimple statements and replaces them with simple gcalls, making this if practically dead. Some TSan tests are failing with this patch, but I don't think this change is likely to cause these failures. Additionally, the failures are not consistent across runs, making me think these failures are a bug in TSan itself. All other tests are ok. Tested on x86_64-pc-linux-gnu. OK for master ? gcc/ChangeLog: * cgraphbuild.cc (pass_build_cgraph_edges::execute): remove if Capital Remove statement checking for gomp-parallel statements The second line should be just tab indented, not tab + 2 spaces, and finished with dot. gomp_parallel rather than gomp-parallel. --- a/gcc/cgraphbuild.cc +++ b/gcc/cgraphbuild.cc @@ -340,12 +340,6 @@ pass_build_cgraph_edges::execute (function *fun) bb->count); } node->record_stmt_references (stmt); - if (gomp_parallel *omp_par_stmt = dyn_cast (stmt)) - { - tree fn = gimple_omp_parallel_child_fn (omp_par_stmt); - node->create_reference (cgraph_node::get_create (fn), - IPA_REF_ADDR, stmt); - } if (gimple_code (stmt) == GIMPLE_OMP_TASK) { tree fn = gimple_omp_task_child_fn (stmt); The if (gimple_code (stmt) == GIMPLE_OMP_TASK) case should go as well. Wonder if gcc_checking_assert (!is_gimple_omp (stmt)); wouldn't be useful as replacement at least for some time to verify it isn't needed (but if it would be needed, we miss various other GIMPLE_OMP_* statements). Jakub smime.p7s Description: Elektronicky podpis S/MIME
[PATCH] cgraph: remove dead if stmt in build_cgraph_edges pass
This patch removes a dead if statement checking for gomp-parallel gimple statements. This if is in the execute method of build_cgraph_edges pass, which is executed right after the omp_expand pass, which removes these gimple statements and replaces them with simple gcalls, making this if practically dead. Some TSan tests are failing with this patch, but I don't think this change is likely to cause these failures. Additionally, the failures are not consistent across runs, making me think these failures are a bug in TSan itself. All other tests are ok. Tested on x86_64-pc-linux-gnu. OK for master ? gcc/ChangeLog: * cgraphbuild.cc (pass_build_cgraph_edges::execute): remove if statement checking for gomp-parallel statements Signed-off-by: Josef Melcr --- gcc/cgraphbuild.cc | 6 -- 1 file changed, 6 deletions(-) diff --git a/gcc/cgraphbuild.cc b/gcc/cgraphbuild.cc index 8852ba9ee92..191ec1077e4 100644 --- a/gcc/cgraphbuild.cc +++ b/gcc/cgraphbuild.cc @@ -340,12 +340,6 @@ pass_build_cgraph_edges::execute (function *fun) bb->count); } node->record_stmt_references (stmt); - if (gomp_parallel *omp_par_stmt = dyn_cast (stmt)) - { - tree fn = gimple_omp_parallel_child_fn (omp_par_stmt); - node->create_reference (cgraph_node::get_create (fn), - IPA_REF_ADDR, stmt); - } if (gimple_code (stmt) == GIMPLE_OMP_TASK) { tree fn = gimple_omp_task_child_fn (stmt); -- 2.47.0
Re: [PATCH 1/8] ipa: Fix jump function copying
Hi! On 11/5/24 12:06, Martin Jambor wrote: +/* Copy information from SRC_JF to DST_JF which correstpond to call graph edges + SRC and DST. */ + +static void +ipa_duplicate_jump_function (cgraph_edge *src, cgraph_edge *dst, +ipa_jump_func *src_jf, ipa_jump_func *dst_jf) I am not sure this function copies over all necessary information, e.g. jump function type. It will work fine when used in the duplication hook, as the dst jump functions are created through new_args->jump_functions = vec_safe_copy (old_args->jump_functions); so all other information is copied over, but I don't think that's the case when using this function on its own. When a jump function is constructed through different means, will the contents of the jump functions still match ? Best regards, Josef Melcr
Re: [PATCH] cgraph: remove dead if stmt in build_cgraph_edges pass
Hi, I've looked at the statements slipping through and they are all atomic loads and stores. What I find strange is these statements don't get expanded only in tests for errors, when the code is not supposed to compile. In all other cases all statements get expanded just fine, including atomic loads and stores. I am kind of at a loss for why that's happening, I was hoping maybe somebody here could provide some insight ? Thanks in advance :) Best regards, Josef Melcr Dne 25. 10. 24 v 7:19 Josef Melcr napsal(a): So I experimented a little and ran the testsuite a few times. While both if statements seem to be dead, the assertion gcc_checking_assert (!is_gimple_omp (stmt)) doesn't actually hold, as adding this assert breaks around 40 omp/oacc tests, so some other statements are definitely slipping through. I would need to dig deeper to see which statements those are. I am not quite sure where that leaves this patch. Best regards, Josef Melcr Dne 24. 10. 24 v 19:45 Josef Melcr napsal(a): Capital Remove The second line should be just tab indented, not tab + 2 spaces, and finished with dot. gomp_parallel rather than gomp-parallel. Sorry about the formatting issues, I didn't notice them. The if (gimple_code (stmt) == GIMPLE_OMP_TASK) case should go as well. Wonder if gcc_checking_assert (!is_gimple_omp (stmt)); wouldn't be useful as replacement at least for some time to verify it isn't needed (but if it would be needed, we miss various other GIMPLE_OMP_* statements). The GIMPLE_OMP_TASK case went under my radar, since I was primarily focused on gomp_parallel statements. I'll try these changes and retest. Best regards, Josef Melcr Dne 24. 10. 24 v 18:49 Jakub Jelinek napsal(a): On Thu, Oct 24, 2024 at 04:37:24PM +0200, Josef Melcr wrote: This patch removes a dead if statement checking for gomp-parallel gimple statements. This if is in the execute method of build_cgraph_edges pass, which is executed right after the omp_expand pass, which removes these gimple statements and replaces them with simple gcalls, making this if practically dead. Some TSan tests are failing with this patch, but I don't think this change is likely to cause these failures. Additionally, the failures are not consistent across runs, making me think these failures are a bug in TSan itself. All other tests are ok. Tested on x86_64-pc-linux-gnu. OK for master ? gcc/ChangeLog: * cgraphbuild.cc (pass_build_cgraph_edges::execute): remove if Capital Remove statement checking for gomp-parallel statements The second line should be just tab indented, not tab + 2 spaces, and finished with dot. gomp_parallel rather than gomp-parallel. --- a/gcc/cgraphbuild.cc +++ b/gcc/cgraphbuild.cc @@ -340,12 +340,6 @@ pass_build_cgraph_edges::execute (function *fun) bb->count); } node->record_stmt_references (stmt); - if (gomp_parallel *omp_par_stmt = dyn_cast (stmt)) - { - tree fn = gimple_omp_parallel_child_fn (omp_par_stmt); - node->create_reference (cgraph_node::get_create (fn), - IPA_REF_ADDR, stmt); - } if (gimple_code (stmt) == GIMPLE_OMP_TASK) { tree fn = gimple_omp_task_child_fn (stmt); The if (gimple_code (stmt) == GIMPLE_OMP_TASK) case should go as well. Wonder if gcc_checking_assert (!is_gimple_omp (stmt)); wouldn't be useful as replacement at least for some time to verify it isn't needed (but if it would be needed, we miss various other GIMPLE_OMP_* statements). Jakub smime.p7s Description: Elektronicky podpis S/MIME
Re: [PATCH] ipa, cgraph: Enable constant propagation to OpenMP kernels
Lambdas have crossed my mind, but I have not yet had the time to look thoroughly into their implementation and the issues they face. I do plan to look into them once I am done with some incremental improvements for the attribute and callback edges, as lambdas seem like a good candidate for this sort of thing, given their use case. Thanks, Josef Melcr On 4/27/25 19:36, Andrew Pinski wrote: On Sun, Apr 27, 2025 at 2:58 AM Josef Melcr wrote: This patch enables constant propagation to outlined OpenMP kernels and improves support for optimizing callback functions in general. It implements the attribute 'callback' as found in clang, though argument numbering is a bit different, as described below. The title says OpenMP, but it can be used for any function which takes a callback argument, such as pthread functions, qsort and others. The attribute 'callback' captures the notion of a function calling one of its arguments with some of its parameters as arguments. An OpenMP example of such function is GOMP_parallel. We implement the attribute with new callgraph edges called 'callback' edges. They are imaginary edges pointing from the caller of the function with the attribute (e.g. caller of GOMP_parallel) to the body function itself (e.g. the outlined OpenMP body). They share their call statement with the edge from which they are derived (direct edge caller -> GOMP_parallel in this case). These edges allow passes such as ipa-cp to the see the hidden call site to the body function and optimize the function accordingly. To illustrate on an example, the body GOMP_parallel looks something like this: void GOMP_parallel (void (*fn) (void *), void *data, /* ... */) { /* ... */ fn (data); /* ... */ } If we extend it with the attribute 'callback(1, 2)', we express that the function calls its first argument and passes it its second argument. This is represented in the call graph in this manner: direct indirect caller -> GOMP_parallel ---> fn | --> fn callback The direct edge is then the parent edge, with all callback edges being the child edges. While constant propagation is the main focus of this patch, callback edges can be useful for different passes (for example, it improves icf for OpenMP kernels), as they allow for address redirection. If the outlined body function gets optimized and cloned, from body_fn to body_fn.optimized, the callback edge allows us to replace the address in the arguments list: GOMP_parallel (body_fn, &data_struct, /* ... */); becomes GOMP_parallel (body_fn.optimized, &data_struct, /* ... */); This redirection is possible for any function with the attribute. This callback attribute implementation is partially compatible with clang's implementation. Its semantics, arguments and argument indexing style are the same, but we represent an unknown argument position with 0 (precedent set by attributes such as 'format'), while clang uses -1 or '?'. We also allow for multiple callback attributes on the same function, while clang only allows one. The attribute allows us to propagate constants into body functions of OpenMP constructs. Currently, GCC won't propagate the value 'c' into the OpenMP body in the following example: int a[100]; void test(int c) { #pragma omp parallel for for (int i = 0; i < c; i++) { if (!__builtin_constant_p(c)) { __builtin_abort(); } a[i] = i; } } int main() { test(100); return a[5] - 5; } With this patch, the body function will get cloned and the constant 'c' will get propagated. Bootstrapped and regtested on x86_64-linux. OK for master? This seems like it could also improve code dealing with C++ lambdas. Have you thought of that? Thanks, Andrew Thanks, Josef Melcr gcc/ChangeLog: * builtin-attrs.def (0): New int list. (ATTR_CALLBACK): Callback attribute identifier. (DEF_CALLBACK_ATTRIBUTE): Macro for callback attribute creation. (GOMP): Attributes for libgomp functions. (OACC): Attribute used for oacc functions. (ATTR_CALLBACK_GOMP_LIST): ATTR_NOTHROW_LIST but with the callback attribute added, used for many libgomp functions. (ATTR_CALLBACK_GOMP_TASK_HELPER_LIST): Helper list for the construction of ATTR_CALLBACK_GOMP_TASK_LIST. (ATTR_CALLBACK_GOMP_TASK_LIST): New attribute list for GOMP_task, includes two callback attributes. (ATTR_CALLBACK_OACC_LIST): Same as ATTR_CALLBACK_GOMP_LIST, used for oacc builtins. * cgraph.cc (cgraph_add_edge_to_call_site_hash): When hashing callback edges, always hash the parent edge. (cgraph_node::get_edge): Always return callback parent edge. (cgraph_edge::set_call_stmt): Add cascade for cal
Re: [PATCH] ipa, cgraph: Enable constant propagation to OpenMP kernels
Dne 28. 04. 25 v 10:58 Jakub Jelinek napsal(a): On Sun, Apr 27, 2025 at 11:56:21AM +0200, Josef Melcr wrote: * builtin-attrs.def (0): New int list. This is just weird. Write * builtin-attrs.def: Add DEF_LIST_INT_INT (0,2). ? +DEF_CALLBACK_ATTRIBUTE(GOMP, 1, 0) +DEF_CALLBACK_ATTRIBUTE(GOMP, 1, 2) +DEF_CALLBACK_ATTRIBUTE(OACC, 2, 4) +DEF_CALLBACK_ATTRIBUTE(GOMP, 3, 0_2) +DEF_ATTR_TREE_LIST(ATTR_CALLBACK_GOMP_LIST, ATTR_CALLBACK, + ATTR_CALLBACK_GOMP_1_2, ATTR_NOTHROW_LIST) +DEF_ATTR_TREE_LIST(ATTR_CALLBACK_GOMP_TASK_HELPER_LIST, ATTR_CALLBACK, + ATTR_CALLBACK_GOMP_1_0, ATTR_NOTHROW_LIST) +DEF_ATTR_TREE_LIST(ATTR_CALLBACK_GOMP_TASK_LIST, ATTR_CALLBACK, + ATTR_CALLBACK_GOMP_3_0_2, ATTR_CALLBACK_GOMP_TASK_HELPER_LIST) +DEF_ATTR_TREE_LIST(ATTR_CALLBACK_OACC_LIST, ATTR_CALLBACK, + ATTR_CALLBACK_OACC_2_4, ATTR_NOTHROW_LIST) Why so many tabs? Can't ATTR_CALLBACK_GOMP_[123]* go below ATTR_CALLBACK_? @@ -465,6 +466,8 @@ const struct attribute_spec c_common_gnu_attributes[] = handle_tm_attribute, NULL }, { "transaction_may_cancel_outer", 0, 0, false, true, false, false, handle_tm_attribute, NULL }, + { "callback", 1, -1, true, false, false, false, + handle_callback_attribute, NULL}, This line should be indented like for the other attributes. + /* We want to work with the parent edge whenever possible. When it comes to + callback edges, a call statement might have multiple callback edges + attached to it. These can be easily obtained from the parent edge instead. Dot should be followed by 2 spaces, not one (seen many times in the patch, both in function comments and in documentation). */ should go on the same line as the last comment line. If that would be after the 80 column limit, wrap line before the last word (so "instead. */" on the last line). + */ + if (e && e->callback) +e = e->get_callback_parent_edge (); + if (n > 100) { call_site_hash = hash_table::create_ggc (120); @@ -837,8 +855,31 @@ cgraph_edge::set_call_stmt (cgraph_edge *e, gcall *new_stmt, return e_indirect ? indirect : direct; } - if (new_direct_callee) -e = make_direct (e, new_direct_callee); + /* Callback edges also need their call stmts changed. + We can use the same flag as for speculative edges. */ Two spaces instead of one. I apologize for the formatting issues, they'll be fixed in the next version. --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -1970,6 +1970,43 @@ declares that @code{my_alloc1} returns 16-byte aligned pointers and that @code{my_alloc2} returns a pointer whose value modulo 32 is equal to 8. +@cindex @code{callback} function attribute +@item callback (@var{function-pos}, @var{...}) +The @code{callback} attribute specifies that a function takes a pointer to +a callback function as one of it's parameters and passes it some of it's own +parameters. For example: Most importantly, I wonder if it is a good idea to use same attribute names as clang when it behaves differently. Yes, it seems the clang folks screwed up the design by having it behave differently from other attributes (nonnull, format, ...), by allowing also argument names instead of numbers, haven't verified but from reading their docs it seems they actually count arguments differently in methods (nonnull, format, ... count this as 1 and user specified is 2+, while my understanding from the docs is that 1+ are the user specified arguments and 0 is this). Sure, for [[gnu::callback (...)]]] we could do whatever we want, but __attribute__((callback(...))) is the problematic one. So, I wonder if either we should implement what clang does (sure, the argument names are a little bit awkward, we'd need special case for the arguments of this attribute where it would parse an identifier or number, and later on we'd remap the identifiers to numbers. Not sure how exactly it works though, because e.g. for nonnull attribute in both compilers one can specify the numbers not just as literals, but also constant expressions: constexpr int a = 2; __attribute__((nonnull (a, a + 2))) void foo (int, void *, void *, void *); void bar (void) { foo (0, nullptr, nullptr, nullptr); } or use different name of the attribute. As for the attribute, I am honestly not too sure about what to do, as clang is not consistent in with its own indexing, be it with the unknown values, or with 'this'. I've tried to remain consistent with GCC's indexing style. I guess I'll leave up to you and the other maintainers to decide. I
[PATCH] ipa, cgraph: Enable constant propagation to OpenMP kernels
This patch enables constant propagation to outlined OpenMP kernels and improves support for optimizing callback functions in general. It implements the attribute 'callback' as found in clang, though argument numbering is a bit different, as described below. The title says OpenMP, but it can be used for any function which takes a callback argument, such as pthread functions, qsort and others. The attribute 'callback' captures the notion of a function calling one of its arguments with some of its parameters as arguments. An OpenMP example of such function is GOMP_parallel. We implement the attribute with new callgraph edges called 'callback' edges. They are imaginary edges pointing from the caller of the function with the attribute (e.g. caller of GOMP_parallel) to the body function itself (e.g. the outlined OpenMP body). They share their call statement with the edge from which they are derived (direct edge caller -> GOMP_parallel in this case). These edges allow passes such as ipa-cp to the see the hidden call site to the body function and optimize the function accordingly. To illustrate on an example, the body GOMP_parallel looks something like this: void GOMP_parallel (void (*fn) (void *), void *data, /* ... */) { /* ... */ fn (data); /* ... */ } If we extend it with the attribute 'callback(1, 2)', we express that the function calls its first argument and passes it its second argument. This is represented in the call graph in this manner: direct indirect caller -> GOMP_parallel ---> fn | --> fn callback The direct edge is then the parent edge, with all callback edges being the child edges. While constant propagation is the main focus of this patch, callback edges can be useful for different passes (for example, it improves icf for OpenMP kernels), as they allow for address redirection. If the outlined body function gets optimized and cloned, from body_fn to body_fn.optimized, the callback edge allows us to replace the address in the arguments list: GOMP_parallel (body_fn, &data_struct, /* ... */); becomes GOMP_parallel (body_fn.optimized, &data_struct, /* ... */); This redirection is possible for any function with the attribute. This callback attribute implementation is partially compatible with clang's implementation. Its semantics, arguments and argument indexing style are the same, but we represent an unknown argument position with 0 (precedent set by attributes such as 'format'), while clang uses -1 or '?'. We also allow for multiple callback attributes on the same function, while clang only allows one. The attribute allows us to propagate constants into body functions of OpenMP constructs. Currently, GCC won't propagate the value 'c' into the OpenMP body in the following example: int a[100]; void test(int c) { #pragma omp parallel for for (int i = 0; i < c; i++) { if (!__builtin_constant_p(c)) { __builtin_abort(); } a[i] = i; } } int main() { test(100); return a[5] - 5; } With this patch, the body function will get cloned and the constant 'c' will get propagated. Bootstrapped and regtested on x86_64-linux. OK for master? Thanks, Josef Melcr gcc/ChangeLog: * builtin-attrs.def (0): New int list. (ATTR_CALLBACK): Callback attribute identifier. (DEF_CALLBACK_ATTRIBUTE): Macro for callback attribute creation. (GOMP): Attributes for libgomp functions. (OACC): Attribute used for oacc functions. (ATTR_CALLBACK_GOMP_LIST): ATTR_NOTHROW_LIST but with the callback attribute added, used for many libgomp functions. (ATTR_CALLBACK_GOMP_TASK_HELPER_LIST): Helper list for the construction of ATTR_CALLBACK_GOMP_TASK_LIST. (ATTR_CALLBACK_GOMP_TASK_LIST): New attribute list for GOMP_task, includes two callback attributes. (ATTR_CALLBACK_OACC_LIST): Same as ATTR_CALLBACK_GOMP_LIST, used for oacc builtins. * cgraph.cc (cgraph_add_edge_to_call_site_hash): When hashing callback edges, always hash the parent edge. (cgraph_node::get_edge): Always return callback parent edge. (cgraph_edge::set_call_stmt): Add cascade for callback edges. (symbol_table::create_edge): Allow callback edges to share the same call statement. (cgraph_edge::make_callback): New method, derives a callback edge this method is called on. (cgraph_edge::get_callback_parent_edge): New method. (cgraph_edge::first_callback_target): New method. (cgraph_edge::next_callback_target): New method. (cgraph_edge::purge_callback_children): New method. (cgraph_edge::redirect_call_stmt_to_callee): Add callback edge redirection, set call statements for child edges when updating the
[PATCH v2] ipa, cgraph: Enable constant propagation to OpenMP kernels
t_call_stmt): Add cascade for callback child edges. (symbol_table::create_edge): Allow callback edges to share the same call statement, initialize new flags. (cgraph_edge::make_callback): New method, derives a new callback edge. (cgraph_edge::get_callback_parent_edge): New method. (cgraph_edge::first_callback_target): Likewise. (cgraph_edge::next_callback_target): Likewise. (cgraph_edge::purge_callback_children): Likewise. (cgraph_edge::redirect_callee): When redirecting a callback edge, redirects its ref as well. (cgraph_edge::redirect_call_stmt_to_callee): Add callback edge redirection, set child call stmt when setting their parent. (cgraph_node::remove_callers): Add cascade for child edges. (cgraph_edge::dump_edge_flags): Add printing for callback flags. (cgraph_node::verify_node): Add sanity checks for callback edges. * cgraph.h: Add new flags and 16 bit callback hash to cgraph_edge class. * cgraphclones.cc (cgraph_edge::clone): Copy over callback data. * ipa-cp.cc (purge_useless_callback_edges): New function, purges callback edges when needed. (ipcp_decision_stage): Call purge_useless_callback_edges. * ipa-fnsummary.cc (ipa_call_summary_t::duplicate): Add an exception for callback pairs. (analyze_function_body): Copy summary from parent to child, update the child's summary. * ipa-inline-analysis.cc (do_estimate_growth_1): Skip callback child edges when estimating growth. * ipa-inline-transform.cc (inline_transform): Add redirection cascade for child edges. * ipa-inline.cc (can_inline_edge_p): Never inline child edges. * ipa-param-manipulation.cc (drop_decl_attribute_if_params_changed_p): New function. (ipa_param_adjustments::build_new_function_type): Add args_modified out parameter. (ipa_param_adjustments::adjust_decl): Drop callback attrs when args are modified. * ipa-param-manipulation.h: Adjust decl of build_new_function_type. * ipa-prop.cc (ipa_duplicate_jump_function): Add declaration. (init_callback_edge_summary): New function. (ipa_compute_jump_functions_for_edge): Add callback edge creation logic. * lto-cgraph.cc (lto_output_edge): Stream out callback data. (input_edge): Input callback data. * omp-builtins.def (BUILT_IN_GOACC_PARALLEL): Use its corresponding callback attr list. (BUILT_IN_GOMP_PARALLEL_LOOP_STATIC): Likewise. (BUILT_IN_GOMP_PARALLEL_LOOP_GUIDED): Likewise. (BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_DYNAMIC): Likewise. (BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_RUNTIME): Likewise. (BUILT_IN_GOMP_PARALLEL): Likewise. (BUILT_IN_GOMP_PARALLEL_SECTIONS): Likewise. (BUILT_IN_GOMP_TEAMS_REG): Likewise. * tree-core.h (ECF_CB_1_2): New constant for callback(1,2). (ECF_CB_2_4): New constant for callback(2,4). * tree-inline.cc (copy_bb): Copy child edges when copying parent. (redirect_all_calls): Redirect child edges. * tree.cc (set_call_expr_flags): Create callback attrs according to the ECF_CB constants. * attr-callback.h: New file. gcc/c-family/ChangeLog: * c-attribs.cc: Define callback attribute. gcc/fortran/ChangeLog: * f95-lang.cc (ATTR_CALLBACK_GOMP_LIST): New attr list corresponding to the list in builtin-attrs.def. (ATTR_CALLBACK_OACC_LIST): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/ipa/ipcp-cb-spec1.c: New test. * gcc.dg/ipa/ipcp-cb-spec2.c: New test. * gcc.dg/ipa/ipcp-cb1.c: New test. Signed-off-by: Josef Melcr --- gcc/attr-callback.h | 382 +++ gcc/builtin-attrs.def| 14 + gcc/c-family/c-attribs.cc| 3 + gcc/cgraph.cc| 277 +++- gcc/cgraph.h | 42 +++ gcc/cgraphclones.cc | 3 + gcc/fortran/f95-lang.cc | 2 + gcc/ipa-cp.cc| 70 - gcc/ipa-fnsummary.cc | 24 +- gcc/ipa-inline-analysis.cc | 5 + gcc/ipa-inline-transform.cc | 12 +- gcc/ipa-inline.cc| 5 + gcc/ipa-param-manipulation.cc| 37 ++- gcc/ipa-param-manipulation.h | 2 +- gcc/ipa-prop.cc | 101 +- gcc/lto-cgraph.cc| 6 + gcc/omp-builtins.def | 26 +- gcc/testsuite/gcc.dg/ipa/ipcp-cb-spec1.c | 19 ++ gcc/testsuite/gcc.dg/ipa/ipcp-cb-spec2.c | 21 ++ gcc/testsuite/gcc.dg/ipa/ipcp-cb1.c | 25 ++ gcc/tree-core.h | 8 + gcc/tree-inline.cc
Re: [PATCH] ipa, cgraph: Enable constant propagation to OpenMP kernels
Hello, On 7/11/25 6:47 PM, Martin Jambor wrote: Hello, and sorry for a rather late reaction. First and foremost, thanks for the patch, I think it is great to have this finally implemented and although I would like to see a couple things changed, I think the patch is quite close to what could be committed to gcc master. After my first pass through the patch I have the following comments. I admit I have not looked into the issues with GOMP_task (though it looks like it should be just special-cased in the IPA passes but that can wait for later) and I do not have any strong opinion about what to do with the fact that clang already has the callback attribute (I guess I'd aim for a slightly different name for now). No need to apologize :) Regarding the special casing of GOMP_task, I chose this approach as it is simple and requires minimal changes to the rest of the logic. I can rework it to not use the attribute at all, keep it this way for now or exclude it and leave it for a later patch. Do you have a preference for any of these? On Sun, Apr 27 2025, Josef Melcr wrote: This patch enables constant propagation to outlined OpenMP kernels and improves support for optimizing callback functions in general. It implements the attribute 'callback' as found in clang, though argument numbering is a bit different, as described below. The title says OpenMP, but it can be used for any function which takes a callback argument, such as pthread functions, qsort and others. The attribute 'callback' captures the notion of a function calling one of its arguments with some of its parameters as arguments. An OpenMP example of such function is GOMP_parallel. We implement the attribute with new callgraph edges called 'callback' edges. They are imaginary edges pointing from the caller of the function with the attribute (e.g. caller of GOMP_parallel) to the body function itself (e.g. the outlined OpenMP body). They share their call statement with the edge from which they are derived (direct edge caller -> GOMP_parallel in this case). These edges allow passes such as ipa-cp to the see the hidden call site to the body function and optimize the function accordingly. To illustrate on an example, the body GOMP_parallel looks something like this: void GOMP_parallel (void (*fn) (void *), void *data, /* ... */) { /* ... */ fn (data); /* ... */ } If we extend it with the attribute 'callback(1, 2)', we express that the function calls its first argument and passes it its second argument. This is represented in the call graph in this manner: direct indirect caller -> GOMP_parallel ---> fn | --> fn callback The direct edge is then the parent edge, with all callback edges being the child edges. I know this will sound like nit-picking but could we avoid using the terms parent and children? It is not just that I don't think the names capture what the edges are, but mainly I believe that (after this gets in) people reading the code and comments will not immediately think about callbacks when they encounter mentions of parents and children and will be confused. I'd suggest to call the edges "callback-carrying" and "callback" instead but I am opened to other ideas too. I can see the names being confusing. It never occurred to me since I have spent so much time working with these names. I will switch to the names you suggested. While constant propagation is the main focus of this patch, callback edges can be useful for different passes (for example, it improves icf for OpenMP kernels), as they allow for address redirection. If the outlined body function gets optimized and cloned, from body_fn to body_fn.optimized, the callback edge allows us to replace the address in the arguments list: GOMP_parallel (body_fn, &data_struct, /* ... */); becomes GOMP_parallel (body_fn.optimized, &data_struct, /* ... */); This redirection is possible for any function with the attribute. This callback attribute implementation is partially compatible with clang's implementation. Its semantics, arguments and argument indexing style are the same, but we represent an unknown argument position with 0 (precedent set by attributes such as 'format'), while clang uses -1 or '?'. We also allow for multiple callback attributes on the same function, while clang only allows one. The attribute allows us to propagate constants into body functions of OpenMP constructs. Currently, GCC won't propagate the value 'c' into the OpenMP body in the following example: int a[100]; void test(int c) { #pragma omp parallel for for (int i = 0; i < c; i++) { if (!__builtin_constant_p(c)) { __builtin_abort(); } a[i] = i; } } int main() { test(100); return a[5] - 5; } With this pat
Re: [PATCH] ipa, cgraph: Enable constant propagation to OpenMP kernels
On 6/16/25 18:25, Jan Hubicka wrote: On Mon, Jun 16, 2025 at 05:49:19PM +0200, Jan Hubicka wrote: On Wed, Apr 30, 2025 at 08:56:57AM +0200, Jakub Jelinek wrote: On Mon, Apr 28, 2025 at 07:27:31PM +0200, Josef Melcr wrote: As for the attribute, I am honestly not too sure about what to do, as clang is not consistent in with its own indexing, be it with the unknown values, or with 'this'. I've tried to remain consistent with GCC's indexing style. I guess I'll leave up to you and the other maintainers to decide. I can implement clangs version 1:1, put the attribute in our namespace or rename it. I don't mind either way. Another option would be to patch clang to get in line with the rest of its attributes. It seems like the best option to me, as it would make being consistent way easier, but it would be problematic, as all code using this attribute would need to be updated. I'll talk to C/C++ FE maintainers what they think. No progress there so far. Could you perhaps split the attribute side of the patch off and submit something that only handles a few OpenMP/OpenACC builtins for now without that attribute, instead of testing for the attribute test for specific builtins? We can go similar direction as with fnspec. Calling it " callback" with the extra space so it is not user visible but can be used to annotate builtins and once we agree on user facing interface make it public. Sure, that works too. I just wanted to unblock the patch progress. That is a great idea indeed :) Josef, if possible, can you prepare a patch that modifies the name to " callback" and drops the doc/extend.texi documentation of it? We can then proceed with IPA bits. Honza Certainly :) I am sorry about my lack of activity in the last two months, I've been really busy, first with the thesis and now with studying for the state finals. I will hopefully wrap up these academic duties this Thursday and I will finally have the time to resume work on this patch. I will add the space then, drop the documentation, fix the formatting and also track down the segfault Jakub discovered. I will try to get these changes done as soon as possible. Regarding Jakub's earlier response about GOMP_task, I will probably drop the copyfn attribute for now and sort it out incrementally. You also mentioned that GOMP_task should be special-cased and not use the attribute at all, could you please help me understand the reasoning behind that? I am not quite sure why the attribute shouldn't be used in this case. I've also CC'd my personal email, as this is my university address and will lose access to it once I graduate. I am mentioning this now to avoid confusion later on. Best regards, Josef Melcr
[PATCH v3] ipa, cgraph: Enable constant propagation to OpenMP kernels
Hi, This is the 3rd version of this patch. Changes made since v2: - implemented Martin's suggestions, mostly regarding naming - unintentionally removed lines added back - unknown arguments are no longer represented with an identity to the callback-carrying edge, zeroed-out jump functions are used instead - supporting functions moved from attr-callback.h to a new .cc file - fixed a bug in cgraph_edge::set_call_stmt where the callback-carrying edge could be missed - callback_hash renamed to callback_id, index of the callback is now used in place of the hash - each function parameter can now have at most a single callback attribute GOMP_task remains special-cased in the same manner as in v2. Bootstrapped and regtested on x86_64-linux. Best regards, Josef Melcr gcc/ChangeLog: * Makefile.in: Add attr-callback.o to OBJS. * builtin-attrs.def (ATTR_CALLBACK): Callback attr identifier. (DEF_CALLBACK_ATTRIBUTE): Macro for callback attr creation. (GOMP): Attrs for libgomp functions. (OACC): Attrs for oacc functions. (ATTR_CALLBACK_GOMP_LIST): ATTR_NOTHROW_LIST with GOMP callback attr added. (ATTR_CALLBACK_OACC_LIST): ATTR_NOTHROW_LIST with OACC callback attr added. * cgraph.cc (cgraph_add_edge_to_call_site_hash): Always hash the callback-carrying edge. (cgraph_node::get_edge): Always return the callback-carrying edge. (cgraph_edge::set_call_stmt): Add cascade for callback edges, rename update_speculative to update_derived_edges. (symbol_table::create_edge): Allow callback edges to share call statements, initialize new flags. (cgraph_edge::make_callback): New method, derives a new callback edge. (cgraph_edge::get_callback_carrying_edge): New method. (cgraph_edge::first_callback_edge): Likewise. (cgraph_edge::next_callback_edge): Likewise. (cgraph_edge::purge_callback_edges): Likewise. (cgraph_edge::redirect_callee): When redirecting a callback edge, redirect its ref as well. (cgraph_edge::redirect_call_stmt_to_callee): Add callback edge redirection logic, set update_derived_edges to true if redirecting callback-carrying edge. (cgraph_node::remove_callers): Add cascade for callback edges. (cgraph_edge::dump_edge_flags): Add callback flag printing. (cgraph_node::verify_node): Add sanity checks for callback edges. * cgraph.h: Add new flags and 16 bit callback_id to cgraph_edge class. * cgraphclones.cc (cgraph_edge::clone): Copy over callback data. * ipa-cp.cc (purge_useless_callback_edges): New function, deletes callback edges when necessary. (ipcp_decision_stage): Call purge_useless_callback_edges. * ipa-fnsummary.cc (ipa_call_summary_t::duplicate): Add an exception for callback edges. (analyze_function_body): Copy summary from callback-carrying to callback edge. * ipa-inline-analysis.cc (do_estimate_growth_1): Skip callback edges when estimating growth. * ipa-inline-transform.cc (inline_transform): Add redirection cascade for callback edges. * ipa-inline.cc (can_inline_edge_p): Never inline callback edges. * ipa-param-manipulation.cc (drop_decl_attribute_if_params_changed_p): New function. (ipa_param_adjustments::build_new_function_type): Add args_modified out parameter. (ipa_param_adjustments::adjust_decl): Drop callback attrs when modifying args. * ipa-param-manipulation.h: Change decl of build_new_function_type. * ipa-prop.cc (ipa_duplicate_jump_function): Add declaration. (init_callback_edge_summary): New function. (ipa_compute_jump_functions_for_edge): Add callback edge creation logic. * lto-cgraph.cc (lto_output_edge): Stream out callback data. (input_edge): Input callback data. * omp-builtins.def (BUILT_IN_GOACC_PARALLEL): Use new attr list. (BUILT_IN_GOMP_PARALLEL_LOOP_STATIC): Likewise. (BUILT_IN_GOMP_PARALLEL_LOOP_GUIDED): Likewise. (BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_DYNAMIC): Likewise. (BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_RUNTIME): Likewise. (BUILT_IN_GOMP_PARALLEL): Likewise. (BUILT_IN_GOMP_PARALLEL_SECTIONS): Likewise. (BUILT_IN_GOMP_TEAMS_REG): Likewise. * tree-core.h (ECF_CB_1_2): New constant for callback(1,2). (ECF_CB_2_4): New constant for callback(2,4). * tree-inline.cc (copy_bb): Copy callback edges when copying callback-carrying edge. (redirect_all_calls): Redirect callback edges. * tree.cc (set_call_expr_flags): Create callback attrs according to the ECF_CB constants. * attr-callback.cc: New file. * attr-callback.h: New file. gcc/c-f
[PING^2][PATCH v3] ipa, cgraph: Enable constant propagation to OpenMP kernels
Hi, a gentle ping for this patch: https://gcc.gnu.org/pipermail/gcc-patches/2025-July/691227.html Best regards, Josef Melcr
Re: [PATCH v3] ipa, cgraph: Enable constant propagation to OpenMP kernels
it. +/* Initializes ipa_edge_args summary of CBE given its callback-carrying edge. + This primarily means allocating the correct amount of jump functions. */ + +static inline void +init_callback_edge_summary (struct cgraph_edge *carrying, + struct cgraph_edge *cbe) +{ + ipa_edge_args *carrying_args = ipa_edge_args_sum->get (carrying); + ipa_edge_args *cb_args = ipa_edge_args_sum->get_create (cbe); + vec_safe_grow_cleared (cb_args->jump_functions, +carrying_args->jump_functions->length (), true); why is carrying_args->jump_functions->length () the correct number of jump functions? You need the lenght of the vector callback_get_arg_mapping would return, no? (I'm not suggesting that we construct such a vector here but I think the length needs to be computed from the attribute and not from the number of parameters the dispatching function has.) You are right, that should be calculated from the attribute. It works because the dispatching function in OpenMP/OpenACC usually takes more arguments than the callback. The mistake looks to be isolated to this place, I am using the correct number of jump functions in the rest of the code. The doc comment above the function is pretty ironic though :) Generally, this version is a clear improvement over the previous versions and (with at least the last issue described above fixed), I'd be happy to have the cgraph and ipa bits committed. Someone else needs to approve the rest though - and Honza may want to chime in. I also hope that soon (at Cauldron?) we'll be able to come to some kind of conclusion what to do about the attribute name and behavior. That's what I am hoping for. I'll be giving a short talk about this patch and its future improvements at Cauldron on Sunday with the hopes of coming up with at least some general direction for the attribute. We may also consider enabling the new behavior with a compiler switch (enabled by default at -O2 and higher). But that is also a detail. I don't think that's currently needed, as the new edges are bound to ipa-cp, which is enabled at -O2. Though that might change in the future, we might add some functionality not specific to any pass. Thanks a lot again for working on this. Martin Best regards, Josef Melcr
[PATCH v4] ipa, cgraph: Enable constant propagation to OpenMP kernels
Hi, this is the fourth version of this patch (v3: https://gcc.gnu.org/pipermail/gcc-patches/2025-July/691227.html). Changes since v3: - "offloading" replaced with "callback-dispatching". - Outdated and missing docs fixed. - Superfluous braces removed. - Fixed bug where the number of args of the dispatching function was used instead of the callback's. - Early return removed from cgraph_add_edge_to_call_site_hash, as it seems to be unnecessary. Only comment I've left unaddressed is about droping callback edges when removing the attribute. Removing the edges at that point feels like doing too much stuff in one place, though it can be changed incrementally. Unrelated to this revision, in case somebody missed it, a consensus around the attribute has been reached at Cauldron 2025, the attribute will be renamed to 'callback_only' at some point in the future. Bootstrapped and regtested on x86_64-linux. Best regards, Josef Melcr gcc/ChangeLog: * Makefile.in: Add attr-callback.o to OBJS. * builtin-attrs.def (ATTR_CALLBACK): Callback attr identifier. (DEF_CALLBACK_ATTRIBUTE): Macro for callback attr creation. (GOMP): Attrs for libgomp functions. (OACC): Attrs for oacc functions. (ATTR_CALLBACK_GOMP_LIST): ATTR_NOTHROW_LIST with GOMP callback attr added. (ATTR_CALLBACK_OACC_LIST): ATTR_NOTHROW_LIST with OACC callback attr added. * cgraph.cc (cgraph_add_edge_to_call_site_hash): Always hash the callback-carrying edge. (cgraph_node::get_edge): Always return the callback-carrying edge. (cgraph_edge::set_call_stmt): Add cascade for callback edges, rename update_speculative to update_derived_edges. (symbol_table::create_edge): Allow callback edges to share call statements, initialize new flags. (cgraph_edge::make_callback): New method, derives a new callback edge. (cgraph_edge::get_callback_carrying_edge): New method. (cgraph_edge::first_callback_edge): Likewise. (cgraph_edge::next_callback_edge): Likewise. (cgraph_edge::purge_callback_edges): Likewise. (cgraph_edge::redirect_callee): When redirecting a callback edge, redirect its ref as well. (cgraph_edge::redirect_call_stmt_to_callee): Add callback edge redirection logic, set update_derived edges to true when redirecting callback-carrying edge. (cgraph_node::remove_callers): Add cascade for callback edges. (cgraph_edge::dump_edge_flags): Add callback flag printing. (cgraph_node::verify_node): Add sanity checks for callback edges. * cgraph.h: Add new flags and 16 bit callback_id to cgraph_edge class. * cgraphclones.cc (cgraph_edge::clone): Copy over callback data. * ipa-cp.cc (purge_useless_callback_edges): New function, deletes callback edges when necessary. (ipcp_decision_stage): Call purge_useless_callback_edges. * ipa-fnsummary.cc (ipa_call_summary_t::duplicate): Add an exception for callback edges. (analyze_function_body): Copy summary from carrying to callback edge. * ipa-inline-analysis.cc (do_estimate_growth_1): Skip callback edges when estimating growth. * ipa-inline-transform.cc (inline_transform): Add redirection cascade for callback edges. * ipa-inline.cc (can_inline_edge_p): Never inline callback edges. * ipa-param-manipulation.cc (drop_decl_attribute_if_params_changed_p): New function. (ipa_param_adjustments::build_new_function_type): Add args_modified out parameter. (ipa_param_adjustments::adjust_decl): Drop callback attrs when modifying args. * ipa-param-manipulation.h: Change decl of build_new_function_type. * ipa-prop.cc (ipa_duplicate_jump_function): Add declaration. (init_callback_edge_summary): New function. (ipa_compute_jump_functions_for_edge): Add callback edge creation logic. * lto-cgraph.cc (lto_output_edge): Stream out callback data. (input_edge): Input callback data. * omp-builtins.def (BUILT_IN_GOACC_PARALLEL): Use new attr list. (BUILT_IN_GOMP_PARALLEL_LOOP_STATIC): Likewise. (BUILT_IN_GOMP_PARALLEL_LOOP_GUIDED): Likewise. (BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_DYNAMIC): Likewise. (BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_RUNTIME): Likewise. (BUILT_IN_GOMP_PARALLEL): Likewise. (BUILT_IN_GOMP_PARALLEL_SECTIONS): Likewise. (BUILT_IN_GOMP_TEAMS_REG): Likewise. * tree-core.h (ECF_CB_1_2): New constant for callback(1,2). (ECF_CB_2_4): New constant for callback(2,4). * tree-inline.cc (copy_bb): Copy callback edges when copying the carrying edge. (redirect_all_calls): Redirect callback edges.
Re: [PATCH] Add support for [[gnu::trivial_abi]] attribute
Hi, I am fairly new myself, but I think I can answer your questions. On 10/8/25 20:03, Yuxuan Chen wrote: 1. For the legal requirements, most likely I will need to use the Developer Certificate of Origin. Does that mean I will just need to do `git commit --signoff`? As far as I am aware, yes, that should be it. 2. Are there any easy ways to format the code? Due to the unfamiliarity I have with the GNU style, conforming to the coding standards takes a long time. For example, my editor is not well set up to take a mixture of tabs and spaces. Is it typical for contributors to manually format the code or is there a script I am missing? There are scripts in contrib that verify your code is formatted correctly. There are also clang-format and editorconfig files, if your editor supports them. Not sure about how other developers format their code, but I use the clang-format config and it works just fine. 3. I tried to run all existing tests with `make && make -C gcc -k check` and there are a small quantity of existing (unexpected) failures. Is that an indication that I don't have the right setup? Not necessarily, it's normal for the testsuite to fail. They way you verify your patches is by running the testsuite once with no changes, then running it again with your patch applied and comparing the results, for example by diffing the outputs of the test_summary script in contrib. 4. For subsequent submissions after these modifications, can I just compute a new patch and attach it in a reply-all to this email chain? What people usually do is they start a new thread for each version of the patch (hence the v2, v3,... you can see in the subjects of some emails sent to this mailing list). You should also document what changes you've made since the last version you sent. Yuxuan Best regards, Josef smime.p7s Description: S/MIME Cryptographic Signature