Richard Biener <richard.guent...@gmail.com> 于2024年5月24日周五 14:39写道: > > On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc <gcc@gcc.gnu.org> wrote: > > > > Hi, > > I got a question about optimizing function pointers for direct > > function calls in C. > > > > Consider the following scenario: one of the fields of a structure is a > > function pointer, and all its assignments come from the same function. > > Can all its uses be replaced by direct calls to this function? So the > > later passes can do more optimizations. > > > > Here is the example: > > > > int add(int a, int b) { return a + b; } > > int sub(int a, int b) { return a - b; } > > > > struct Foo { > > int (*add)(int, int); > > }; > > int main() > > { > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > for (int i = 0; i < 5; i++) { > > foo[i].add = add; > > } > > > > int sum = 0; > > for (int i = 0; i < 5; i++) { > > sum += foo[i].add(1, 2); > > } > > > > return 0; > > } > > > > Can I replace the code above to the code below? > > > > int add(int a, int b) { return a + b; } > > int sub(int a, int b) { return a - b; } > > > > struct Foo { > > int (*add)(int, int); > > }; > > int main() > > { > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > for (int i = 0; i < 5; i++) { > > foo[i].add = add; > > } > > > > int sum = 0; > > for (int i = 0; i < 5; i++) { > > sum += add(1,2); > > } > > > > return 0; > > } > > > > My idea is to determine whether the assignment of the field is the > > same function, and if so, perform the replacement. > > If it's as simple as above then sure, even CSE should do it. If you > can prove otherwise the memory location with the function pointer > always has the same value you are obviously fine. If you just > do not see any other store via 'add's FIELD_DECL then no, that > isn't good enough. Every void * store you do not know where it > goes might go to that slot. > > > Of course this is not a reasonable optimization, I just want to know > > if there are security issues in doing so, and if I want to do it in > > the IPA stage, is it possible? > > For the more general case you can do what we do for speculative > devirtualization - replace the code with > > sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2);
Hi Richard, I'm trying to do what you suggested. (sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2);) I created a new IPA-Pass before IPA-INLINE and I made the changes on the GIMPLE in the "function_transform" stage. But my newly created "foo[i].add(1,2)" seems to fail to be inlined. And it was optimized out in the subsequent FRE. I would like to ask if there is any way to mark my newly created cgraph_edge as "inline" in the "function_transform" stage. Here is part of the code creating the call_stmt in true basic_block in my IPA-PASS: // ... // part of the code have been omitted auto_vec<tree> params; for (int i = 0; i < gimple_call_num_args (call_stmt); i++) { params.safe_push (gimple_call_arg (call_stmt, i)); } gimple* tbb_call = gimple_build_call_vec (fndecl, params); /// = fn() tree tbb_ssa; if (ret_val_flag) { tbb_ssa = make_ssa_name (TREE_TYPE (gimple_call_lhs(call_stmt)), NULL); gimple_call_set_lhs (tbb_call, tbb_ssa); /// _2 = fn() } gsi = gsi_start_bb (tbb); gsi_insert_before (&gsi, tbb_call, GSI_SAME_STMT); cgraph_edge* tbb_callee = node->create_edge (cgraph_node::get_create (fndecl), (gcall*)tbb_call, tbb_call->bb->count, true); // what should I do to this cgraph_edge to mark it to be inlined // ... Or should I not do these things in the function_transform stage? Thanks Hanke Zhang > > that way we can inline the direct call and hopefully the branch will be > well predicted. > > In some SPEC there is IIRC the situation where such speculative > devirtualization candidates can be found solely based on function > signature. With LTO/IPA you'd basically collect candidate targets > for each indirect call (possibly address-taken function definitions > with correct signature) and if there's just a single one you can > choose that as speculative devirt target. > > Speculative devirt as we have now of course works with profile > data to identify the most probable candidate. > > Richard. > > > > > Thanks > > Hanke Zhang