On Wed, Jun 12, 2024 at 11:57 AM Hanke Zhang <hkzhang...@gmail.com> wrote: > > Richard Biener <richard.guent...@gmail.com> 于2024年5月24日周五 14:39写道: > > > > On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc <gcc@gcc.gnu.org> wrote: > > > > > > Hi, > > > I got a question about optimizing function pointers for direct > > > function calls in C. > > > > > > Consider the following scenario: one of the fields of a structure is a > > > function pointer, and all its assignments come from the same function. > > > Can all its uses be replaced by direct calls to this function? So the > > > later passes can do more optimizations. > > > > > > Here is the example: > > > > > > int add(int a, int b) { return a + b; } > > > int sub(int a, int b) { return a - b; } > > > > > > struct Foo { > > > int (*add)(int, int); > > > }; > > > int main() > > > { > > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > > > for (int i = 0; i < 5; i++) { > > > foo[i].add = add; > > > } > > > > > > int sum = 0; > > > for (int i = 0; i < 5; i++) { > > > sum += foo[i].add(1, 2); > > > } > > > > > > return 0; > > > } > > > > > > Can I replace the code above to the code below? > > > > > > int add(int a, int b) { return a + b; } > > > int sub(int a, int b) { return a - b; } > > > > > > struct Foo { > > > int (*add)(int, int); > > > }; > > > int main() > > > { > > > struct Foo foo[5] = malloc(sizeof(struct Foo) * 5); > > > > > > for (int i = 0; i < 5; i++) { > > > foo[i].add = add; > > > } > > > > > > int sum = 0; > > > for (int i = 0; i < 5; i++) { > > > sum += add(1,2); > > > } > > > > > > return 0; > > > } > > > > > > My idea is to determine whether the assignment of the field is the > > > same function, and if so, perform the replacement. > > > > If it's as simple as above then sure, even CSE should do it. If you > > can prove otherwise the memory location with the function pointer > > always has the same value you are obviously fine. If you just > > do not see any other store via 'add's FIELD_DECL then no, that > > isn't good enough. Every void * store you do not know where it > > goes might go to that slot. > > > > > Of course this is not a reasonable optimization, I just want to know > > > if there are security issues in doing so, and if I want to do it in > > > the IPA stage, is it possible? > > > > For the more general case you can do what we do for speculative > > devirtualization - replace the code with > > > > sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2); > > Hi Richard, > > I'm trying to do what you suggested. (sum += foo[i].add == add ? add > (1,2) : foo[i].add(1,2);) > > I created a new IPA-Pass before IPA-INLINE and I made the changes on > the GIMPLE in the "function_transform" stage. But my newly created > "foo[i].add(1,2)" seems to fail to be inlined. And it was optimized > out in the subsequent FRE. I would like to ask if there is any way to > mark my newly created cgraph_edge as "inline" in the > "function_transform" stage. > > Here is part of the code creating the call_stmt in true basic_block in > my IPA-PASS: > > // ... > // part of the code have been omitted > auto_vec<tree> params; > for (int i = 0; i < gimple_call_num_args (call_stmt); i++) { > params.safe_push (gimple_call_arg (call_stmt, i)); > } > gimple* tbb_call = gimple_build_call_vec (fndecl, params); /// = fn() > tree tbb_ssa; > if (ret_val_flag) { > tbb_ssa = make_ssa_name (TREE_TYPE (gimple_call_lhs(call_stmt)), NULL); > gimple_call_set_lhs (tbb_call, tbb_ssa); /// _2 = fn() > } > gsi = gsi_start_bb (tbb); > gsi_insert_before (&gsi, tbb_call, GSI_SAME_STMT); > cgraph_edge* tbb_callee = node->create_edge (cgraph_node::get_create > (fndecl), (gcall*)tbb_call, tbb_call->bb->count, true); > // what should I do to this cgraph_edge to mark it to be inlined > // ... > > Or should I not do these things in the function_transform stage?
That's indeed too late. You have to create speculative call edges, I would suggest to look how IPA devirtualization handles this case and possibly simply amend it. I'm CCing Honza who should know more about this, I do not know the details. Richard. > Thanks > Hanke Zhang > > > > > > that way we can inline the direct call and hopefully the branch will be > > well predicted. > > > > In some SPEC there is IIRC the situation where such speculative > > devirtualization candidates can be found solely based on function > > signature. With LTO/IPA you'd basically collect candidate targets > > for each indirect call (possibly address-taken function definitions > > with correct signature) and if there's just a single one you can > > choose that as speculative devirt target. > > > > Speculative devirt as we have now of course works with profile > > data to identify the most probable candidate. > > > > Richard. > > > > > > > > Thanks > > > Hanke Zhang