On Wed, Jun 12, 2024 at 11:57 AM Hanke Zhang <hkzhang...@gmail.com> wrote:
>
> Richard Biener <richard.guent...@gmail.com> 于2024年5月24日周五 14:39写道:
> >
> > On Fri, May 24, 2024 at 5:53 AM Hanke Zhang via Gcc <gcc@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > > I got a question about optimizing function pointers for direct
> > > function calls in C.
> > >
> > > Consider the following scenario: one of the fields of a structure is a
> > > function pointer, and all its assignments come from the same function.
> > > Can all its uses be replaced by direct calls to this function? So the
> > > later passes can do more optimizations.
> > >
> > > Here is the example:
> > >
> > > int add(int a, int b) { return a + b; }
> > > int sub(int a, int b) { return a - b; }
> > >
> > > struct Foo {
> > >     int (*add)(int, int);
> > > };
> > > int main()
> > > {
> > >     struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
> > >
> > >     for (int i = 0; i < 5; i++) {
> > >         foo[i].add = add;
> > >     }
> > >
> > >     int sum = 0;
> > >     for (int i = 0; i < 5; i++) {
> > >         sum += foo[i].add(1, 2);
> > >     }
> > >
> > >     return 0;
> > > }
> > >
> > > Can I replace the code above to the code below?
> > >
> > > int add(int a, int b) { return a + b; }
> > > int sub(int a, int b) { return a - b; }
> > >
> > > struct Foo {
> > >     int (*add)(int, int);
> > > };
> > > int main()
> > > {
> > >     struct Foo foo[5] = malloc(sizeof(struct Foo) * 5);
> > >
> > >     for (int i = 0; i < 5; i++) {
> > >         foo[i].add = add;
> > >     }
> > >
> > >     int sum = 0;
> > >     for (int i = 0; i < 5; i++) {
> > >         sum += add(1,2);
> > >     }
> > >
> > >     return 0;
> > > }
> > >
> > > My idea is to determine whether the assignment of the field is the
> > > same function, and if so, perform the replacement.
> >
> > If it's as simple as above then sure, even CSE should do it.  If you
> > can prove otherwise the memory location with the function pointer
> > always has the same value you are obviously fine.  If you just
> > do not see any other store via 'add's FIELD_DECL then no, that
> > isn't good enough.  Every void * store you do not know where it
> > goes might go to that slot.
> >
> > > Of course this is not a reasonable optimization, I just want to know
> > > if there are security issues in doing so, and if I want to do it in
> > > the IPA stage, is it possible?
> >
> > For the more general case you can do what we do for speculative
> > devirtualization - replace the code with
> >
> >       sum += foo[i].add == add ? add (1,2) : foo[i].add(1,2);
>
> Hi Richard,
>
> I'm trying to do what you suggested. (sum += foo[i].add == add ? add
> (1,2) : foo[i].add(1,2);)
>
> I created a new IPA-Pass before IPA-INLINE and I made the changes on
> the GIMPLE in the "function_transform" stage. But my newly created
> "foo[i].add(1,2)" seems to fail to be inlined. And it was optimized
> out in the subsequent FRE. I would like to ask if there is any way to
> mark my newly created cgraph_edge as "inline" in the
> "function_transform" stage.
>
> Here is part of the code creating the call_stmt in true basic_block in
> my IPA-PASS:
>
> // ...
> // part of the code have been omitted
> auto_vec<tree> params;
> for (int i = 0; i < gimple_call_num_args (call_stmt); i++) {
>   params.safe_push (gimple_call_arg (call_stmt, i));
> }
> gimple* tbb_call = gimple_build_call_vec (fndecl, params); /// = fn()
> tree tbb_ssa;
> if (ret_val_flag) {
>   tbb_ssa = make_ssa_name (TREE_TYPE (gimple_call_lhs(call_stmt)), NULL);
>   gimple_call_set_lhs (tbb_call, tbb_ssa); /// _2 = fn()
> }
> gsi = gsi_start_bb (tbb);
> gsi_insert_before (&gsi, tbb_call, GSI_SAME_STMT);
> cgraph_edge* tbb_callee = node->create_edge (cgraph_node::get_create
> (fndecl), (gcall*)tbb_call, tbb_call->bb->count, true);
> // what should I do to this cgraph_edge to mark it to be inlined
> // ...
>
> Or should I not do these things in the function_transform stage?

That's indeed too late.  You have to create speculative call edges,
I would suggest to look how IPA devirtualization handles this case
and possibly simply amend it.  I'm CCing Honza who should know
more about this, I do not know the details.

Richard.

> Thanks
> Hanke Zhang
>
>
> >
> > that way we can inline the direct call and hopefully the branch will be
> > well predicted.
> >
> > In some SPEC there is IIRC the situation where such speculative
> > devirtualization candidates can be found solely based on function
> > signature.  With LTO/IPA you'd basically collect candidate targets
> > for each indirect call (possibly address-taken function definitions
> > with correct signature) and if there's just a single one you can
> > choose that as speculative devirt target.
> >
> > Speculative devirt as we have now of course works with profile
> > data to identify the most probable candidate.
> >
> > Richard.
> >
> > >
> > > Thanks
> > > Hanke Zhang

Reply via email to