Re: [RFC] Return Value Propagation in IPA-CP

Jan Hubicka via Gcc Sun, 15 Sep 2024 05:34:56 -0700

> Ping (https://gcc.gnu.org/pipermail/gcc/2024-August/244625.html).
Hi,
> > * Proposed Solution *
> > 
> > * Extending IPA-CP:
> > 
> > 1. Add return jump function data structures
> >    - This involves updating ipa_node_params to contain information 
> > regarding the
> >      return statements of the function, namely the lattice and the jump 
> > function
> >      describing the return value, where both use existing data structures.
> >    - The ipa_node_params class is reused to avoid introducing a new class 
> > and a
> >      corresponding function_summary type, though it is trivial to add if 
> > deemed
> >      a better solution. The ipa_return_value_summary class may also be a 
> > good
> >      place for this information, however using it would involve moving it 
> > from
> >      ipa-prop.cc to ipa-prop.h.


Yes, I think the jump functions for return values are not realy
different from jump functions for call parameters, so indeed we want to
reuse same datatstructure.
We want to track all the usual stuff we do (value ranges, constant
values, known fields in structures...)

In longer run we will want to extend jump functions to represent
properties like "value returned by call X is passed as argument n to
call Y". But for the first version we probably do not need to worry
about this.
> >    - Additionally, ipa_edge_args is updated to track whether or not it is a
> >      callgraph edge originating from a return statement. This enables the
> >      propagation of information in the WPA phase.

What you exactly mean by callgraph edge originating from return
statement?
> > 
> > 2. LGEN
> >    - Set up return jump functions for each function similar to the parameter
> >      jump function. When it cannot be conclusively determined that one 
> > formal
> >      parameter is always returned (such as conditionally returning either of
> >      two), the jump function is marked as invalid.
> >    - This involves looking through phi nodes when return statements of
> >      conditional branches are merged into a single exit block. Note that
> >      returning a call expression does not count towards invalidating the
> >      information for that function.

The analysis is essentially the same as for function parameters, just
looking at the return statements, so I think most of code can be shared
with what we have.
> > 
> > 3. WPA
> >    - Implement return value information propagation. It is not possible to 
> > do
> >      this in the existing propagate_constants_topo function, because it 
> > iterates
> >      in reverse postorder i.e. from caller to callee. However return value
> >      propagation needs to be done from callee to caller, thus there is a 
> > need to
> >      iterate in a postorder fashion.
> >    - The lattice values are initialized using the jump functions computed 
> > in the
> >      LGEN phase. These values are then propagated over the callgraph.
> >    - The existing lattices are reused, with three possible values like 
> > usual.
> >      The possible values are:
> > 
> >        return_lattice -> { TOP, param_decl, BOTTOM }
> > 
> >      where TOP and BOTTOM are defined as usual. param_decl refers to the 
> > tree of
> >      either the parameter declaration or its type, whichever is available. 
> > The
> >      meet operator is defined as usual for TOP and BOTTOM. When both are
> >      param_decl, the following meet operation is defined:
> > 
> >        meet(x, y) = x if x == y, BOTTOM otherwise
> > 
> >    - Finally, nodes to which no information could be propagated are marked 
> > as
> >      BOTTOM and the BOTTOM value is propagated to a fixed-point. This is 
> > required
> >      when information about a caller has been inferred but not about one of 
> > its
> >      callees. For example:
> > 
> >    extern int foo(int);
> > 
> >    __attribute__((noinline))
> >    int bar (int z)
> >    {
> >      return foo (z);
> >    }
> > 
> >    __attribute__((noinline))
> >    int baz (int z)
> >    {
> >      return z;
> >    }
> > 
> >    int quux (int z)
> >    {
> >      if (z & 1)
> >        return bar (z);
> >      else
> >        return baz (z);
> >    }
> > 
> >    In this case, after propagation, the lattice values look like the 
> > following:
> >    - bar/ret  ->  TOP
> >    - baz/ret  ->  baz/0  (first parameter being returned)
> >    - quux/ret ->  quux/0 (lattice partially inferred from baz)
> > 
> >    Here, we are not able to infer any information about bar because it 
> > calls a
> >    function foo that is not visible to the compiler. If we only set bar to
> >    BOTTOM and do not propagate this BOTTOM value to quux, we will 
> > incorrectly
> >    infer that quux always returns its 1st parameter.

Yep, it is essentially a simple dataflow.  propagate_constants_topo
can probably be still saved by simply considering calls with interesting
return functions as bi-directional for discovery of strongly connected
component.
Somewhat tricky will be implementing cloning heuristics, but that is
also a next step.
> > 
> > 4. LTRANS
> >    - The computed information is used to set a field denoting the returned
> >      formal parameter in the ipcp_transformation class. This is streamed 
> > out to
> >      bitcode at the end of WPA and read back in the LTRANS phase.
> >    - This information is currently used to annotate functions with the "fn 
> > spec"
> >      internal annotation in a prototype patch. This is only a stop-gap 
> > measure
> >      given the limitations that exist on "fn spec" (detailed in the next
> >      section).
> > 
> > Consider the following example:
> > 
> > __attribute__((noinline))
> > static int f3 (int x, int y)
> > {
> >    use (y);
> >    return x;
> > }
> > 
> > int f2 (int x, int y)
> > {
> >    return f3 (x, y);
> > }
> > 
> > After LGEN, the jump functions will look like the following:
> > - f3/ret -> f3/0    (f3 returns the 0th parameter, i.e. x)
> > - f2/ret -> unknown (but not invalid, because it returns a call)
> > 
> > Therefore, the lattices are initialized at the start of WPA in the following
> > manner:
> > - f3/lat -> 0
> > - f2/lat -> TOP
> > 
> > Simple propagation gives f2/lat the final value of 0. After LTRANS, with the
> > current patch, the code would end up like the following:
> > 
> > __attribute__((fn spec ("1 "), noinline))
> > static int f3 (int x, int y)
> > {
> >    use (y);
> >    return x;
> > }
> > 
> > __attribute__((fn spec ("1 ")))
> > int f2 (int x, int y)
> > {
> >    return f3 (x, y);
> > }
> > 
> > * Other Options *
> > 
> > * ipa-pure-const pass:
> > 
> > The "ipa-pure-const" pass can also be used to implement this analysis by 
> > using
> > the same local analysis and storing inferred information in the function
> > declaration using the "fn spec" annotation. However, this is quite 
> > restrictive
> > because it does not allow handling cases such as "return x + 1;", where the
> > formal parameter has an additional binary operation with a constant operand.

ipa-prop and ipa-cp is the right place to handle this problem.
> > 
> > The "fn spec" annotation is also quite limited as it only allows tracking
> > returns of up to the 4th formal parameter of the function.
> > 
> > * Existing Solutions *
> > 
> > Patch 53ba8d669550d3a1f809048428b97ca607f95cf5 introduces a new class,
> > ipa_return_value_summary, to store range information of the return value and
> > extends the tree-vrp pass to store inferred information in this class. This 
> > is
> > then used in fold_using_range. However, this only works within a single
> > translation unit.

Yep, it is useful to have the local propagation since there are cases we
can optimize before IPA passes cheaply, but we do want to have full
IPA-CP implementation at WPA stage too.

We can probably put the results to ipa_return_value_summary and make it
similar to summaries produced by ipa-prop.
> > 
> > - Does this sound reasonable?

Yes, it would be really nice to have this.
> > - Would it be better to reuse ipa_node_params, ipa_return_value_summary, or
> >    introduce a new class entirely?

I would try to share datastructures since we will be adding more info
over time.
> > - What are other optimizations that could be extended with this information?

ipa-prop propagates aggreagete values that are interesting here too.
Also polymorphic call contextes can probably be handled.
> > - Could the return value range information be improved in any other way?
> > - Should build_int_cst be used to create the value stored in the lattice,
> >    instead of the tree corresponding to the parameter?
> > - Should a new class be created to correspond to return jump functions,
> >    something like ipa_return_jump_func?
> > - Should a constant jump function (instead of a pass-through one) be used to
> >    represent cases where the returned parameter is known?

How the return jump function would differ from the class we have for
(argument) jump functions?

Honza
> > 
> > -- 
> > Regards,
> > Dhruv
> > 
> 
> -- 
> Regards,
> Dhruv
>

Re: [RFC] Return Value Propagation in IPA-CP

Reply via email to