On 1 September 2017 at 08:09, Prathamesh Kulkarni
<prathamesh.kulka...@linaro.org> wrote:
> On 17 August 2017 at 18:02, Prathamesh Kulkarni
> <prathamesh.kulka...@linaro.org> wrote:
>> On 8 August 2017 at 09:50, Prathamesh Kulkarni
>> <prathamesh.kulka...@linaro.org> wrote:
>>> On 31 July 2017 at 23:53, Prathamesh Kulkarni
>>> <prathamesh.kulka...@linaro.org> wrote:
>>>> On 23 May 2017 at 19:10, Prathamesh Kulkarni
>>>> <prathamesh.kulka...@linaro.org> wrote:
>>>>> On 19 May 2017 at 19:02, Jan Hubicka <hubi...@ucw.cz> wrote:
>>>>>>>
>>>>>>> * LTO and memory management
>>>>>>> This is a general question about LTO and memory management.
>>>>>>> IIUC the following sequence takes place during normal LTO:
>>>>>>> LGEN: generate_summary, write_summary
>>>>>>> WPA: read_summary, execute ipa passes, write_opt_summary
>>>>>>>
>>>>>>> So I assumed it was OK in LGEN to allocate return_callees_map in
>>>>>>> generate_summary and free it in write_summary and during WPA, allocate
>>>>>>> return_callees_map in read_summary and free it after execute (since
>>>>>>> write_opt_summary does not require return_callees_map).
>>>>>>>
>>>>>>> However with fat LTO, it seems the sequence changes for LGEN with
>>>>>>> execute phase takes place after write_summary. However since
>>>>>>> return_callees_map is freed in pure_const_write_summary and
>>>>>>> propagate_malloc() accesses it in execute stage, it results in
>>>>>>> segmentation fault.
>>>>>>>
>>>>>>> To work around this, I am using the following hack in
>>>>>>> pure_const_write_summary:
>>>>>>> // FIXME: Do not free if -ffat-lto-objects is enabled.
>>>>>>> if (!global_options.x_flag_fat_lto_objects)
>>>>>>>   free_return_callees_map ();
>>>>>>> Is there a better approach for handling this ?
>>>>>>
>>>>>> I think most passes just do not free summaries with -flto. We probably want
>>>>>> to fix it to make it possible to compile multiple units i.e. from plugin by
>>>>>> adding release_summaries method...
>>>>>> So I would say it is OK to do the same as others do and leak it with
>>>>>> -flto.
>>>>>>> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
>>>>>>> index e457166ea39..724c26e03f6 100644
>>>>>>> --- a/gcc/ipa-pure-const.c
>>>>>>> +++ b/gcc/ipa-pure-const.c
>>>>>>> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3. If not see
>>>>>>>  #include "tree-scalar-evolution.h"
>>>>>>>  #include "intl.h"
>>>>>>>  #include "opts.h"
>>>>>>> +#include "ssa.h"
>>>>>>>
>>>>>>>  /* Lattice values for const and pure functions. Everything starts out
>>>>>>>     being const, then may drop to pure and then neither depending on
>>>>>>> @@ -69,6 +70,15 @@ enum pure_const_state_e
>>>>>>>
>>>>>>>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>>>>>>>
>>>>>>> +enum malloc_state_e
>>>>>>> +{
>>>>>>> +  PURE_CONST_MALLOC_TOP,
>>>>>>> +  PURE_CONST_MALLOC,
>>>>>>> +  PURE_CONST_MALLOC_BOTTOM
>>>>>>> +};
>>>>>>
>>>>>> It took me a while to work out what PURE_CONST means here :)
>>>>>> I would just call it something like STATE_MALLOC_TOP... or so.
>>>>>> ipa_pure_const is outdated name from the time pass was doing only
>>>>>> those two.
>>>>>>> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>>>>>>>
>>>>>>>  static vec<funct_state> funct_state_vec;
>>>>>>>
>>>>>>> +/* A map from node to subset of callees. The subset contains those callees
>>>>>>> + * whose return-value is returned by the node. */
>>>>>>> +static hash_map< cgraph_node *, vec<cgraph_node *>* > *return_callees_map;
>>>>>>> +
>>>>>>
>>>>>> Hehe, a special case of return jump function. We ought to support those
>>>>>> more generally.
>>>>>> How do you keep it up to date over callgraph changes?
>>>>>>> @@ -921,6 +1055,23 @@ end:
>>>>>>>    if (TREE_NOTHROW (decl))
>>>>>>>      l->can_throw = false;
>>>>>>>
>>>>>>> +  if (ipa)
>>>>>>> +    {
>>>>>>> +      vec<cgraph_node *> v = vNULL;
>>>>>>> +      l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
>>>>>>> +      if (DECL_IS_MALLOC (decl))
>>>>>>> +        l->malloc_state = PURE_CONST_MALLOC;
>>>>>>> +      else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
>>>>>>> +        {
>>>>>>> +          l->malloc_state = PURE_CONST_MALLOC_TOP;
>>>>>>> +          vec<cgraph_node *> *callees_p = new vec<cgraph_node *> (vNULL);
>>>>>>> +          for (unsigned i = 0; i < v.length (); ++i)
>>>>>>> +            callees_p->safe_push (v[i]);
>>>>>>> +          return_callees_map->put (fn, callees_p);
>>>>>>> +        }
>>>>>>> +      v.release ();
>>>>>>> +    }
>>>>>>> +
>>>>>>
>>>>>> I would do non-ipa variant, too. I think most attributes can be
>>>>>> detected that way as well.
>>>>>>
>>>>>> The patch generally makes sense to me. It would be nice to make it
>>>>>> easier to write such a basic propagators across callgraph (perhaps
>>>>>> adding a template doing the basic propagation logic). Also I think you
>>>>>> need to solve the problem with keeping your summaries up to date across
>>>>>> callgraph node removal and duplications.
>>>>> Thanks for the suggestions, I will try to address them in a follow-up
>>>>> patch.
>>>>> IIUC, I would need to modify ipa-pure-const cgraph hooks -
>>>>> add_new_function, remove_node_data, duplicate_node_data
>>>>> to keep return_callees_map up-to-date across callgraph node insertions
>>>>> and removal ?
>>>>>
>>>>> Also, if instead of having a separate data-structure like
>>>>> return_callees_map,
>>>>> should we rather have a flag within cgraph_edge, which marks that the
>>>>> caller may return the value of the callee ?
>>>> Hi,
>>>> Sorry for the very late response. I have attached an updated version
>>>> of the prototype patch,
>>>> which adds a non-ipa variant, and keeps return_callees_map up-to-date
>>>> across callgraph node insertions and removal.
>>>> For the non-ipa variant, malloc_candidate_p() additionally checks
>>>> that all the "return callees" have DECL_IS_MALLOC set to true.
>>>> Bootstrapped+tested and LTO bootstrapped+tested on
>>>> x86_64-unknown-linux-gnu.
>>>> Does it look OK so far ?
>>>>
>>>> Um sorry for this silly question, but I don't really understand how
>>>> does indirect call propagation work in ipa-pure-const ? For example
>>>> consider propagation of nothrow attribute in following test-case:
>>>>
>>>> __attribute__((noinline, noclone, nothrow))
>>>> int f1(int k) { return k; }
>>>>
>>>> __attribute__((noinline, noclone))
>>>> static int foo(int (*p)(int))
>>>> {
>>>>   return p(10);
>>>> }
>>>>
>>>> __attribute__((noinline, noclone))
>>>> int bar(void)
>>>> {
>>>>   return foo(f1);
>>>> }
>>>>
>>>> Shouldn't foo and bar be also marked as nothrow ?
>>>> Since foo indirectly calls f1 which is nothrow and bar only calls foo ?
>>>> The local-pure-const2 dump shows function is locally throwing for
>>>> "foo" and "bar".
>>>>
>>>> Um, I was wondering how to get "points-to" analysis for function-pointers,
>>>> to get list of callees that may be indirectly called from that
>>>> function pointer ?
>>>> In the patch I just set node to bottom if it contains indirect calls
>>>> which is far from ideal :(
>>>> I would be much grateful for suggestions on how to handle indirect calls.
>>>> Thanks!
>>> ping https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html
>> ping * 2 https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html
> ping * 3 https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html
ping * 4 https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html
Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
>>
>> Thanks,
>> Prathamesh
>>>
>>> Thanks,
>>> Prathamesh
>>>>
>>>> Regards,
>>>> Prathamesh
>>>>>
>>>>> Thanks,
>>>>> Prathamesh
>>>>>>
>>>>>> Honza
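
Illustrative sketch (not taken from the posted patch): the thread describes
a three-value malloc lattice and a map from each function to the callees
whose return value it returns. The standalone C++ below shows one way an
optimistic propagation over such a map could look. Everything in it is an
assumption made for illustration: the STATE_MALLOC_* names follow the
renaming suggested in the review, and func_info, call_graph,
propagate_malloc's body and example_graph are hypothetical stand-ins, not
GCC's cgraph API. Functions with indirect calls are simply lowered to
bottom, mirroring the conservative choice mentioned in the thread.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical lattice; names follow the STATE_MALLOC_* suggestion
// from the review and are not actual GCC identifiers.
enum malloc_state
{
  STATE_MALLOC_TOP,     // candidate: only returns values returned by callees
  STATE_MALLOC,         // known malloc-like (think DECL_IS_MALLOC)
  STATE_MALLOC_BOTTOM   // known not malloc-like
};

// Toy stand-in for a cgraph node plus its return_callees_map entry.
struct func_info
{
  malloc_state state;
  bool has_indirect_call;                   // treated conservatively below
  std::vector<std::string> return_callees;  // callees whose result is returned
};

typedef std::map<std::string, func_info> call_graph;

// Optimistic fixed-point propagation: a candidate is lowered to bottom
// as soon as one of its return callees is bottom (or unknown).
void
propagate_malloc (call_graph &g)
{
  // Assumption: a candidate containing indirect calls goes to bottom.
  for (auto &entry : g)
    if (entry.second.has_indirect_call
        && entry.second.state == STATE_MALLOC_TOP)
      entry.second.state = STATE_MALLOC_BOTTOM;

  bool changed = true;
  while (changed)
    {
      changed = false;
      for (auto &entry : g)
        {
          func_info &f = entry.second;
          if (f.state != STATE_MALLOC_TOP)
            continue;
          for (const std::string &name : f.return_callees)
            {
              call_graph::const_iterator it = g.find (name);
              // A callee missing from the graph is unknown: assume the worst.
              if (it == g.end ()
                  || it->second.state == STATE_MALLOC_BOTTOM)
                {
                  f.state = STATE_MALLOC_BOTTOM;
                  changed = true;
                  break;
                }
            }
        }
    }

  // Candidates never lowered only return malloc-like values.
  for (auto &entry : g)
    if (entry.second.state == STATE_MALLOC_TOP)
      entry.second.state = STATE_MALLOC;
}

// Small made-up graph used to exercise the sketch.
call_graph
example_graph ()
{
  call_graph g;
  g["my_alloc"] = func_info { STATE_MALLOC, false, {} };
  g["helper"] = func_info { STATE_MALLOC_BOTTOM, false, {} };
  g["wrap1"] = func_info { STATE_MALLOC_TOP, false, { "my_alloc" } };
  g["wrap2"] = func_info { STATE_MALLOC_TOP, false, { "wrap1" } };
  g["mixed"] = func_info { STATE_MALLOC_TOP, false, { "my_alloc", "helper" } };
  g["via_ptr"] = func_info { STATE_MALLOC_TOP, true, { "my_alloc" } };
  return g;
}
```

With the example graph, wrap1 and wrap2 end up malloc-like because every
value they return traces back to my_alloc, while mixed and via_ptr are
lowered to bottom. Being optimistic, the sketch would also accept a cycle
made purely of candidates; whether that is desirable is a design question
the thread leaves open.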