On 1 September 2017 at 08:09, Prathamesh Kulkarni
<prathamesh.kulka...@linaro.org> wrote:
> On 17 August 2017 at 18:02, Prathamesh Kulkarni
> <prathamesh.kulka...@linaro.org> wrote:
>> On 8 August 2017 at 09:50, Prathamesh Kulkarni
>> <prathamesh.kulka...@linaro.org> wrote:
>>> On 31 July 2017 at 23:53, Prathamesh Kulkarni
>>> <prathamesh.kulka...@linaro.org> wrote:
>>>> On 23 May 2017 at 19:10, Prathamesh Kulkarni
>>>> <prathamesh.kulka...@linaro.org> wrote:
>>>>> On 19 May 2017 at 19:02, Jan Hubicka <hubi...@ucw.cz> wrote:
>>>>>>>
>>>>>>> * LTO and memory management
>>>>>>> This is a general question about LTO and memory management.
>>>>>>> IIUC the following sequence takes place during normal LTO:
>>>>>>> LGEN: generate_summary, write_summary
>>>>>>> WPA: read_summary, execute ipa passes, write_opt_summary
>>>>>>>
>>>>>>> So I assumed it was OK in LGEN to allocate return_callees_map in
>>>>>>> generate_summary and free it in write_summary and during WPA, allocate
>>>>>>> return_callees_map in read_summary and free it after execute (since
>>>>>>> write_opt_summary does not require return_callees_map).
>>>>>>>
>>>>>>> However with fat LTO, it seems the sequence changes for LGEN, with the
>>>>>>> execute phase taking place after write_summary. Since
>>>>>>> return_callees_map is freed in pure_const_write_summary and
>>>>>>> propagate_malloc() accesses it in the execute stage, this results in
>>>>>>> a segmentation fault.
>>>>>>>
>>>>>>> To work around this, I am using the following hack in 
>>>>>>> pure_const_write_summary:
>>>>>>> // FIXME: Do not free if -ffat-lto-objects is enabled.
>>>>>>> if (!global_options.x_flag_fat_lto_objects)
>>>>>>>   free_return_callees_map ();
>>>>>>> Is there a better approach for handling this?
>>>>>>
>>>>>> I think most passes just do not free summaries with -flto.  We probably
>>>>>> want to fix it to make it possible to compile multiple units, i.e. from
>>>>>> a plugin, by adding a release_summaries method...
>>>>>> So I would say it is OK to do the same as others do and leak it with
>>>>>> -flto.
>>>>>>> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
>>>>>>> index e457166ea39..724c26e03f6 100644
>>>>>>> --- a/gcc/ipa-pure-const.c
>>>>>>> +++ b/gcc/ipa-pure-const.c
>>>>>>> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>>>>>>>  #include "tree-scalar-evolution.h"
>>>>>>>  #include "intl.h"
>>>>>>>  #include "opts.h"
>>>>>>> +#include "ssa.h"
>>>>>>>
>>>>>>>  /* Lattice values for const and pure functions.  Everything starts out
>>>>>>>     being const, then may drop to pure and then neither depending on
>>>>>>> @@ -69,6 +70,15 @@ enum pure_const_state_e
>>>>>>>
>>>>>>>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>>>>>>>
>>>>>>> +enum malloc_state_e
>>>>>>> +{
>>>>>>> +  PURE_CONST_MALLOC_TOP,
>>>>>>> +  PURE_CONST_MALLOC,
>>>>>>> +  PURE_CONST_MALLOC_BOTTOM
>>>>>>> +};
>>>>>>
>>>>>> It took me a while to work out what PURE_CONST means here :)
>>>>>> I would just call it something like STATE_MALLOC_TOP... or so.
>>>>>> ipa_pure_const is an outdated name from the time the pass was doing only
>>>>>> those two.
>>>>>>> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>>>>>>>
>>>>>>>  static vec<funct_state> funct_state_vec;
>>>>>>>
>>>>>>> +/* A map from node to subset of callees. The subset contains those callees
>>>>>>> + * whose return-value is returned by the node. */
>>>>>>> +static hash_map< cgraph_node *, vec<cgraph_node *>* > *return_callees_map;
>>>>>>> +
>>>>>>
>>>>>> Hehe, a special case of return jump function.  We ought to support those
>>>>>> more generally.
>>>>>> How do you keep it up to date over callgraph changes?
>>>>>>> @@ -921,6 +1055,23 @@ end:
>>>>>>>    if (TREE_NOTHROW (decl))
>>>>>>>      l->can_throw = false;
>>>>>>>
>>>>>>> +  if (ipa)
>>>>>>> +    {
>>>>>>> +      vec<cgraph_node *> v = vNULL;
>>>>>>> +      l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
>>>>>>> +      if (DECL_IS_MALLOC (decl))
>>>>>>> +     l->malloc_state = PURE_CONST_MALLOC;
>>>>>>> +      else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
>>>>>>> +     {
>>>>>>> +       l->malloc_state = PURE_CONST_MALLOC_TOP;
>>>>>>> +       vec<cgraph_node *> *callees_p = new vec<cgraph_node *> (vNULL);
>>>>>>> +       for (unsigned i = 0; i < v.length (); ++i)
>>>>>>> +         callees_p->safe_push (v[i]);
>>>>>>> +       return_callees_map->put (fn, callees_p);
>>>>>>> +     }
>>>>>>> +      v.release ();
>>>>>>> +    }
>>>>>>> +
>>>>>>
>>>>>> I would do a non-ipa variant, too.  I think most attributes can be
>>>>>> detected that way as well.
>>>>>>
>>>>>> The patch generally makes sense to me.  It would be nice to make it
>>>>>> easier to write such basic propagators across the callgraph (perhaps by
>>>>>> adding a template doing the basic propagation logic). Also I think you
>>>>>> need to solve the problem of keeping your summaries up to date across
>>>>>> callgraph node removal and duplications.
>>>>> Thanks for the suggestions, I will try to address them in a follow-up
>>>>> patch.
>>>>> IIUC, I would need to modify the ipa-pure-const cgraph hooks
>>>>> (add_new_function, remove_node_data, duplicate_node_data)
>>>>> to keep return_callees_map up-to-date across callgraph node insertions
>>>>> and removal?
>>>>>
>>>>> Also, instead of having a separate data structure like
>>>>> return_callees_map, should we rather have a flag within cgraph_edge
>>>>> which marks that the caller may return the value of the callee?
>>>> Hi,
>>>> Sorry for the very late response. I have attached an updated version
>>>> of the prototype patch, which adds a non-ipa variant and keeps
>>>> return_callees_map up-to-date across callgraph node insertions and
>>>> removal. For the non-ipa variant, malloc_candidate_p() additionally
>>>> checks that all the "return callees" have DECL_IS_MALLOC set to true.
>>>> Bootstrapped+tested and LTO bootstrapped+tested on
>>>> x86_64-unknown-linux-gnu.
>>>> Does it look OK so far?
>>>>
>>>> Sorry for this silly question, but I don't really understand how
>>>> indirect call propagation works in ipa-pure-const. For example,
>>>> consider propagation of the nothrow attribute in the following
>>>> test-case:
>>>>
>>>> __attribute__((noinline, noclone, nothrow))
>>>> int f1(int k) { return k; }
>>>>
>>>> __attribute__((noinline, noclone))
>>>> static int foo(int (*p)(int))
>>>> {
>>>>   return p(10);
>>>> }
>>>>
>>>> __attribute__((noinline, noclone))
>>>> int bar(void)
>>>> {
>>>>   return foo(f1);
>>>> }
>>>>
>>>> Shouldn't foo and bar also be marked as nothrow,
>>>> since foo indirectly calls f1, which is nothrow, and bar only calls foo?
>>>> The local-pure-const2 dump shows function is locally throwing for
>>>> "foo" and "bar".
>>>>
>>>> I was wondering how to get "points-to" analysis for function pointers,
>>>> to get the list of callees that may be indirectly called through that
>>>> function pointer.
>>>> In the patch I just set the node to bottom if it contains indirect calls,
>>>> which is far from ideal :(
>>>> I would be very grateful for suggestions on how to handle indirect calls.
>>>> Thanks!
>>> ping https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html
>> ping * 2 https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html
> ping * 3 https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html
ping * 4 https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
>>
>> Thanks,
>> Prathamesh
>>>
>>> Thanks,
>>> Prathamesh
>>>>
>>>> Regards,
>>>> Prathamesh
>>>>>
>>>>> Thanks,
>>>>> Prathamesh
>>>>>>
>>>>>> Honza
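
For readers following the thread, the malloc lattice and return_callees_map
discussed above can be illustrated with a small standalone sketch. It is not
the code from the patch and uses no GCC internals: MallocState, StateMap,
ReturnCallees and propagate_malloc_sketch are hypothetical names, nodes are
plain strings standing in for cgraph_node pointers, and the lowering rule (a
candidate drops to bottom as soon as one callee whose value it returns is
bottom, and whatever is still at top after the fixed point is treated as
malloc) is an assumption about how such a propagator could work.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Three-point lattice mirroring the enum quoted in the thread (names shortened).
enum class MallocState { Top, Malloc, Bottom };

using Node = std::string;  // hypothetical stand-in for cgraph_node *
using StateMap = std::map<Node, MallocState>;
// Node -> callees whose return value the node returns.
using ReturnCallees = std::map<Node, std::vector<Node>>;

// Lower optimistic Top nodes until nothing changes any more.
// Assumption: one Bottom (or unknown) return callee forces the caller to
// Bottom; nodes still at Top once the fixed point is reached become Malloc.
inline void propagate_malloc_sketch (StateMap &state,
                                     const ReturnCallees &callees)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (auto &entry : state)
        {
          if (entry.second != MallocState::Top)
            continue;
          auto it = callees.find (entry.first);
          if (it == callees.end ())
            continue;
          for (const Node &callee : it->second)
            {
              auto s = state.find (callee);
              // Callees without a recorded state are handled conservatively.
              if (s == state.end () || s->second == MallocState::Bottom)
                {
                  entry.second = MallocState::Bottom;
                  changed = true;
                  break;
                }
            }
        }
    }
  for (auto &entry : state)
    if (entry.second == MallocState::Top)
      entry.second = MallocState::Malloc;
}
```

With this rule a wrapper that only returns the result of a malloc-like
function ends up as Malloc, while anything that may return the value of a
Bottom function is lowered, including callers further up the chain.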
