> > * LTO and memory management > This is a general question about LTO and memory management. > IIUC the following sequence takes place during normal LTO: > LGEN: generate_summary, write_summary > WPA: read_summary, execute ipa passes, write_opt_summary > > So I assumed it was OK in LGEN to allocate return_callees_map in > generate_summary and free it in write_summary and during WPA, allocate > return_callees_map in read_summary and free it after execute (since > write_opt_summary does not require return_callees_map). > > However with fat LTO, it seems the sequence changes for LGEN with > execute phase takes place after write_summary. However since > return_callees_map is freed in pure_const_write_summary and > propagate_malloc() accesses it in execute stage, it results in > segmentation fault. > > To work around this, I am using the following hack in > pure_const_write_summary: > // FIXME: Do not free if -ffat-lto-objects is enabled. > if (!global_options.x_flag_fat_lto_objects) > free_return_callees_map (); > Is there a better approach for handling this ?
I think most passes just do not free summaries with -flto. We probably want to fix it to make it possible to compile multiple units i.e. from plugin by adding release_summaries method... So I would say it is OK to do the same as others do and leak it with -flto. > diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c > index e457166ea39..724c26e03f6 100644 > --- a/gcc/ipa-pure-const.c > +++ b/gcc/ipa-pure-const.c > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree-scalar-evolution.h" > #include "intl.h" > #include "opts.h" > +#include "ssa.h" > > /* Lattice values for const and pure functions. Everything starts out > being const, then may drop to pure and then neither depending on > @@ -69,6 +70,15 @@ enum pure_const_state_e > > const char *pure_const_names[3] = {"const", "pure", "neither"}; > > +enum malloc_state_e > +{ > + PURE_CONST_MALLOC_TOP, > + PURE_CONST_MALLOC, > + PURE_CONST_MALLOC_BOTTOM > +}; It took me a while to work out what PURE_CONST means here :) I would just call it something like STATE_MALLOC_TOP... or so. ipa_pure_const is outdated name from the time pass was doing only those two. > @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state; > > static vec<funct_state> funct_state_vec; > > +/* A map from node to subset of callees. The subset contains those callees > + * whose return-value is returned by the node. */ > +static hash_map< cgraph_node *, vec<cgraph_node *>* > *return_callees_map; > + Hehe, a special case of return jump function. We ought to support those more generally. How do you keep it up to date over callgraph changes? > @@ -921,6 +1055,23 @@ end: > if (TREE_NOTHROW (decl)) > l->can_throw = false; > > + if (ipa) > + { > + vec<cgraph_node *> v = vNULL; > + l->malloc_state = PURE_CONST_MALLOC_BOTTOM; > + if (DECL_IS_MALLOC (decl)) > + l->malloc_state = PURE_CONST_MALLOC; > + else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v)) > + { > + l->malloc_state = PURE_CONST_MALLOC_TOP; > + vec<cgraph_node *> *callees_p = new vec<cgraph_node *> (vNULL); > + for (unsigned i = 0; i < v.length (); ++i) > + callees_p->safe_push (v[i]); > + return_callees_map->put (fn, callees_p); > + } > + v.release (); > + } > + I would do non-ipa variant, too. I think most attributes can be detected that way as well. The patch generally makes sense to me. It would be nice to make it easier to write such a basic propagators across callgraph (perhaps adding a template doing the basic propagation logic). Also I think you need to solve the problem with keeping your summaries up to date across callgraph node removal and duplications. Honza