On Mon, 2013-08-05 at 06:59 -1000, Richard Henderson wrote:
> On 08/05/2013 05:18 AM, David Malcolm wrote:
> > So I *think* the most efficient traversal is to do this first (with a
> > suitable comment):
> > 
> >  for (int i = passes_by_id_size ; i > 0; )
> >    ::gt_ggc_mx (passes_by_id[--i]);
> > 
> > That ought to visit all of the passes without triggering recursion
> > (unless someone does something weird to the pass structure in a plugin).
> 
> Reasonable.
> 
> > I'm nervous about omitting the traversal for other pointers the class
> > has (though I *think* the passes by id array captures it all) so for
> > completeness I'd prefer to then to also do it for all of:
> > 
> >  ::gt_ggc_mx (all_passes);
> >  ::gt_ggc_mx (all_small_ipa_passes);
> >  ::gt_ggc_mx (all_lowering_passes);
> >  ::gt_ggc_mx (all_regular_ipa_passes);
> >  ::gt_ggc_mx (all_lto_gen_passes);
> >  ::gt_ggc_mx (all_late_ipa_passes);
> > 
> > #define INSERT_PASSES_AFTER(PASS)
> > #define PUSH_INSERT_PASSES_WITHIN(PASS)
> > #define POP_INSERT_PASSES()
> > #define NEXT_PASS(PASS, NUM) ::gt_ggc_mx (PASS ## _ ## NUM);
> > #define TERMINATE_PASS_LIST()
> > 
> > #include "pass-instances.def"
> > 
> > #undef INSERT_PASSES_AFTER
> > #undef PUSH_INSERT_PASSES_WITHIN
> > #undef POP_INSERT_PASSES
> > #undef NEXT_PASS
> > #undef TERMINATE_PASS_LIST
> > 
> > Is it standard in handwritten GC code to omit traversals based on this
> > higher-level knowledge?  Presumably it warrants a source comment?
> > (having spent too much time lately debugging PCH issues I'm nervous
> > about trying to optimize this).
> 
> It's something that we should think about when doing any kind of GC.
> The deep recursion has bitten us before, and lead to the chain_next
> and chain_prev annotations for GTY.

Note to self: the chain_next and chain_prev options are documented at:
http://gcc.gnu.org/onlinedocs/gccint/GTY-Options.html

BTW, this issue seems very reminiscent to me of an implementation detail
within the CPython interpreter called the "trashcan".  CPython uses
reference-counting, and if you have a deep chain of references e.g.:
  # make a very deep chain of list-of-lists:
  foo = []
  for i in range(depth):
     foo = [foo]
  # at this point we have
  # foo = [[[[[[[[[[[[[[[[[[[[[[[[[[[[ \
  #       ]]]]]]]]]]]]]]]]]]]]]]]]]]]] for some depth

  # try to overflow the stack during deallocation:
  del foo

...you would have a deeply nested chain of decrefs, each triggering
deleting of the object, and hence a potential stack overflow.

The way CPython avoids such deep reference chains from killing the
process with a stack overflow is a pair of macros used in the
deallocator for container types, which track the depth of the traversal,
and when it reaches a certain threshold, start appending objects to a
singly-linked list of deferred deallocations (using a spare pointer in
the objects that's safe to use during finalization).  Hence the
recursion becomes an iteration when a certain stack depth is reached,
and diabolical reference graphs can't blow the stack (modulo bugs in
3rd-party container code, of course).

The equivalent for GC would be for mx routines to change from:

  if (ggc_test_and_set_mark (p))
    gt_ggc_mx (p);

to:

  if (ggc_test_and_set_mark (p))
    add_to_worklist (p, marking_cb);

I suspect that this technique can't be used in GGC since it would
require (a) having a spare pointer per-object and (b) tracking the type
of the marking callback, which would be a jump through a function ptr
where there wasn't one before, and thus unacceptable in the general
case.

> > AIUI we could do similar optimizations in the PCH object-noting hook,
> > but can't do similar optimizations in the PCH *op* hook, since every
> > field needs to reconstructed when reading the data back from disk.
> 
> Correct.

I'll update the patch with the optimizations you suggest (and the
reverse-order marking I mentioned above).

Thanks
Dave

Reply via email to