On Mon, 2013-08-05 at 06:59 -1000, Richard Henderson wrote: > On 08/05/2013 05:18 AM, David Malcolm wrote: > > So I *think* the most efficient traversal is to do this first (with a > > suitable comment): > > > > for (int i = passes_by_id_size ; i > 0; ) > > ::gt_ggc_mx (passes_by_id[--i]); > > > > That ought to visit all of the passes without triggering recursion > > (unless someone does something weird to the pass structure in a plugin). > > Reasonable. > > > I'm nervous about omitting the traversal for other pointers the class > > has (though I *think* the passes by id array captures it all) so for > > completeness I'd prefer to then to also do it for all of: > > > > ::gt_ggc_mx (all_passes); > > ::gt_ggc_mx (all_small_ipa_passes); > > ::gt_ggc_mx (all_lowering_passes); > > ::gt_ggc_mx (all_regular_ipa_passes); > > ::gt_ggc_mx (all_lto_gen_passes); > > ::gt_ggc_mx (all_late_ipa_passes); > > > > #define INSERT_PASSES_AFTER(PASS) > > #define PUSH_INSERT_PASSES_WITHIN(PASS) > > #define POP_INSERT_PASSES() > > #define NEXT_PASS(PASS, NUM) ::gt_ggc_mx (PASS ## _ ## NUM); > > #define TERMINATE_PASS_LIST() > > > > #include "pass-instances.def" > > > > #undef INSERT_PASSES_AFTER > > #undef PUSH_INSERT_PASSES_WITHIN > > #undef POP_INSERT_PASSES > > #undef NEXT_PASS > > #undef TERMINATE_PASS_LIST > > > > Is it standard in handwritten GC code to omit traversals based on this > > higher-level knowledge? Presumably it warrants a source comment? > > (having spent too much time lately debugging PCH issues I'm nervous > > about trying to optimize this). > > It's something that we should think about when doing any kind of GC. > The deep recursion has bitten us before, and lead to the chain_next > and chain_prev annotations for GTY.
Note to self: the chain_next and chain_prev options are documented at: http://gcc.gnu.org/onlinedocs/gccint/GTY-Options.html BTW, this issue seems very reminiscent to me of an implementation detail within the CPython interpreter called the "trashcan". CPython uses reference-counting, and if you have a deep chain of references e.g.: # make a very deep chain of list-of-lists: foo = [] for i in range(depth): foo = [foo] # at this point we have # foo = [[[[[[[[[[[[[[[[[[[[[[[[[[[[ \ # ]]]]]]]]]]]]]]]]]]]]]]]]]]]] for some depth # try to overflow the stack during deallocation: del foo ...you would have a deeply nested chain of decrefs, each triggering deleting of the object, and hence a potential stack overflow. The way CPython avoids such deep reference chains from killing the process with a stack overflow is a pair of macros used in the deallocator for container types, which track the depth of the traversal, and when it reaches a certain threshold, start appending objects to a singly-linked list of deferred deallocations (using a spare pointer in the objects that's safe to use during finalization). Hence the recursion becomes an iteration when a certain stack depth is reached, and diabolical reference graphs can't blow the stack (modulo bugs in 3rd-party container code, of course). The equivalent for GC would be for mx routines to change from: if (ggc_test_and_set_mark (p)) gt_ggc_mx (p); to: if (ggc_test_and_set_mark (p)) add_to_worklist (p, marking_cb); I suspect that this technique can't be used in GGC since it would require (a) having a spare pointer per-object and (b) tracking the type of the marking callback, which would be a jump through a function ptr where there wasn't one before, and thus unacceptable in the general case. > > AIUI we could do similar optimizations in the PCH object-noting hook, > > but can't do similar optimizations in the PCH *op* hook, since every > > field needs to reconstructed when reading the data back from disk. > > Correct. I'll update the patch with the optimizations you suggest (and the reverse-order marking I mentioned above). Thanks Dave