On Tue, Dec 05, 2006 at 09:06:13PM +0100, I (Basile) wrote in http://gcc.gnu.org/ml/gcc/2006-12/msg00158.html
> I want to have a GTY() garbage collected structure such that, when it
> is destroyed, some specific routine is called (this should indeed be
> possible, since GGC is a mark& sweep garbage collector, which delete
> individually each dead data).

Laurynas Biveinis explained in
http://gcc.gnu.org/ml/gcc/2006-12/msg00165.html

> "Deletable" just sets the pointer to NULL on garbage collection, in
> practice making it a weak pointer.

Daniel Berlin commented in
http://gcc.gnu.org/ml/gcc/2006-12/msg00174.html

> We don't have support for user-specifiable destruction routines, one
> reason being that some day, in a galaxy far far away, we will have
> something better of a garbage collector, or not use garbage collection

I'm not sure I understand what Daniel suggests. If he dreams of better
memory handling than the current GGC, I certainly agree; I actually
dream of a future GCC compiler where all data is garbage collected in a
copying generational scheme (see my Qish experiment). This would
require some preprocessor or perhaps even some language support, so I
realize that it is currently impractical. I won't discuss details now
(but suggest diving into Jones & Lins' book on garbage collection);
still, I would call such futuristic memory handling garbage collection.

If Daniel means that the very idea of garbage collection in a compiler
is bad, and that every object should be manually allocated & explicitly
freed (à la malloc & free, or like C++ new/delete), I respectfully
disagree with him. (BTW I must admit here that I have some Ocaml
experience.)

Zack Weinberg wrote in http://gcc.gnu.org/ml/gcc/2006-12/msg00159.html

> We definitely don't have the feature you want now, and I would be
> very hesitant to try to add it - the existing sweep phase is quite
> lazy, and I'd really prefer not to do anything that made it harder
> to switch to a more efficient collector algorithm.
>
> On the other hand, I sympathize with your goal; I've been idly
> thinking a little myself about the sensibility of using MPFR
> throughout, instead of our bespoke REAL_VALUE_TYPE thing. [I don't
> know if this is actually a good idea yet.]

I presume that Zack refers to a comment in gcc/fold-const.c (rev 119546
of trunk), where I read

  /*@@ This file should be rewritten to use an arbitrary precision
    @@ representation for "struct tree_int_cst" and "struct tree_real_cst".

My understanding is that constant folding is currently done in ad-hoc
(two-word) arithmetic, and that the trend is to move to arbitrary
precision arithmetic using MPFR & GMP (which seems to be needed not
only for Fortran). Since the constants live inside Gimple-like trees
(even if you represent them by tuples), I expect them to be garbage
collected, so their MPFR or GMP data needs to be freed when they die.

> So my question to you is, what do those destruction routines do, and
> is are they actually a necessary thing if the memory has been
> allocated by GGC rather than library-internal calls to malloc()?

If the libraries we are using (today MPFR & GMP; tomorrow, on my side,
probably PPL, through its C interface only, since I am interested in
time-consuming static analysis) do not offer internal memory hooks but
only allocate & delete (or clear) routines, then I still believe that
many of us would benefit from GTY structures which have a destructor.

By destructor I mean a routine which is called by the (mark & sweep)
garbage collector in the sweep phase, just before freeing the dead
object. I do not imply any C++ sense of the word destructor.

More precisely, I propose the following to add support for destroyable
objects to the GGC collector.

Add to ggc.h the following:

  /* typedef for ggc destructors, called by the GC just before
     destroying a dead object; the destructor is not expected to make
     any GGC allocation or free */
  /* Explicitly destroy some internal state in a GGC-ed pointer.
     Few objects are allocated with such a destructor. */
  typedef void (*gt_pointer_destroy) (void *);

  /* allocate an object with an explicit destructor; their size should
     be suitably small (eg less than 250*sizeof(void*)) because the
     bulk of their content is elsewhere */
  extern void *ggc_allocate_destroyable (size_t MEM_STAT_DECL,
                                         gt_pointer_destroy destr);
  #define ggc_alloc_destr(s,d) ggc_allocate_destroyable (s MEM_STAT_INFO, d)

I intend to handle any destroyable object as a structure containing the
destructor routine, its mark (*), some suitable gap, and the object
content proper. Pointers to the object point (as always) into the
content, but finding its destructor is easy. Of course, there is some
overhead for each destroyable object (perhaps 2 or 4 machine words),
but I expect such objects to be uncommon and rather small (since the
bulk of their content is elsewhere).

Note (*): I don't know yet where their mark should be.

In ggc-page.c I think of coding the following:

  Keep all destroyable objects in a separate bag of pages and handle
  their allocation appropriately there.

  In ggc_collect, before the call to poison_pages, add a call which
  runs the destructor of every unmarked destroyable object.

  etc...

In ggc-zone.c I think of coding the following:

  manage a special alloc_zone destroy_zone for destroyable objects and
  have a struct destr_page_entry

  hack sweep_page to handle these destr_page_entry

  etc...

I don't have all the details right now, but I am asking for your
comments. I do depend on the current GC implementations being of the
mark & sweep variety (but Boehm's GC can be used with similar
destructors).

I believe that:

1. adding some support for destroyable objects is of interest not only
to me, but to others (eg those wanting to add arbitrary precision
constants in garbage collected trees).

2. given that destroyable objects are rare (w.r.t. others), performance
should not suffer much

3. the convenience of such destroyable objects could be useful in the
future

4.
for the future, having every dynamically allocated object contain a
common descriptive pointer or prefix (like in the Glib of GTK, the
vtable in C++, or the implementation of values in Ocaml) would permit
more freedom in various GC implementations, and in addition is valuable
for debugging and other purposes. (IMHO the tuple effort is going
towards this, but I am not sure whether it defines a principle that
every GCC datum should start with a common prefix; in object oriented
parlance, I would prefer a single tree of inheritance, not a forest of
independent hierarchies).

Point 4 is not needed by me now, but it could help, and I see a lot of
other reasons why it could be useful (e.g. debugging, dumping, ...).

So please be kind enough to comment.

Regards.
-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faïencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
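
P.S. The header layout proposed above (destructor, mark, some gap, then
the content proper) could be sketched in plain C roughly as follows.
This is only an illustration under assumptions, not GGC code: malloc
and free stand in for the GGC page allocator, a linked list stands in
for the separate bag of pages, and the names destr_header, destr_alloc,
destr_mark, destr_sweep and demo_destroy are hypothetical. Only the
gt_pointer_destroy typedef comes from the proposal itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same signature as the proposed gt_pointer_destroy typedef.  */
typedef void (*gt_pointer_destroy) (void *);

/* Hypothetical prefix stored just before the content.  The union pads
   the header to the strictest alignment (the "suitable gap"), so the
   content that follows it is correctly aligned.  Needs C11.  */
typedef union destr_header
{
  struct
  {
    gt_pointer_destroy destr;   /* run just before the object is freed */
    union destr_header *next;   /* chain of all destroyable objects */
    int marked;                 /* set by the mark phase */
  } h;
  max_align_t align;
} destr_header;

static destr_header *destr_all; /* stand-in for the bag of pages */

/* Allocate SIZE bytes of content; the result points into the content.  */
void *
destr_alloc (size_t size, gt_pointer_destroy destr)
{
  destr_header *hd = malloc (sizeof (destr_header) + size);
  if (hd == NULL)
    return NULL;
  hd->h.destr = destr;
  hd->h.marked = 0;
  hd->h.next = destr_all;
  destr_all = hd;
  return hd + 1;
}

/* Find the header, hence the destructor, from a content pointer.  */
static destr_header *
destr_header_of (void *p)
{
  assert (p != NULL);
  return (destr_header *) p - 1;
}

/* Mark phase stand-in: flag the object as reachable.  */
void
destr_mark (void *p)
{
  destr_header_of (p)->h.marked = 1;
}

/* Sweep: destroy then free every unmarked object, and clear the mark
   of the survivors.  Returns the number of destroyed objects.  */
int
destr_sweep (void)
{
  int count = 0;
  destr_header **link = &destr_all;
  while (*link != NULL)
    {
      destr_header *hd = *link;
      if (hd->h.marked)
        {
          hd->h.marked = 0;
          link = &hd->h.next;
        }
      else
        {
          *link = hd->h.next;
          if (hd->h.destr != NULL)
            hd->h.destr (hd + 1);
          free (hd);
          count++;
        }
    }
  return count;
}

/* Hypothetical destructor: the content is a single pointer to storage
   owned by some external library, released here with free.  */
int demo_destroyed;

void
demo_destroy (void *p)
{
  free (*(void **) p);
  demo_destroyed++;
}
```

An object would be created with destr_alloc (sizeof (void *),
demo_destroy), its content filled with the externally allocated
pointer, and destr_sweep would then release both the external storage
and the object itself once the object is no longer marked. With a
library such as MPFR, the destructor would call the library's clear
routine instead of free.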