Leopold Toetsch wrote: > > On Jul 16, 2005, at 2:24, Nattfodd wrote: > >> >> Hi, >> I've produced a new document on GMC (Generational Mark & Compact), the >> GC I'm trying to implement as a Summer of Code project. It's called gmc >> for dummies and I hope it's plainly understandable (if not, tell me so >> and I'll try to make it better). It should explain how things will >> hopefully work. >> >> It's here : >> http://perso.ens-lyon.fr/alexandre.buisse/divers/gmc_for_dummies.pod > > > This is not quite the scheme, we were pondering in previous mail and IRC. > > 1) pmc_bodies have to be variable sized
Oh, I believed that we would use variable-sized pmc only if the gc proved to work really well. > 2) the scheme is missing the invariant we keep: > C<< &younger_body < &older_body >> > > ad 1) > > a PMC is a 2 word structure { PMC_BODY *pmc_body; VTABLE *vtable } > a pmc_body of Integer PMC is eventually just { INTVAL int_val }, of a > Float PMC it's the double num_val, > a pmc_body of an object with 2 attributes is: > { INTVAL n_attr; PMC *class; PMC *attr1; PMC *attr2 } > and a pmc_body of the Hash PMC is basically just the Hash structure > w/o any further indirections... > > During the transition phase all pmc_bodies still have an 'UINTVAL > flags' field. > > The gmc_header that is prepended to all pmc_bodies has a PMC *back_ptr > and gc flags + some type flags that identify the following PMC body > for gc operations. We need the size of the body for walking the > pmc_body memory and we have to mark the contents of the body, if any. Then we use a doubly linked list, but that should be no problem. The structure becomes something like : ... | pmc_body || back_ptr | flags | prev | next | pmc_body || back_ptr | ... ------------------------------------------------ ---------------- | | v v pmc_header pmc_body pmc_body size is just (next - *(void**)next + sizeof(pmc_header)). > > ad 2) > > The free generation is lowest in memory followed by the youngest > generation, the oldest objects have highest memory addresses. Please > note: this invariant exists in the pmc_body area. Therefore the mark > GC operation is one pass: > > - mark the root set > - walk the pmc_bodies from left (lowest mem) to right > - if the pmc_body is alive mark it's contents (if any) and proceed > - if the pmc_body isn't marked alive, it's dead > > There is no need for any extra todo-list to mark the children of objects. ??? I believe that was pretty much what was described in the document. What I called "pointers an object has to other objects" is simply its contents. > > We keep the invariant by several means: > a) aggregates/containers are allocated from the left of free > b) non-aggregates are allocated from the right of free > (with a) + b) containers already have their contents at higher > memory addresses) So you mean that we allocate : | xxx[aggregates]xxx... ...[free space]... ...xxx[non-aggregates]xxx | Or rather, as I thought : | ...[free space]... ...xxx[some aggregate][its contents][some aggregate][its contents]xxx | > > c) a write barrier checks pointer stores into aggregates (by just > comparing 2 memory addresses - basically) Another question I had is how do we know when there is such a pointer store ? It can happen at other moments than object allocation, right ? > we can do either: > - make aggregate younger (move to left) > - make stored-into object older (move to right) > - remember in IGP > d) if we have to extend the allocation area, we usually we'll have to > move all existing objects "up", either by a full GC run or by just > copying, because you'll get usually higher memory regions from the OS. Hum, I hadn't thought of this problem.