On Jul 16, 2005, at 2:24, Nattfodd wrote:
Hi,
I've produced a new document on GMC (Generational Mark & Compact), the
GC I'm trying to implement as a Summer of Code project. It's called gmc
for dummies and I hope it's plainly understandable (if not, tell me so
and I'll try to make it better). It should explain how things will
hopefully work.
It's here :
http://perso.ens-lyon.fr/alexandre.buisse/divers/gmc_for_dummies.pod
This is not quite the scheme, we were pondering in previous mail and
IRC.
1) pmc_bodies have to be variable sized
2) the scheme is missing the invariant we keep:
C<< &younger_body < &older_body >>
ad 1)
a PMC is a 2 word structure { PMC_BODY *pmc_body; VTABLE *vtable }
a pmc_body of Integer PMC is eventually just { INTVAL int_val }, of a
Float PMC it's the double num_val,
a pmc_body of an object with 2 attributes is:
{ INTVAL n_attr; PMC *class; PMC *attr1; PMC *attr2 }
and a pmc_body of the Hash PMC is basically just the Hash structure w/o
any further indirections...
During the transition phase all pmc_bodies still have an 'UINTVAL
flags' field.
The gmc_header that is prepended to all pmc_bodies has a PMC *back_ptr
and gc flags + some type flags that identify the following PMC body for
gc operations. We need the size of the body for walking the pmc_body
memory and we have to mark the contents of the body, if any.
ad 2)
The free generation is lowest in memory followed by the youngest
generation, the oldest objects have highest memory addresses. Please
note: this invariant exists in the pmc_body area. Therefore the mark GC
operation is one pass:
- mark the root set
- walk the pmc_bodies from left (lowest mem) to right
- if the pmc_body is alive mark it's contents (if any) and proceed
- if the pmc_body isn't marked alive, it's dead
There is no need for any extra todo-list to mark the children of
objects.
We keep the invariant by several means:
a) aggregates/containers are allocated from the left of free
b) non-aggregates are allocated from the right of free
(with a) + b) containers already have their contents at higher
memory addresses)
c) a write barrier checks pointer stores into aggregates (by just
comparing 2 memory addresses - basically)
we can do either:
- make aggregate younger (move to left)
- make stored-into object older (move to right)
- remember in IGP
d) if we have to extend the allocation area, we usually we'll have to
move all existing objects "up", either by a full GC run or by just
copying, because you'll get usually higher memory regions from the OS.
These are of course just my ideas, how GMC could operate.
leo