Re: Constant STRINGs, PMCs, Garbage Collection, and PBC

chromatic Sun, 24 Feb 2008 19:18:06 -0800

On Sunday 24 February 2008 18:41:23 Bob Rogers wrote:

> Granted, and it's tough to make a PMC truly read-only until after it's
> completely initialized . . .
>
>    There's a similar problem for accessors and setters.  Again, that's
>    solvable with more code or more cleverness.
>
> So, you're saying it is legal to invoke a setter on a constant PObj?  In
> what sense then is that PObj a constant?


That depends.  Is a hash declared with:

        .const Hash foo

... constant as a reference or a referent or an aggregate or some combination 
of all three?

interp->iglobals is a pointer to a PMC array that holds the root set.  That 
could be a constant, in that the array should never get collected until the 
interpreter shuts down.  However, the contents of its contents may change.

Is it constant?  Should it be?
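
To make the reference/referent distinction concrete, here is a rough sketch 
in C.  Box is a hypothetical stand-in for an aggregate PMC, and box_first and 
box_push are made-up names; none of this is Parrot's actual layout or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an aggregate PMC; not Parrot's real layout. */
typedef struct Box {
    int    items[4];
    size_t used;
} Box;

/* Constant referent: the callee cannot modify the Box through this pointer. */
static int box_first(const Box *b) {
    assert(b != NULL);
    return b->used ? b->items[0] : -1;
}

/* Constant reference: the pointer cannot be reseated inside the function,
   but the Box it points to is still free to change. */
static void box_push(Box *const b, int v) {
    assert(b != NULL);
    if (b->used < 4)
        b->items[b->used++] = v;
}
```

The iglobals case looks like the second form: the slot always points at the 
same array, while what the array holds keeps changing.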

>    However, some of those PObjs might have active references pointing to
> them from elsewhere, so silently upgrading them to constant PObjs (that is,
> allocating a new constant header and then copying everything to that new
> header) requires fixing up all of the old references.
>
>    That's fixable, but I usually have to go lie down after thinking
>    about it.
>
> Do you really need to fix up old references?  After all, if the data
> structure truly is constant, other code should have no way of telling
> the difference.

That depends on the use of the PObj.  If we can get away with just a copy, 
then we don't have to fix up old references in place.  If the data is simple 
and immutable, then we're probably safe.

>    BTW, that is the strategy used by most Lisp systems (and they don't
> bother fixing up the old objects).  Typically, there is an internal
> "purify" function that makes its argument pure (and therefore read-only)
> by copying it recursively into pure space.  This is normally done to
> constants when loading compiled code, and also happens to selected data
> structures before dumping a new Lisp image.  In these cases, the fact
> that the old objects are still non-constant doesn't matter, as they are
> usually immediately GCed.  The chief purpose of this, though, is to take
> long-lived constant data structures out of the way of GC.
>
>    And, as I'm sure you've noticed, this also gets around the "can't
> initialize a constant PMC" problem, without requiring a separate API for
> constant creation.

That seems like a better approach.  At the point we know we want to take a 
PObj out of the possibility of GC, we can decide then and there.  That might 
be the right approach altogether -- I like how it separates the 
responsibility for purifying things out of object construction.
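
A minimal sketch of what such a purify step could look like, in C.  Node, 
pure_space, and purify are hypothetical names for illustration, not anything 
in Parrot.  The sketch assumes an acyclic structure and a fixed-size pure 
space that the collector never sweeps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object: a singly linked node.  Not Parrot's PObj. */
typedef struct Node {
    int          value;
    int          is_pure;   /* set once the node lives in pure space */
    struct Node *next;
} Node;

/* Pure space: a bump allocator that the collector would never sweep. */
enum { PURE_CAPACITY = 64 };
static Node   pure_space[PURE_CAPACITY];
static size_t pure_used = 0;

/* Copy n and everything reachable from it into pure space.
   Assumes no cycles; an already-pure node is returned unchanged. */
static Node *purify(Node *n) {
    if (n == NULL)
        return NULL;
    if (n->is_pure)
        return n;

    assert(pure_used < PURE_CAPACITY);
    Node *copy    = &pure_space[pure_used++];
    copy->value   = n->value;
    copy->is_pure = 1;
    copy->next    = purify(n->next);
    return copy;
}
```

The original nodes are left untouched, so nothing needs fixing up; they stay 
ordinary collectable objects, and the copy is what the loader would hand out.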

> So I gather that the purpose of read-only PMCs in the current design is
> something other than GC efficiency.

I think it *is* for GC efficiency, but like a lot of pieces of Parrot over the 
years, it's a preliminary implementation based on a preliminary design and 
needs some refactoring toward a fuller design.

> Which raises a few questions: 
>
>    1.  What *is* the purpose of constant PMCs?  The only mention of "the
> constant PMC pool" in PDD17 is part of the explanation of the
> "singleton" modifier to "pmclass" -- though the bare reference doesn't
> explain much.
>
>    2.  What would be the cost of storing constant PMCs in the normal
> pool, and having GC treat them normally?  From what you say, speed is
> probably not an issue.  And any non-GC purposes of constant PMCs ought
> not to require special GC treatment.  On the other hand, the benefit may
> be to squash a whole nasty class of GC bugs.

 * O(n^2) gets a bigger n as we mark and sweep more PObjs
 * we have more work to do to *find* the PObjs we need to mark, as some of 
them are in bytecode segments and others are elsewhere

Only the latter concerns me at all.  Mark and sweep isn't the world's most 
performant GC system anyway, and our implementation has some naive spots 
itself (walking entire arenas to clear live flags and check custom destroy 
flags is one).

Allison's GC spec uses an incremental mark scheme with tri-coloring.  I'm sort 
of hopeful we can get at least to a copying scheme for the sweep.
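
For readers unfamiliar with tri-color marking, a minimal sketch in C follows.  
Obj, shade, and mark_step are hypothetical names, and this is the generic 
textbook technique, not the design in the GC spec.  It also leaves out the 
write barrier that a real incremental collector needs when the program 
mutates objects between steps.

```c
#include <assert.h>
#include <stddef.h>

/* WHITE: not yet seen.  GREY: seen, children not yet scanned.
   BLACK: seen and fully scanned. */
typedef enum { WHITE, GREY, BLACK } Color;

enum { MAX_KIDS = 2, MAX_OBJS = 16 };

/* Hypothetical object with a fixed number of child slots. */
typedef struct Obj {
    Color       color;
    struct Obj *kids[MAX_KIDS];
} Obj;

/* Worklist of grey objects awaiting a scan. */
static Obj   *grey_stack[MAX_OBJS];
static size_t grey_top = 0;

/* Turn a white object grey and queue it; grey and black objects are skipped. */
static void shade(Obj *o) {
    if (o != NULL && o->color == WHITE) {
        assert(grey_top < MAX_OBJS);
        o->color = GREY;
        grey_stack[grey_top++] = o;
    }
}

/* Scan at most `budget` grey objects, then return control to the caller.
   Returns 1 once no grey objects remain, 0 if marking is unfinished. */
static int mark_step(size_t budget) {
    while (budget > 0 && grey_top > 0) {
        Obj *o = grey_stack[--grey_top];
        for (size_t i = 0; i < MAX_KIDS; i++)
            shade(o->kids[i]);
        o->color = BLACK;
        budget--;
    }
    return grey_top == 0;
}
```

Shade the roots, call mark_step with a small budget between units of real 
work, and anything still white when it returns 1 is garbage.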

-- c
