On Friday 02 November 2001 01:33 am, Uri Guttman wrote:
>
> dan at his recent talk at boston.pm's tech meeting said he was leaning
> towards a copying GC scheme. this would be the split ram in half design
> and copy all objects to the other half at GC time. the old half is
> reclaimed (not even reclaimed, just ignored!) in one big chunk.

Wonder how I missed this talk.  The first thing that occurs to me is 
cache starvation due to running through the entire heap periodically.  That's 
going to be ugly, especially on multi-CPU systems, where almost everything 
will wind up flagged as "shared" or even "invalid".  But I'd have to see more 
details to comment further.

> maybe this could be integrated with the vmem system as well. instead of
> just freeing all GC objects and letting the vmem system collect and
> consolidate, have the GC do a copying collection so that the vmem system
> would have only freshly allocated chunks (at the appropriate level, hard
> to tell here) to manage. this is not a fully thought out idea but since
> vmem will consolidate and free when it can, why not have the
> consolidation driven by a copying GC?
> uri

vmem and the object caching layer are independent.  vmem is primarily focused 
on efficiently carving up a resource space in constant time (via allocated 
hash-tables, free-lists, and segment-spans).  This is definitely compatible 
with a copying GC, since the left and right regions would be considered 
separate segment spans.  The GC and vmem would, however, have to be written 
together.  The problem is that vmem wasn't designed for repeated access by 
multiple threads (as, say, GNU's malloc is).  Hence the object-cache layer.

The object cache layer provides arenas (Sun calls them slabs), which 
cover the remainder of perl's current requirements, but its real 
contribution is its magazine, which I believe runs counter to this GC 
scheme.  A magazine is designed so that the next allocation by a thread 
returns the LAST (same-sized) block freed by that same thread, which all but 
guarantees that the block is still resident in the low-level cache.  
Moreover, you are almost guaranteed that a free by one thread will not be 
picked up by an alloc from another thread (which would cause a "shared" or 
"invalid" cache state on multi-CPU systems).  As a result, temporary allocs / 
frees (of random-sized memory or arena-based objects) can be very quick, 
since caching is addressed.  Obviously Sun's focus is on scalable memory 
architectures (since they have 64+ CPU machines that do resource allocation 
in the OS).
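
To make the magazine idea concrete, here is a minimal sketch of a 
per-thread, per-size LIFO cache of freed objects.  Everything in it 
(magazine, mag_init, mag_alloc, mag_free, MAG_ROUNDS) is a hypothetical 
name made up for illustration, not taken from Sun's allocator or from 
parrot, and malloc/free merely stand in for the slab/arena layer 
underneath.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAG_ROUNDS 16            /* hypothetical magazine capacity */

/* One magazine per thread and per object size (assumed layout). */
typedef struct {
    size_t obj_size;             /* size of every object cached here */
    int    count;                /* number of cached free objects */
    void  *rounds[MAG_ROUNDS];   /* LIFO stack of freed objects */
} magazine;

static void mag_init(magazine *m, size_t obj_size)
{
    assert(obj_size > 0);
    m->obj_size = obj_size;
    m->count = 0;
}

/* Allocate: hand back the most recently freed object if there is one,
 * since it is the one most likely to still be in this CPU's cache. */
static void *mag_alloc(magazine *m)
{
    if (m->count > 0)
        return m->rounds[--m->count];
    return malloc(m->obj_size);  /* stand-in for the slab/arena layer */
}

/* Free: push onto this thread's magazine; only overflow goes back to
 * the shared lower layer. */
static void mag_free(magazine *m, void *p)
{
    if (m->count < MAG_ROUNDS)
        m->rounds[m->count++] = p;
    else
        free(p);
}
```

In a threaded build each thread would own its magazines (for example via 
thread-local storage), so the common alloc/free path touches no shared 
state and needs no lock.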

If a GC "never" frees, and instead periodically copies valid memory to 
another segment, then there's no consideration of cache locality.  
EVERY allocation is guaranteed to not be present within the cache (since all 
of memory will have been spanned before it's revisited by an allocation).
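
For reference, a minimal sketch of the "split in half and copy" scheme 
being discussed, assuming fixed-size objects with a single outgoing 
reference so that it stays short.  The names (obj, heap, heap_alloc, 
heap_collect, HALF_OBJS) are hypothetical and this is not parrot's 
design; it only illustrates why allocation always walks forward into 
memory that has not been touched since the last flip.

```c
#include <assert.h>
#include <stddef.h>

#define HALF_OBJS 256            /* hypothetical size of each half */

typedef struct obj {
    struct obj *forward;         /* new address once copied, else NULL */
    struct obj *child;           /* single outgoing reference, may be NULL */
    long        value;
} obj;

typedef struct {
    obj    space[2][HALF_OBJS];  /* the two halves */
    int    from;                 /* index of the active half */
    size_t next;                 /* bump index into the active half */
} heap;

static void heap_init(heap *h) { h->from = 0; h->next = 0; }

/* Allocation just bumps an index; nothing is ever freed individually. */
static obj *heap_alloc(heap *h, long value, obj *child)
{
    if (h->next == HALF_OBJS)
        return NULL;             /* caller should run heap_collect */
    obj *o = &h->space[h->from][h->next++];
    o->forward = NULL;
    o->child = child;
    o->value = value;
    return o;
}

/* Copy one object into the other half unless that was already done. */
static obj *copy_obj(heap *h, obj *o, int to, size_t *to_next)
{
    if (o == NULL)
        return NULL;
    if (o->forward == NULL) {
        assert(*to_next < HALF_OBJS);
        obj *n = &h->space[to][(*to_next)++];
        *n = *o;                 /* forward is NULL in the copy */
        o->forward = n;          /* leave a forwarding pointer behind */
    }
    return o->forward;
}

/* Copy everything reachable from the roots, then flip halves.  The old
 * half is not reclaimed object by object, it is simply ignored. */
static void heap_collect(heap *h, obj **roots, size_t nroots)
{
    int to = 1 - h->from;
    size_t to_next = 0, scan = 0;

    for (size_t i = 0; i < nroots; i++)
        roots[i] = copy_obj(h, roots[i], to, &to_next);
    while (scan < to_next) {     /* Cheney-style scan of copied objects */
        obj *o = &h->space[to][scan++];
        o->child = copy_obj(h, o->child, to, &to_next);
    }
    h->from = to;
    h->next = to_next;
}
```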

What does this mean for performance?  I don't know, because I've never 
benchmarked GCs before.  But I can speculate that for applications that do a 
lot of memory allocation, this GC scheme is going to be hurting, since all of 
(heap) memory must be traversed when memory is starved.

In general, I'm not seeing a whole lot of benefit to using the vmem scheme 
with this style of GC, since the overhead seems to go to waste.  Might as 
well do a stack-carving allocation scheme.  Blindingly fast allocations 
are guaranteed so long as there's more memory, and there aren't any frees.  
Of course any given app is going to use 4 times as much memory as with other 
schemes (2 x whatever extra is wasted).  Interesting approach, though.  Not 
to mention that unless parrot requires a separate thread for the GC, there 
will be massive hiccups when the GC is invoked to run through all memory 
(potentially several meg worth every time it doubles in size).
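
By "stack-carving" I mean something like the following minimal sketch: 
allocation is a bounds check plus a pointer bump, and there is no free at 
all.  The names (bump_region, bump_init, bump_alloc, BUMP_ALIGN) and the 
8-byte alignment are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUMP_ALIGN 8u            /* assumed alignment for every block */

typedef struct {
    unsigned char *base;         /* start of the region */
    size_t         size;         /* total bytes in the region */
    size_t         used;         /* bytes carved off so far */
} bump_region;

/* mem is assumed to be aligned to at least BUMP_ALIGN. */
static void bump_init(bump_region *r, void *mem, size_t size)
{
    assert(mem != NULL);
    r->base = mem;
    r->size = size;
    r->used = 0;
}

/* Carve n bytes off the top of the region.  Returns NULL when the
 * region is exhausted, which is the point where a copying GC would
 * have to run. */
static void *bump_alloc(bump_region *r, size_t n)
{
    size_t aligned = (n + BUMP_ALIGN - 1) & ~(size_t)(BUMP_ALIGN - 1);

    if (aligned < n || aligned > r->size - r->used)
        return NULL;             /* overflow or out of space */
    void *p = r->base + r->used;
    r->used += aligned;
    return p;
}
```

The fast path has no free-list walk and no per-block header, which is 
where the speed comes from; the cost is paid all at once when the region 
fills up.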

Perhaps an adaptive approach can be used.  When heap memory is <= 1 Meg, this 
stack-carving technique is used (since copying a meg won't be noticeable).  
When > 1 Meg, more sophisticated methods with higher overhead but better 
memory efficiency are used.  I'm not terribly excited about vmem (though it's 
kind of cool) if the whole Sun system isn't used together.  And unless it can 
be found to be compatible with SOME GC for larger scale memory managers, I'm 
inclined to scrap it.

Question, though: are we leaning towards requiring a separate thread for GC, 
as with Java?

-Michael
