On Friday 02 November 2001 01:33 am, Uri Guttman wrote:
>
> dan at his recent talk at boston.pm's tech meeting said he was leaning
> towards a copying GC scheme. this would be the split ram in half design
> and copy all objects to the other half at GC time. the old half is
> reclaimed (not even reclaimed, just ignored!) in one big chunk.

Wonder how I missed this talk. The first thing that occurs to me is cache starvation due to running through the entire heap periodically. That's going to be ugly, especially on multi-CPU machines, where most everything will wind up being flagged as "shared" or even "invalid". But I'd have to see more details to comment further.
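For the sake of discussion, my mental picture of the scheme is roughly the toy below: a fixed two-pointer cell layout, a Cheney-style copy at flip time, and names I made up on the spot -- not anything from dan's slides or the parrot tree.

    /* Toy semispace collector: split the heap in half, bump-allocate out of
     * the active half, and at GC time copy everything still reachable into
     * the other half (Cheney scan), then swap.  The old half is never walked
     * or freed -- it is simply abandoned.  All names here are invented.
     */
    #include <stdlib.h>
    #include <string.h>

    #define HALF (1 << 20)                  /* 1 meg per half, for the example */

    typedef struct Cell {                   /* fixed toy layout: two children   */
        struct Cell *car, *cdr;
        struct Cell *forward;               /* forwarding pointer during a flip */
    } Cell;

    static char *from_space, *to_space, *top;
    static Cell *roots[64];                 /* toy root set: whatever the
                                               interpreter considers live      */
    static int   nroots;

    void gc_init(void) {
        from_space = malloc(HALF);
        to_space   = malloc(HALF);
        top        = from_space;
    }

    /* Copy one live cell into to_space (once), leaving a forwarding pointer. */
    static Cell *evacuate(Cell *c, char **alloc) {
        Cell *copy;
        if (c == NULL)    return NULL;
        if (c->forward)   return c->forward;    /* already copied */
        copy = (Cell *)*alloc;
        *alloc += sizeof(Cell);
        memcpy(copy, c, sizeof(Cell));
        copy->forward = NULL;
        c->forward    = copy;
        return copy;
    }

    /* The flip: evacuate the roots, then Cheney-scan the copies in place. */
    static void gc_flip(void) {
        char *alloc = to_space, *scan = to_space, *tmp;
        int i;
        for (i = 0; i < nroots; i++)
            roots[i] = evacuate(roots[i], &alloc);
        while (scan < alloc) {
            Cell *c = (Cell *)scan;
            c->car = evacuate(c->car, &alloc);
            c->cdr = evacuate(c->cdr, &alloc);
            scan += sizeof(Cell);
        }
        tmp = from_space; from_space = to_space; to_space = tmp;
        top = alloc;                        /* old half is simply ignored      */
    }

    /* Allocation is nothing but a pointer bump out of the active half. */
    Cell *gc_alloc(void) {
        Cell *c;
        if (top + sizeof(Cell) > from_space + HALF) {
            gc_flip();
            if (top + sizeof(Cell) > from_space + HALF)
                abort();                    /* live data outgrew a whole half  */
        }
        c = (Cell *)top;
        top += sizeof(Cell);
        c->car = c->cdr = c->forward = NULL;
        return c;
    }

Allocation in that world is nothing but a pointer bump out of the active half -- I'll come back to that below.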
> maybe this could be integrated with the vmem system as well. instead of
> just freeing all GC objects and letting the vmem system collect and
> consolidate, have the GC do a copying collection so that the vmem system
> would have only freshly allocated chunks (at the appropriate level, hard
> to tell here) to manage. this is not a fully thought out idea but since
> vmem will consolidate and free when it can, why not have the
> consolidation driven by a copying GC?
>
> uri

vmem and the object-caching layer are independent. vmem is primarily focused on efficiently carving up a resource space in constant time (via allocated hash tables, free lists, and segment spans). This is definitely compatible with a copying GC, since the left and right regions would be treated as separate segment spans; the GC and vmem would, however, have to be written together.

The problem is that vmem wasn't designed for repeated access by multiple threads (as, say, GNU's malloc is). Hence the object-cache layer. The object-cache layer provides arenas (SUN calls them slabs), which cover the remainder of perl's current requirements, but its real contribution is the magazine, which I believe runs counter to this GC scheme. A magazine is designed so that the next allocation by a thread is the LAST (same-sized) free from that same thread, which all but requires the object to still be sitting in the low-level CPU cache. Moreover, you are almost guaranteed that a free by one thread will not be acquired by an alloc from another thread (which would cause a "shared" or "invalid" cache state on multi-CPU systems). From this, temporary allocs/frees (of random-sized memory or arena-based objects) can be very quick, since caching is addressed. Obviously SUN's focus is on scalable memory architectures, since they have 64+ CPU machines that do resource allocation in the OS.
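Stripped down, a magazine is just a small per-thread LIFO of equal-sized free objects. The sketch below is my own bare-bones reduction of the SUN idea (one object size, C11 thread-locals, and malloc() standing in for the depot/slab layer underneath), not the actual object-cache code:

    /* Toy "magazine": a per-thread LIFO cache of free objects of one size.
     * An alloc pops the last object this thread freed (still cache-warm);
     * a free pushes it back.  malloc()/free() stand in for the shared
     * depot/slab layer underneath.  Names are mine, not the real layer's.
     */
    #include <stdlib.h>

    #define OBJ_SIZE   64                   /* one cache per object size      */
    #define MAG_ROUNDS 16                   /* objects a full magazine holds  */

    typedef struct {
        void *rounds[MAG_ROUNDS];           /* LIFO stack of freed objects    */
        int   count;
    } Magazine;

    /* One magazine per thread, so a free by thread A is never handed out by
     * an alloc in thread B (no "shared"/"invalid" cache-line traffic).       */
    static _Thread_local Magazine mag;

    void *cache_alloc(void) {
        if (mag.count > 0)
            return mag.rounds[--mag.count]; /* LAST same-sized free by this
                                               thread: almost certainly still
                                               in its low-level cache         */
        return malloc(OBJ_SIZE);            /* empty: fall through to the
                                               depot (a real one swaps in a
                                               full magazine instead)         */
    }

    void cache_free(void *obj) {
        if (mag.count < MAG_ROUNDS) {
            mag.rounds[mag.count++] = obj;  /* stays thread-local             */
            return;
        }
        free(obj);                          /* full: a real one pushes the
                                               whole magazine to the depot    */
    }

The point is that cache_alloc() hands back the most recently freed, still cache-warm object from the same thread, never one that another CPU just touched.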
If a GC "never" frees, and instead periodically copies valid memory to another segment, then there's no consideration of cache locality. EVERY allocation is guaranteed not to be present in the cache (since all of memory will have been spanned before an allocation revisits it). What does this mean for performance? I don't know, because I've never benchmarked GCs before. But I can speculate that for applications that do a lot of memory allocation, this GC scheme is going to hurt, since all of (heap) memory must be traversed whenever memory runs short.

In general, I'm not seeing a whole lot of benefit to using the vmem scheme with this style of GC, since the overhead seems to go to waste. Might as well do a stack-carving allocation scheme: blindingly fast allocations guaranteed so long as there's more memory, and no frees at all. Of course, any given app is going to use 4 times as much memory as other schemes (2 x whatever extra is wasted). Interesting approach, though.

Not to mention that unless a separate thread is required for parrot operation, there will be massive hiccups whenever the GC is invoked to run through all memory (potentially several meg worth every time the heap doubles in size). Perhaps an adaptive approach could be used: when heap memory is <= 1 meg, the stack-carving technique is used (since copying a meg won't be noticeable); when it's > 1 meg, more sophisticated methods with higher overhead but better memory efficiency take over.

I'm not terribly excited about vmem (though it's kind of cool) if the whole SUN system isn't used together. And unless it can be found to be compatible with SOME GC for larger-scale memory management, I'm inclined to scrap it.

Question though: are we leaning towards requiring a separate thread for GC, as with Java?

-Michael
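P.S. For what it's worth, the adaptive switch I'm picturing is no fancier than the toy below: carve allocations off a fixed 1 meg block until it won't fit, then hand everything to the heavier manager. malloc() stands in for that manager, the names are made up, and the one-time migration of the old block is only a comment -- hypothetical glue, not a proposal.

    #include <stdlib.h>

    #define SMALL_HEAP (1 << 20)            /* 1 meg threshold                 */

    static char   arena[SMALL_HEAP];        /* the cheap, never-freed region   */
    static size_t used;                     /* bytes carved off so far         */
    static int    promoted;                 /* did we outgrow the arena?       */

    void *adaptive_alloc(size_t n)
    {
        n = (n + 7) & ~(size_t)7;           /* keep the toy 8-byte aligned     */
        if (!promoted && used + n <= SMALL_HEAP) {
            void *p = arena + used;         /* stack carving: just a bump      */
            used += n;
            return p;
        }
        if (!promoted) {
            /* Crossing the threshold: a real version would copy the live set
             * out of `arena` into the managed heap here, exactly once.        */
            promoted = 1;
        }
        return malloc(n);                   /* heavier, GC'd / vmem-style path */
    }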