Bart Lateur wrote:
> On Fri, 09 Feb 2001 12:06:12 -0500, Ken Fox wrote:
> > 1. Cheap allocations. Most fast collectors have a one or two
> >    instruction malloc. In C it looks like this:
> >
> >      void *malloc(size_t size) { void *obj = heap; heap += size; return obj; }
> > ...
> 
> That is not a garbage collector.

I said it was an allocator, not a garbage collector. An advanced
garbage collector is what makes such simple, fast allocators possible.
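
To make the allocator side concrete, here is a minimal sketch of a bump
allocator with the limit check that the one-liner leaves out. The names
(heap_space, heap_top, bump_alloc, HEAP_BYTES) are made up for
illustration, and the fixed-size static heap is an assumption; a real
collector would run a collection where this sketch returns NULL.

```c
#include <stddef.h>

#define HEAP_BYTES 4096   /* hypothetical heap size, chosen for the example */

/* Assumed layout: one contiguous, suitably aligned block of memory. */
static _Alignas(max_align_t) unsigned char heap_space[HEAP_BYTES];
static size_t heap_top = 0;   /* offset of the next free byte */

/* Round a request up so every object stays suitably aligned. */
static size_t round_up(size_t n) {
    size_t a = _Alignof(max_align_t);
    return (n + a - 1) / a * a;
}

/* Bump allocation: a bounds check and an add. Returns NULL when the
   heap is full, which is the point where a collector would be run. */
void *bump_alloc(size_t size) {
    if (size > HEAP_BYTES)
        return NULL;                  /* also avoids overflow in round_up */
    size = round_up(size);
    if (size > HEAP_BYTES - heap_top)
        return NULL;
    void *obj = heap_space + heap_top;
    heap_top += size;
    return obj;
}
```

Note that there is no free() here at all. Space only comes back when a
collector resets the free pointer.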

> That is "drop everything you don't need, and we'll never use it
> again." Oh, sure, not doing garbage collection at all is faster than
> doing reference counting.

You don't have a clue. The allocator I posted is a very common one,
used with copying garbage collectors. This is *not* a "pool" allocator
like the one Apache uses. When the heap fills up (often detected by a
seg fault from touching an object beyond the end of the current heap),
the collector is triggered. It traverses the live data and copies it
into a new space (in a simple copying collector the two spaces are
called "from" and "to" space). Generational collectors often work
similarly, but they have more than two spaces and special rules for
references between spaces.
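
The from/to scheme can be sketched in a few dozen lines of C. This is
only an illustration under simplifying assumptions, not how any
particular collector is implemented: every object has the same
hypothetical layout (Obj), the spaces are tiny fixed arrays, the heap-full
check is an explicit test rather than a seg fault, and the program must
register every pointer it holds with add_root().

```c
#include <stddef.h>

/* Hypothetical fixed-layout object, so the sketch needs no type tags. */
typedef struct Obj {
    struct Obj *forward;   /* new address, set once the object is copied */
    struct Obj *left;      /* child pointers, may be NULL */
    struct Obj *right;
    int value;
} Obj;

#define SPACE_OBJS 8       /* tiny on purpose, so collections happen */
#define MAX_ROOTS 16

static Obj space_a[SPACE_OBJS], space_b[SPACE_OBJS];
static Obj *from_space = space_a, *to_space = space_b;
static size_t next_free = 0;        /* bump index into from_space */

static Obj **roots[MAX_ROOTS];      /* addresses of the program's pointers */
static size_t root_count = 0;

void add_root(Obj **slot) {
    if (root_count < MAX_ROOTS)
        roots[root_count++] = slot;
}

/* Copy one object into to-space unless it already moved. */
static Obj *copy(Obj *o, size_t *to_free) {
    if (o == NULL) return NULL;
    if (o->forward != NULL) return o->forward;
    Obj *n = &to_space[(*to_free)++];
    *n = *o;
    n->forward = NULL;
    o->forward = n;                 /* leave a forwarding pointer behind */
    return n;
}

/* Cheney-style collection: only reachable objects are ever touched.
   Returns the number of survivors. */
size_t collect(void) {
    size_t to_free = 0, scan = 0;
    for (size_t i = 0; i < root_count; i++)
        *roots[i] = copy(*roots[i], &to_free);
    while (scan < to_free) {        /* fix up children of copied objects */
        Obj *o = &to_space[scan++];
        o->left = copy(o->left, &to_free);
        o->right = copy(o->right, &to_free);
    }
    Obj *tmp = from_space; from_space = to_space; to_space = tmp;
    next_free = to_free;            /* garbage is simply left behind */
    return to_free;
}

/* Bump allocation; a full space triggers a collection. */
Obj *alloc(int value) {
    if (next_free == SPACE_OBJS) {
        collect();
        if (next_free == SPACE_OBJS) return NULL;   /* everything is live */
    }
    Obj *o = &from_space[next_free++];
    o->forward = NULL;
    o->left = o->right = NULL;
    o->value = value;
    return o;
}
```

The collector never visits dead objects. It copies what the roots can
reach and then reuses the old space wholesale. Any pointer that was not
registered as a root is stale after a collection, because the object it
referred to has either moved or been abandoned.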

> > 2. Work proportional to live data, not total data. This is hard to
> >    believe for a C programmer, but good garbage collectors don't have
> >    to "free" every allocation -- they just have to preserve the live,
> >    or reachable, data. Some researchers have estimated that 90% or
> >    more of all allocated data dies (becomes unreachable) before the
> >    next collection. A ref count system has to work on every object,
> >    but smarter collectors only work on 10% of the objects.
> 
> That may work for C, but not for Perl.

Um, no. It works pretty well for Lisp, ML, Prolog, etc. I'm positive
that it would work fine for Perl too.

>         sub test {
>             my($foo, $bar, %baz);
>             ...
>             return \%baz;
>         }
> 
> You may notice that only PART of the locally malloced memory gets
> freed. The memory of %baz may well be in the middle of that pool. You're
> making a huge mistake if you simply declare the whole block dead weight.

You don't understand how collectors work. You can't think about individual
allocations anymore -- having to track each one is a fundamental and
severe restriction of malloc(). What happens is that the garbage
accumulates until a collection happens. When the collection happens, the
live data is saved and the garbage is overwritten.

In your example above, the memory for $foo and $bar is not reclaimed
until a collection occurs. %baz is live data and will be saved when
the collection occurs (often done by copying it to a new heap space).
Yes, this means it is *totally* unsafe to hold pointers to objects in
places the garbage collector doesn't know about. It also means that
memory working-set sizes may be larger than with a malloc-style system.

There are lots of advantages though -- re-read my previous note.

The one big downside of non-ref-count GC is that finalization is
delayed until collection -- which may be relatively infrequent when
there's lots of memory. Data flow analysis can allow us to trigger
finalizers earlier, but that's a lot harder than just watching a ref
count.
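
For comparison, "just watching a ref count" amounts to something like
the following sketch. The RcObj type and the rc_retain/rc_release names
are hypothetical, invented for this example. The point is only that the
finalizer runs at the exact moment the last reference is dropped, rather
than at some later collection.

```c
#include <stddef.h>

/* Hypothetical ref-counted header with an optional finalizer hook. */
typedef struct RcObj {
    int refs;
    void (*finalize)(struct RcObj *self);
} RcObj;

/* Example finalizer: just counts how many times it has run. */
static int finalized_count = 0;
static void count_finalize(RcObj *self) {
    (void)self;
    finalized_count++;
}

void rc_retain(RcObj *o) { o->refs++; }

/* Dropping the last reference runs the finalizer immediately. */
void rc_release(RcObj *o) {
    if (--o->refs == 0 && o->finalize != NULL)
        o->finalize(o);
}
```

The price is that every retain and release touches the object, whether
it turns out to be long-lived or not.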

- Ken
