Consider a Perl snippet like:

  {
    my $fh = IO::File->new;
    $fh->open(">test.tmp");
    print $fh "a";
  }

The filehandle is closed automatically at scope exit and the file
contains the expected contents.

That's quite easy in the current Perl implementation, as it does
reference counting. At the closing brace the last reference to
C<$fh> is gone and the file gets closed during object destruction.
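To make the reference counting behaviour concrete, here is a minimal toy
model in Python (used purely for illustration). The class
RefCountedHandle and the function run_scope are made-up names, and the
count is maintained by hand; this is a sketch of the idea, not how Perl
implements it.

```python
class RefCountedHandle:
    """Toy stand-in for a filehandle with a hand-maintained reference count."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.refcount = 0
        self.closed = False

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            # Last reference gone: destruction happens right here.
            self.destroy()

    def destroy(self):
        self.closed = True
        self.log.append("closed " + self.name)


def run_scope(log):
    """Mimic the block from the Perl snippet (hypothetical helper)."""
    fh = RefCountedHandle("test.tmp", log)
    fh.incref()              # "my $fh" holds the only reference
    log.append("print a")    # stands in for: print $fh "a"
    fh.decref()              # closing brace: the reference is dropped
    return fh


log = []
handle = run_scope(log)
```

The point is only that the close happens at a known place, namely the
moment the count reaches zero.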

As Parrot doesn't use reference counting for GC, we have to run a GC
cycle to detect that the IO object is actually dead. This is
accomplished by the "sweep 0" opcode, which does nothing if no object
needs timely destruction.
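One way to picture the "does nothing" case is a counter of objects that
were flagged as needing timely destruction. The following Python sketch
assumes such a counter; ToyInterpreter, flag_timely and timely_count are
hypothetical names for illustration and say nothing about how the real
opcode is implemented.

```python
class ToyInterpreter:
    """Toy model: a lazy sweep that is skipped when nothing needs it."""

    def __init__(self):
        self.timely_count = 0   # assumed counter of flagged objects
        self.gc_cycles = 0      # number of full GC cycles actually run

    def flag_timely(self):
        # Called when an object asks for timely destruction (assumption).
        self.timely_count += 1

    def sweep(self, mode):
        # mode 0 models "sweep 0": only collect if some object needs it.
        if mode == 0 and self.timely_count == 0:
            return False
        self.gc_cycles += 1
        return True


interp = ToyInterpreter()
skipped = interp.sweep(0)   # nothing flagged yet: no GC cycle
interp.flag_timely()        # e.g. the IO object is created
ran = interp.sweep(0)       # now a GC cycle is run
```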

The above example could roughly translate to Parrot code like:

  new_pad -1          # new lexical scope
  $P0 = new ParrotIO
  store_lex -1, "$fh", $P0
  open $P0, "test.tmp", ">"
  print $P0, "a"
  pop_pad             # get rid of $fh lexical
  sweep 0             # scope exit sequence

Alternatively, a scope exit handler could be pushed onto the control
stack.

Such a sequence has basically two problems:

1) Correctness of "sweep 0"

At scope exit the ParrotIO object is still referenced from somewhere
in the PMC registers. This means that while marking the root set, the
ParrotIO object is found to be alive, although it actually is dead.
The register keeps the object alive until its register frame is no
longer referenced, that is, after the function has been left.
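The effect can be shown with a toy mark phase in Python. Obj, mark,
lexical_pad and registers are illustrative names only; the sketch
assumes that both lexical pads and PMC registers are part of the root
set, as described above.

```python
class Obj:
    """Toy heap object with a name and outgoing references."""

    def __init__(self, name, refs=()):
        self.name = name
        self.refs = list(refs)


def mark(roots):
    """Return the names of all objects reachable from the roots."""
    live = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.name in live:
            continue
        live.add(obj.name)
        stack.extend(obj.refs)
    return live


io = Obj("ParrotIO")
lexical_pad = {"$fh": io}   # store_lex -1, "$fh", $P0
registers = {"P0": io}      # $P0 still holds the same object

# pop_pad: the lexical is gone, but the register is still a root.
del lexical_pad["$fh"]
roots = list(lexical_pad.values()) + list(registers.values())
live_at_scope_exit = mark(roots)

# Only once the register frame is gone does the object become unreachable.
registers.clear()
roots = list(lexical_pad.values()) + list(registers.values())
live_after_return = mark(roots)
```

So a "sweep 0" right after pop_pad would still see the object as live
in this model.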

2) Performance

The run time of a mark&sweep collector is basically proportional to
the number of all allocated objects (all live objects are marked, and
live+dead objects are visited during sweep). So imagine the Perl
example is preceded by:

  my @array;
  $array[$_] = $_ for (1..100000);
  {
     ...

The GC cycle at scope exit now has to run through all objects to
eventually find that the ParrotIO object is dead (given that 1) is
solved). Performance would really suck for all non-trivial programs,
i.e. for all programs with a considerable amount of live data.
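To put a rough number on that, here is a toy mark&sweep collector in
Python that simply counts how many objects it touches. The heap layout
(object 0 is the array, objects 1..n are its elements, object n+1 is the
dead IO object) and the names build_heap and mark_and_sweep are
assumptions made for this sketch only.

```python
def build_heap(n):
    """Heap as {id: [ids it references]}; returns (heap, root ids)."""
    heap = {0: list(range(1, n + 1))}   # the array references n elements
    for i in range(1, n + 1):
        heap[i] = []                    # array elements
    heap[n + 1] = []                    # the unreachable IO object
    return heap, [0]


def mark_and_sweep(heap, roots):
    """Return (survivors, number of object visits in mark plus sweep)."""
    visited = 0
    marked = set()
    stack = list(roots)
    while stack:                        # mark: every live object once
        i = stack.pop()
        if i in marked:
            continue
        marked.add(i)
        visited += 1
        stack.extend(heap[i])
    survivors = {}
    for i in heap:                      # sweep: every object, live or dead
        visited += 1
        if i in marked:
            survivors[i] = heap[i]
    return survivors, visited


small_heap, small_roots = build_heap(10)
_, small_cost = mark_and_sweep(small_heap, small_roots)

big_heap, big_roots = build_heap(100000)
big_survivors, big_cost = mark_and_sweep(big_heap, big_roots)
```

In this model the collector touches roughly twice the heap size just to
free a single object, no matter how small the exited scope was.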

During the discussion WRT continuations I've already proposed a scheme
that would solve 1) too.

While 2) is "just" a performance problem, addressing it early can't
hurt, as a solution might impact the overall interpreter layout. But
before going further into that issue, I'd like to hear some other opinions.

leo



