I was reading the archives of perl6-internals, searching for
information on the vtable stuff, and I found it's very fragmented.
So I decided to read all the information from there, and I tried
to write a kind of proposal for vtable implementation from the
info there. I hope it can be useful and serve as a base for
further discussion on the vtable thing.
I'll try to avoid implementation-specific details, i.e.
I won't be specific to either C or C++, so it can be implemented
in either. Also, don't mind the names I used. I'm only
suggesting them. Please try to see what's missing and what's
wrong so that we can have a good design of the vtable stuff.
I don't know if there's already a document speaking of vtables,
at least not that I could find...
Well, the vtable will replace perl5's tie and XS's sv_magic
functionality. I guess it will probably also replace all the sv_,
av_, hv_, ... class of functions in XS. sv_setsv, for example,
will be accessible by vtable->ASSIGN, and SvIV will be accessible
by vtable->NUMBER (or something like it).
There will be various vtable classes, one for SVs (vtable_sv), one
for AVs (vtable_av), and so on. Each vtable will have operations that
make sense for each type of variable. For example, vtable_sv will have
a STRING and a NUMBER operation, and vtable_av will have a SIZE/LENGTH
and a SPLICE operation. Each operation will be a C/C++/native
language function that matches the prototype needed for the operation
(for example, SIZE/LENGTH takes no argument and returns an integer/SV
value; SPLICE takes 2 integers/SVs [start/end] and a list of
replacements/AV, and returns a list of elements that were there/AV).
A variable (or temporary variable, to hold temporary values), be it
SV, AV or HV or whatever, will be made of 3 words: a vtable pointer, a
pointer to the actual data, and an opaque value used by the garbage
collector (think of it as a refcount). As a C struct:
    struct sv {
        vtable_sv *ptr_to_vtable;
        void      *ptr_to_data;
        void      *gc_data;
    };
(this is actually much better in RFC 35, so please refer to that).
Many SVs/AVs/... can point to the same vtable, since the vtable only has
the function pointers and holds no state about the variable. The state of
the variable must be held entirely in the ptr_to_data field.
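To make the layout above concrete, here is a minimal C sketch of two variables sharing one stateless vtable. It is only an illustration under assumptions: the int-backed implementation and the helper names (int_vtable, sv_from_int, sv_number, demo_shared_vtable) are made up for this example and come from neither RFC 35 nor any real perl source.

```c
#include <stddef.h>

typedef struct sv SV;

/* The vtable holds function pointers only; it keeps no per-variable state. */
typedef struct {
    int (*NUMBER)(SV *self);   /* 0+$this */
    int (*BOOLEAN)(SV *self);  /* $this ? $a : $b */
} vtable_sv;

/* The three-word variable described in the text. */
struct sv {
    const vtable_sv *ptr_to_vtable;
    void            *ptr_to_data;
    void            *gc_data;   /* opaque to everyone but the collector */
};

/* One possible implementation (an assumption): the data is a plain int. */
static int int_number(SV *self)  { return *(int *)self->ptr_to_data; }
static int int_boolean(SV *self) { return *(int *)self->ptr_to_data != 0; }

static const vtable_sv int_vtable = { int_number, int_boolean };

/* Hypothetical constructor: the caller owns the storage. */
static SV sv_from_int(int *storage, int value) {
    SV sv = { &int_vtable, storage, NULL };
    *storage = value;
    return sv;
}

/* Core-side accessor: goes through the vtable, never into ptr_to_data. */
static int sv_number(SV *sv) { return sv->ptr_to_vtable->NUMBER(sv); }

/* Two variables, one vtable, two independent states. */
int demo_shared_vtable(void) {
    int a_store, b_store;
    SV a = sv_from_int(&a_store, 40);
    SV b = sv_from_int(&b_store, 2);
    if (a.ptr_to_vtable != b.ptr_to_vtable) return -1;
    return sv_number(&a) + sv_number(&b);
}
```

Only int_number and int_boolean ever look inside ptr_to_data; everything else goes through the vtable pointer.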
Perl6's core (all of it but sv.c and sv.h) will access variables only
through the vtable, and will be prohibited from accessing anything inside
ptr_to_data. In the same way, gc_data will be accessed only by the
garbage collector (gc.c/gc.h???). ANYTHING perl's core must be able to
do MUST be in an entry of the vtable. Of course one entry can be used
to implement more than one thing. Example: the SPLICE entry of
vtable_av, which I described before and which corresponds to perl's
splice function, can be used internally to implement all of the splice,
push, pop, shift & unshift functions. Of course the vtable could have
specialized PUSH, POP, ... entries, but that would put a burden on
anyone who needs to create a new vtable, who would have to re-implement
these entries. On the other hand, if all of them are implemented over
only one vtable entry, their behaviour will always be determined by
this one entry.
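As a rough illustration of the `one entry, many functions' idea, the C sketch below builds push, pop, shift and unshift on top of a single SPLICE entry. Several things here are assumptions made only to keep it short: elements are plain ints standing in for SV pointers, the array has a fixed unchecked capacity, SPLICE hands back the removed elements through an output argument instead of returning a new AV, and all helper names (plain_splice, av_push, ...) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define AV_CAP 16  /* fixed capacity, unchecked, to keep the sketch short */

typedef struct av AV;

typedef struct {
    int  (*FETCHSIZE)(AV *self);
    /* Replace elements [start, end) with `replace` (may be NULL);
       the removed elements are copied into `removed` (may be NULL). */
    void (*SPLICE)(AV *self, int start, int end,
                   const AV *replace, AV *removed);
} vtable_av;

struct av {
    const vtable_av *ptr_to_vtable;
    int items[AV_CAP];   /* ints stand in for SV pointers */
    int size;
};

static int plain_fetchsize(AV *self) { return self->size; }

static void plain_splice(AV *self, int start, int end,
                         const AV *replace, AV *removed) {
    int n_old = end - start;
    int n_new = replace ? replace->size : 0;
    int tail  = self->size - end;
    if (removed) {
        memcpy(removed->items, self->items + start, (size_t)n_old * sizeof(int));
        removed->size = n_old;
    }
    /* Shift the tail to its new position, then drop in the replacement. */
    memmove(self->items + start + n_new, self->items + end,
            (size_t)tail * sizeof(int));
    if (n_new)
        memcpy(self->items + start, replace->items, (size_t)n_new * sizeof(int));
    self->size += n_new - n_old;
}

static const vtable_av plain_vtable = { plain_fetchsize, plain_splice };

/* The four list operations, written once against SPLICE only. */
void av_push(AV *av, int value) {
    AV one = { NULL, {0}, 1 };
    int n = av->ptr_to_vtable->FETCHSIZE(av);
    one.items[0] = value;
    av->ptr_to_vtable->SPLICE(av, n, n, &one, NULL);
}
int av_pop(AV *av) {
    AV out = { NULL, {0}, 0 };
    int n = av->ptr_to_vtable->FETCHSIZE(av);
    av->ptr_to_vtable->SPLICE(av, n - 1, n, NULL, &out);
    return out.items[0];
}
void av_unshift(AV *av, int value) {
    AV one = { NULL, {0}, 1 };
    one.items[0] = value;
    av->ptr_to_vtable->SPLICE(av, 0, 0, &one, NULL);
}
int av_shift(AV *av) {
    AV out = { NULL, {0}, 0 };
    av->ptr_to_vtable->SPLICE(av, 0, 1, NULL, &out);
    return out.items[0];
}

int demo_splice_ops(void) {
    AV av = { &plain_vtable, {0}, 0 };
    av_push(&av, 1); av_push(&av, 2); av_push(&av, 3);
    av_unshift(&av, 0);                 /* 0 1 2 3 */
    assert(av_pop(&av) == 3);           /* 0 1 2   */
    assert(av_shift(&av) == 0);         /* 1 2     */
    return av.items[0] * 10 + av.items[1];
}
```

Someone writing a new vtable would only have to supply FETCHSIZE and SPLICE to get all four operations. The price is that a simple push pays for a general splice.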
What is in the vtables?
Well, I could think of 4 classes of vtables, but I'm sure there are
more. I wrote down below some things that I think should be handled by
vtables but I'm not sure how. Here I present what I assume goes into the
vtables I thought of. The 4 classes are: vtable_sv (for SVs), vtable_av
(for AVs), vtable_hv (for HVs), and vtable_fh (for filehandles). I
took a minimalistic approach, which is probably incomplete, so that we
can discuss further details. I used a kind of XS syntax to specify
parameters for each operation.
    vtable_sv {
        // assignment
        void ASSIGN(SV *);      // $this = $o
        /* wouldn't it be
           CREATE, BUILD, REBUILD, STORE, FETCH, DESTROY ???
           I really don't know!
        */
        // convert value
        char *STRING();         // "$this"
        int NUMBER();           // 0+$this
        double DOUBLE();        // 0.0+$this
        int BOOLEAN();          // $this ? $a : $b
        // number operations
        SV *PLUS(SV *);         // $this + $o
        SV *MINUS(SV *);        // $this - $o
        SV *TIMES(SV *);        // $this * $o
        SV *DIVIDE(SV *);       // $this / $o
        SV *MODULUS(SV *);      // $this % $o
        SV *SHL(SV *);          // $this << $o
        SV *SHR(SV *);          // $this >> $o
        SV *EXP(SV *);          // $this ** $o
        SV *NEG();              // -$this
        // scalar ops (numeric/string)
        SV *BITAND(SV *);       // $this & $o
        SV *BITOR(SV *);        // $this | $o
        SV *BITXOR(SV *);       // $this ^ $o
        SV *BITNOT();           // ~$this
        // string operations
        SV *CONCAT(SV *);       // $this . $o
        SV *REPEAT(SV *);       // $this x $o
        // comparisons
        int NUMLT(SV *);        // $this < $o
        ... the same for ==, !=, <, <=, >, >=, <=>
        int STRLT(SV *);        // $this lt $o
        ... the same for eq, ne, lt, le, gt, ge, cmp
    };
I guess that's it for the basic version. Note I didn't include logical
AND/OR operations. I believe they don't belong here, because if they
did, it would be necessary to evaluate the second argument before
applying the OR/AND, and then it wouldn't be possible to short-circuit
it. I think the ideal is to use the BOOLEAN entry to see what the first
value is and, depending on whether it is true or false and whether the
operation is AND or OR, either return it or go on to the second operand.
    ($a || $b)
would be translated to
    SV *sv_a = sv_get("a");
    if (sv_a->BOOLEAN()) {
        return sv_a;
    } else {
        // $b is only looked at here: short-circuit.
        SV *sv_b = sv_get("b");
        return sv_b;
    }
I also left out every assignment operator but ASSIGN. I'm assuming +=,
-=, *=, |=, &=, etc., as well as -- and ++, will be implemented as a
combination of ASSIGN and PLUS, MINUS, TIMES, BITOR, BITAND, etc. Of
course, that could lead to wrong results, but I leave it for you to
analyse it.
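A minimal C sketch of how += and ++ might be composed from PLUS and ASSIGN follows. It is only an illustration: the SV here is a two-field stand-in holding an int, PLUS returns its result by value rather than as a pointer so that no allocation is needed, and the names sv_int, sv_add_assign and sv_preinc are hypothetical.

```c
typedef struct vtable_sv vtable_sv;

/* Simplified stand-in for the real variable: vtable pointer plus an int. */
typedef struct sv {
    const vtable_sv *ptr_to_vtable;
    int              value;
} SV;

struct vtable_sv {
    void (*ASSIGN)(SV *self, const SV *other);      /* $this = $o */
    SV   (*PLUS)(const SV *self, const SV *other);  /* $this + $o */
    int  (*NUMBER)(const SV *self);                 /* 0+$this    */
};

static int int_number(const SV *self) { return self->value; }

static void int_assign(SV *self, const SV *other) {
    /* The other operand is read through its own vtable. */
    self->value = other->ptr_to_vtable->NUMBER(other);
}

static SV int_plus(const SV *self, const SV *other) {
    SV result = *self;  /* result keeps the left operand's vtable */
    result.value = self->value + other->ptr_to_vtable->NUMBER(other);
    return result;
}

static const vtable_sv int_vtable = { int_assign, int_plus, int_number };

static SV sv_int(int v) {
    SV sv = { &int_vtable, v };
    return sv;
}

/* $a += $b, composed from PLUS followed by ASSIGN. */
void sv_add_assign(SV *a, const SV *b) {
    SV tmp = a->ptr_to_vtable->PLUS(a, b);
    a->ptr_to_vtable->ASSIGN(a, &tmp);
}

/* ++$a, composed as $a += 1. */
void sv_preinc(SV *a) {
    SV one = sv_int(1);
    sv_add_assign(a, &one);
}

int demo_compound_assign(void) {
    SV a = sv_int(40), b = sv_int(2);
    sv_add_assign(&a, &b);  /* a is now 42 */
    sv_preinc(&a);          /* a is now 43 */
    return a.ptr_to_vtable->NUMBER(&a);
}
```

Note that every compound operator costs two vtable calls and a temporary in this scheme.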
I'm also skipping the discussion on `how to assign one sv from another'.
Will the assigned sv receive the vtable of the other one? Or not? Will it
depend on the vtables of the svs? Too complex!!! Need help on that.
I also neglected references. I _think_ there should be entries like
    SV *DEREF_SCALAR();
    AV *DEREF_ARRAY();
    HV *DEREF_HASH();
so that one reference can contain both an array and a hash, probably
with magic vtables. This could be very useful in accessing a
database's table row, accessing the fields by number or by name. There
was a recent discussion about this in comp.lang.perl.moderated. It
seems it is possible to 'use overload' and get this result in perl5,
so that both $x->[0] and $x->{name} work.
Well, let's move on to vtable_av:
    vtable_av {
        SV *FETCH(int index);
        void STORE(int index, SV *value);
        SV *DELETE(int index);
        int FETCHSIZE();
        void STORESIZE(int size);
        AV *SLICE(int start, int end);
        AV *SPLICE(int start, int end, AV *replace);
        int EXISTS(int index);
    };
This was taken basically from the perltie manpage. perltie says it's
necessary to implement PUSH, POP, UNSHIFT and SHIFT, but I'm actually
with the `use SPLICE for everything' approach. I accept
counter-arguments. TIEARRAY isn't necessary because its operation
corresponds to setting the vtable pointer of the AV itself. (DESTROY???)
The next one also comes from perltie.
    vtable_hv {
        SV *FETCH(SV *key);
        void STORE(SV *key, SV *value);
        SV *DELETE(SV *key);
        void CLEAR();
        int EXISTS(SV *key);
        ??? FIRSTKEY
        ??? NEXTKEY
    };
Wouldn't FIRSTKEY/NEXTKEY be better handled by external iterators,
with their own vtables??? A la Java Enumeration / STL ???
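To show what an external iterator with its own vtable might look like, here is a small C sketch. Everything in it is an assumption for illustration: the HAS_NEXT/NEXT entry names (borrowed loosely from the Java Enumeration idea), the toy HV that only holds an array of key strings, and the hv_keys helper that plays the role of a hypothetical KEYS entry.

```c
typedef struct iter ITER;

/* Hypothetical iterator vtable: two entries are enough for a key walk. */
typedef struct {
    int         (*HAS_NEXT)(ITER *self);
    const char *(*NEXT)(ITER *self);
} vtable_iter;

struct iter {
    const vtable_iter *ptr_to_vtable;
    void              *ptr_to_data;
};

/* Toy hash: only its keys matter for this example. */
typedef struct {
    const char *const *keys;
    int                count;
} HV;

/* The iteration state lives in the cursor, not in the hash. */
typedef struct {
    const HV *hv;
    int       pos;
} key_cursor;

static int cursor_has_next(ITER *self) {
    key_cursor *c = (key_cursor *)self->ptr_to_data;
    return c->pos < c->hv->count;
}

static const char *cursor_next(ITER *self) {
    key_cursor *c = (key_cursor *)self->ptr_to_data;
    return c->hv->keys[c->pos++];
}

static const vtable_iter cursor_vtable = { cursor_has_next, cursor_next };

/* Stand-in for a KEYS entry: the caller provides the cursor storage. */
static ITER hv_keys(const HV *hv, key_cursor *storage) {
    ITER it = { &cursor_vtable, storage };
    storage->hv  = hv;
    storage->pos = 0;
    return it;
}

/* Nested iteration over the same hash: each loop has its own cursor. */
int demo_count_key_pairs(void) {
    static const char *const names[] = { "id", "name", "email" };
    HV hv = { names, 3 };
    key_cursor outer_state, inner_state;
    int pairs = 0;
    ITER outer = hv_keys(&hv, &outer_state);
    while (outer.ptr_to_vtable->HAS_NEXT(&outer)) {
        ITER inner = hv_keys(&hv, &inner_state);
        (void)outer.ptr_to_vtable->NEXT(&outer);
        while (inner.ptr_to_vtable->HAS_NEXT(&inner)) {
            (void)inner.ptr_to_vtable->NEXT(&inner);
            pairs++;
        }
    }
    return pairs;
}
```

Because the position is kept outside the hash, two walks over the same hash don't disturb each other, which is one argument for this design over FIRSTKEY/NEXTKEY entries.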
Well, the one about filehandles I would not touch right now because I
think it depends very much on the actual implementation of the IO
subsystem. If you have something on it, send it. Basically, it should
have:
    vtable_fh {
        void PRINT(SV *);
        SV *GETLINE();
        AV *GETLINES();
        // ...
    };
I don't think I can get further from here. Note that, in all examples,
I didn't write the `this' pointer that every function would receive.
This would correspond to the `ptr_to_data' from the struct sv.
Some things I couldn't figure out at all:
* References: how to implement them, allow more than one referenced
thingy, blessed objects, filehandles (now they're scalars, right?)
* Objects: RFC 92 proposes a different search for a method depending on
$ISA::Search, or something like that. Could it somehow be handled by
vtables? I guess, but I couldn't figure it out at all.
* What kind of object is created with qr/abc/ ? What's its vtable (if
it has one)?
* Opaque strings: how will we request a different encoding?
Some remarks:
A la tie, we could have some packages Vtable::SV, Vtable::AV, ... that
serve as a base for defining a Vtable in perl. There would then be a
`proxy' implementation in C/C++ that would only translate arguments as
needed to perl objects, call the corresponding perl sub with the same
name, and translate the return value back to a C/C++ value. This would
supply both perl5's `tie' mechanism (mainly for arrays/hashes, but
scalars too) and perl5's `use overload', since defining PLUS,
MINUS, TIMES, ... in vtable_sv is the same as overloading '+', '-',
'*', ... in the object package.
Also, I feel that vtables cut much of the complexity involved in XS.
For me, at least, the scariest thing about XS is the number of sv_xyz
calls that I must make just to access a simple value. Having reliable
vtables and being able to call them directly makes things much easier.
Also, names like STORE, FETCH, ASSIGN, PLUS, CONCAT, ... are much
easier to spot in code than very inconsistent things like sv_setiv,
SvIV, sv_catpvn, SVt_PVAV, and so on (damn XS!!!).
As to the implementation details, I think C is the way to go!
vtable_av would be
    typedef struct {
        SV *(*FETCH)(AV *this, int index);
        void (*STORE)(AV *this, int index, SV *value);
        SV *(*DELETE)(AV *this, int index);
        int (*FETCHSIZE)(AV *this);
        void (*STORESIZE)(AV *this, int size);
        AV *(*SLICE)(AV *this, int start, int end);
        AV *(*SPLICE)(AV *this, int start, int end, AV *replace);
        int (*EXISTS)(AV *this, int index);
    } vtable_av;
but I would really like to see something in C++ that allows accessing a
vtable without having to pass the `this' pointer to it, like
    AV *my_av;
    int i = my_av->FETCHSIZE();
    bool exists = my_av->EXISTS(4);
And maybe (that's hard) the possibility to define a new vtable
implementation by extending the base vtable structure and being able
to program it considering the `this' pointer like the use above.
Perhaps a C proxy like the one described for perl would help to do it
and keep it also accessible via C. Maybe one macro that creates all
the code needed to create a `real' C struct vtable_xx and set all the
hooks to the new class' methods.
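One way the macro idea could look in plain C is sketched below. This is an assumption-laden illustration, not a proposal for the real interface: the macros AV_CALL0, AV_CALL and DEFINE_VTABLE_AV, the cls##_ENTRY naming convention and the `fixed' example class are all invented here, the vtable is cut down to two entries, and the parameter is called self rather than this so the code also compiles as C++.

```c
#include <stddef.h>

typedef struct av AV;

/* Cut-down vtable_av: two entries are enough to show the mechanism. */
typedef struct {
    int (*FETCHSIZE)(AV *self);
    int (*EXISTS)(AV *self, int index);
} vtable_av;

struct av {
    const vtable_av *ptr_to_vtable;
    void            *ptr_to_data;
    void            *gc_data;
};

/* Call an entry without spelling out the self pointer at the call site.
   Note: `av` is evaluated twice, so it must not have side effects. */
#define AV_CALL0(av, op)     ((av)->ptr_to_vtable->op((av)))
#define AV_CALL(av, op, ...) ((av)->ptr_to_vtable->op((av), __VA_ARGS__))

/* Build the `real' C struct from functions named <class>_<ENTRY>. */
#define DEFINE_VTABLE_AV(cls) \
    static const vtable_av cls##_vtable = { cls##_FETCHSIZE, cls##_EXISTS }

/* A new class: a fixed-length array whose only state is its length. */
static int fixed_FETCHSIZE(AV *self) {
    return *(int *)self->ptr_to_data;
}
static int fixed_EXISTS(AV *self, int index) {
    return index >= 0 && index < fixed_FETCHSIZE(self);
}
DEFINE_VTABLE_AV(fixed);

int demo_macro_dispatch(void) {
    int length = 5;
    AV av = { &fixed_vtable, &length, NULL };
    AV *my_av = &av;
    int size = AV_CALL0(my_av, FETCHSIZE);   /* like my_av->FETCHSIZE() */
    int has4 = AV_CALL(my_av, EXISTS, 4);    /* like my_av->EXISTS(4)   */
    int has5 = AV_CALL(my_av, EXISTS, 5);
    return size * 100 + has4 * 10 + has5;
}
```

The call-site macros get close to the C++-style syntax while the struct stays a plain C vtable_av, so the same vtable would remain usable from both languages.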
Well, that's about all I've thought about it. But I want to hear your
comments about it... please?
- Branden
References:
* RFC 159: although it deals with the subject at the Perl level, this
proposal is largely based on it.
* RFC 14: Filehandles (also RFC 47 about universal async io)
* RFC 43: integrate BigInts with basic scalars
* RFC 35: A proposed internal base format for perl variables.