Hi,
Sorry for the delay in getting to this - been working on-site with $JOB for
a while. Comments and questions below, but please see r15001.
Leopold Toetsch wrote:
2) How should we handle changes to the core Parrot library (mostly PMCs,
but also consider anything we promise is available)? Should this bump
the packfile version number too? Or do we want some other mechanism to
handle this?
This is still a can of worms. Not so much changes to PMC type numberings per se (which should invalidate PBCs)
Yup, after further mulling I think changes to these and
non-backward-compatible interface changes to the built-in PMCs should
cause an entry in PBC_COMPAT and invalidate said resources. Now in the spec.
but the dynamic nature of these resources.
I'll try to dump my thoughts.
A PBC refers - via its contents - to several possibly dynamically extendable
resources. A probably incomplete list is:
1) PMCs [*1]
2) charsets
3) encodings
4) HLLs
5) opcodes
(see also src/pmc/parrotinterpreter.pmc:547 ff) [*2]
Whenever such items are referred to by a numeric index and that index is part of
the PBC, we have a possible problem.
Let's look at opcodes. These are present in the PBC as index (the opcode
number). We got a packfile with some dynamic opcode inside:
opcodes
[ 10, 20, 30, 1300, 1301, 0 ]
Let's say, opcode #1300 and #1301 are from some dynamic opcode lib. Now this PF
gets loaded into an interpreter, which already has dynamic opcode librar{y,ies}
loaded. In the best case, it was the same opcode library and the opcode numbers
just happen to match. But that's pure luck.
The same argument holds for all other above resources.
I have added a dependencies segment that can be used to list all of the
dynamically loaded resources that a bytecode file uses. These can then
be located and loaded and any collisions detected (and once implemented,
resolved) at load-time.
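To make the load-time check a bit more concrete, here is a rough sketch in Python (chosen for brevity; the real thing would be C). Everything in it is hypothetical: the `Dependency` record, its field names and `find_collisions` are made up for illustration and are not what the spec or r15001 actually contain. It assumes each dependency entry carries a kind, a name, the library to load it from (or nothing for a core resource) and an inclusive range of indices.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Dependency:
    """One hypothetical entry of a dependencies segment."""
    kind: str               # e.g. "pmc", "charset", "encoding", "hll", "oplib"
    name: str               # name of the resource
    library: Optional[str]  # how to load it; None means a core resource
    first: int              # first index used (inclusive)
    last: int               # last index used (inclusive)


def find_collisions(loaded: List[Dependency],
                    wanted: List[Dependency]) -> List[Tuple[Dependency, Dependency]]:
    """Return (wanted, loaded) pairs that cannot both be honoured as-is.

    A pair is a problem when it is the same kind of resource and either
    the same resource sits at a different index range, or a different
    resource overlaps the wanted index range.
    """
    problems = []
    for w in wanted:
        for l in loaded:
            if w.kind != l.kind:
                continue
            same_resource = (w.name, w.library) == (l.name, l.library)
            same_range = (w.first, w.last) == (l.first, l.last)
            overlaps = w.first <= l.last and l.first <= w.last
            if same_resource and same_range:
                continue  # identical entry, indices already agree
            if same_resource or overlaps:
                problems.append((w, l))
    return problems


# The interpreter already has some other oplib at 1300..1301,
# and the packfile expects its own oplib at the same numbers.
in_memory = [Dependency("oplib", "other_ops", "other_ops", 1300, 1301)]
from_file = [Dependency("oplib", "my_ops", "my_ops", 1300, 1301)]
collisions = find_collisions(in_memory, from_file)
```

In this sketch an empty result means the file's indices can be used unchanged; anything else would have to be reported or, once implemented, resolved.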
BTW encodings seem to be missing in the pdd - and we can't do:
"Character set, copied from the string structure."
because this is a pointer. We need an index into the available
charsets/encodings.
Fixed this bit, thanks.
So what I think, we have to do, is:
- store a metatable of such resources, this is basically for:
2-4) a list of names / library PMCs, which describes how to load
the resource
(or NULL, if this resource is a core resource)
1,5) same + range of indices
Will a dynamic character set or encoding library that we load not
possibly contain more than one character set or encoding and therefore
need a range of indices too? I have gone with this for now.
Please can you also expand a little on what a HLL resource is? I thought
this was just a dynamic PMC library but where some of those PMCs get
used in place of some built-ins, such as Integer using Perl6Integer
instead or something like this?
- when now a PBC is loaded, we'd have to merge this information with already in-memory structures of the interpreter. We can at least detect, if there's a collision.
We're not doing this at the moment?!
Still better would of course be to relocate the index and use this
mapping during unpacking. Unfortunately we can't do the relocation of opcodes for mmap-ed bytecode in memory.
Sure; we'll probably be able to teach pbc_merge to resolve such
collisions though, so people can merge stuff together and have them
resolved once rather than having to make an unmapped copy each runtime.
Maybe we can find some scheme to make collisions less likely too (we've
got 32 bits to play with, after all).
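For what it's worth, the relocation step itself might look something like the sketch below (Python again, purely illustrative). The function names and the two tables are assumptions of mine, not anything in Parrot. It also pretends the stream holds only opcode numbers, as in the example list above; a real relocator would have to walk the code op by op so that operands are not mistaken for opcodes. Note that it builds a new list, which is exactly the unmapped copy that mmap-ed bytecode would force on us at each run.

```python
def build_relocation(file_ranges, interp_starts):
    """Map opcode numbers as stored in the file to the interpreter's numbers.

    file_ranges:   {oplib name: (first number in the file, number of ops)}
    interp_starts: {oplib name: first number assigned by this interpreter}
    """
    mapping = {}
    for name, (file_first, count) in file_ranges.items():
        interp_first = interp_starts[name]
        for offset in range(count):
            mapping[file_first + offset] = interp_first + offset
    return mapping


def relocate(opcodes, mapping):
    """Return a relocated copy; numbers not in the mapping (core ops) are kept."""
    return [mapping.get(op, op) for op in opcodes]


# The packfile was compiled with my_ops at 1300..1301, but this
# interpreter had already given those numbers away and loaded
# my_ops at 1302 instead.
mapping = build_relocation({"my_ops": (1300, 2)}, {"my_ops": 1302})
relocated = relocate([10, 20, 30, 1300, 1301, 0], mapping)
print(relocated)  # [10, 20, 30, 1302, 1303, 0]
```

Something like pbc_merge could apply the same mapping once and write the result back out, so the copy is not needed on every load.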
[*1] theoretically PMCs shouldn't be a problem, as these are usually looked up
dynamically, but it depends of course on the usage of dynamic oplibs :-(
.loadlib "mypmc"
...
new P0, .MyPMC # new_p_ic .MyPMC is referred to by index
new P0, 'MyPMC' # referenced by name
For the index case, we'd again have the described problem.
(The .Type syntax is always fine for core PMCs, which don't change for the
validity range of the packfile).
Yup - unless we only allow .Type for built-ins of course.
Thanks,
Jonathan