Bytecode format redesign

Melvin Smith Fri, 03 May 2002 22:35:45 -0700

Reposted to the list so people can comment.

As per the IRC discussion with Dan.....


I've made some progress; not all there, but getting there.
I have the loader handling arbitrary byte ordering, and now I'm
working on wordsize transforms.

The good thing here is I'm documenting the code as I go, so
hopefully afterward, everyone will easily be able to understand
the bytecode format.

I decided to store the byteorder as an 8-byte matrix separate
from the PARROT_MAGIC value. The reason is, it's way easier
to store the byteorder as a 0-based transform matrix and just
read it in, verify the elements, and use it, rather than write ugly
code to detect the byteorder based on the MAGIC. Also, MAGIC
is 32 bits, and I don't want to make a design decision that says
we predict 64-bit byteorder based on a 32-bit value (the 64-bit
platforms I know are either big or little endian, but there is no
reason to limit it).

So what I do is read in the byteorder matrix, transform it with
the native matrix, and use that to transform bytecodes
in the rest of the file. The routine works 80% as fast as hardcoded
routines with #ifdefs; however, the advantage is we can support
any byteorder/wordsize that an assembler can write.
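
A minimal sketch of the matrix transform described above, not the actual
loader code. It assumes that byteorder[i] holds the 0-based significance
rank (0 = least significant) of the byte stored at offset i, and that file
and native words are both 8 bytes; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WORDSIZE 8  /* assumed: file and native words are both 8 bytes */

/* Assumed convention: order[i] is the 0-based significance rank
 * (0 = least significant) of the byte stored at offset i. */

/* Return 1 if order[] is a permutation of 0..WORDSIZE-1, else 0. */
static int byteorder_is_valid(const unsigned char order[WORDSIZE])
{
    unsigned seen = 0;
    int i;

    for (i = 0; i < WORDSIZE; i++) {
        if (order[i] >= WORDSIZE || (seen & (1u << order[i])))
            return 0;
        seen |= 1u << order[i];
    }
    return 1;
}

/* Fill order[] with the host layout: store a probe word whose byte of
 * rank k holds the value k, then read it back byte by byte. */
static void byteorder_native(unsigned char order[WORDSIZE])
{
    uint64_t probe = 0;
    int k;

    for (k = 0; k < WORDSIZE; k++)
        probe |= (uint64_t)k << (8 * k);
    memcpy(order, &probe, WORDSIZE);
}

/* Compose the file order with the native order. On success map[i] is
 * the native offset that the file byte at offset i is copied to.
 * Returns 0 if the file order fails verification. */
static int byteorder_compose(const unsigned char file[WORDSIZE],
                             unsigned char map[WORDSIZE])
{
    unsigned char native[WORDSIZE];
    int i, j;

    if (!byteorder_is_valid(file))
        return 0;
    byteorder_native(native);
    for (i = 0; i < WORDSIZE; i++)
        for (j = 0; j < WORDSIZE; j++)
            if (native[j] == file[i])
                map[i] = (unsigned char)j;
    return 1;
}

/* Convert one word from file layout to a native value using map[]. */
static uint64_t word_to_native(const unsigned char *src,
                               const unsigned char map[WORDSIZE])
{
    unsigned char dst[WORDSIZE];
    uint64_t value;
    int i;

    for (i = 0; i < WORDSIZE; i++)
        dst[map[i]] = src[i];
    memcpy(&value, dst, WORDSIZE);
    return value;
}
```

The composed map is built once per file and then applied to every word, so
nothing in the per-word loop depends on compile-time #ifdefs.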

Here is the format so far. I'm going to work on the segment
headers and symbol table next.


struct PackFile_Header {
     char wordsize;
     char major;
     char minor;
     char flags;
     char pad[4];
     char byteorder[8];
     /* Start words/opcodes on 8-byte boundary */
     opcode_t magic;
     opcode_t opcodetype;
     opcode_t fixup_ss;
     opcode_t const_ss;
     opcode_t bytecode_ss;
};
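
To illustrate the 8-byte boundary comment, here is a small sketch that
checks where the word-sized fields begin. The typedef of opcode_t as a
32-bit integer is an assumption made only for this example, and
header_words_offset is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t opcode_t;  /* assumption: 32-bit opcodes for this example */

struct PackFile_Header {
     char wordsize;
     char major;
     char minor;
     char flags;
     char pad[4];
     char byteorder[8];
     /* Start words/opcodes on 8-byte boundary */
     opcode_t magic;
     opcode_t opcodetype;
     opcode_t fixup_ss;
     opcode_t const_ss;
     opcode_t bytecode_ss;
};

/* Offset of the first word-sized field: 4 single chars + 4 bytes of
 * padding + 8 bytes of byteorder come before it. */
static size_t header_words_offset(void)
{
    return offsetof(struct PackFile_Header, magic);
}
```

With this layout the byte-sized fields can be read before any byteorder
transform is known, since single bytes need no reordering.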

After the byteorder transform is set up, I read in the MAGIC, which is
stored in the originator's byteorder, transform it, then check it.
This gives a way to both verify the file as a .pbc and verify the byteorder.
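
A sketch of that check, under stated assumptions: byteorder[i] is taken to
be the 0-based significance rank of the byte at offset i, the word is
decoded directly by rank rather than through a composed matrix, and
EXAMPLE_MAGIC is a placeholder standing in for the real PARROT_MAGIC.
The names check_magic and magic_status are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder value; substitute the real PARROT_MAGIC. */
#define EXAMPLE_MAGIC 0x013155a1UL

enum magic_status { MAGIC_OK, MAGIC_BAD_ORDER, MAGIC_MISMATCH };

/* Decode the raw magic bytes using the file's byteorder and compare.
 * A wrong byteorder scrambles the value, so a match confirms both the
 * file type and the byteorder in one step. */
static enum magic_status check_magic(const unsigned char *byteorder,
                                     size_t wordsize,
                                     const unsigned char *raw)
{
    uint64_t value = 0;
    unsigned seen = 0;
    size_t i;

    if (wordsize == 0 || wordsize > 8)
        return MAGIC_BAD_ORDER;
    for (i = 0; i < wordsize; i++) {
        unsigned rank = byteorder[i];

        /* Each rank must be in range and appear exactly once. */
        if (rank >= wordsize || (seen & (1u << rank)))
            return MAGIC_BAD_ORDER;
        seen |= 1u << rank;
        value |= (uint64_t)raw[i] << (8 * rank);
    }
    return value == EXAMPLE_MAGIC ? MAGIC_OK : MAGIC_MISMATCH;
}
```

A caller would pass the byteorder and wordsize fields from the header
together with the raw bytes of the magic field, and reject the file on
anything other than MAGIC_OK.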

I'm testing it with a util that prepends the new header onto an old .pbc;
which assembler should I patch when the format is finalized? Old, new, both?

-Melvin

PS:
Dan brought up an interesting idea of the opcode_type, where we store
the flavor of the opcode, whether it is Perl, or transformed opcodes from
another VM (Java, Python, .NET). Might make for some interesting discussion,
especially given Leon Brocard's JVM experimentation.
