Re: Bytecode metadata

gregor Tue, 04 Feb 2003 05:45:32 -0800

b. --

I agree that under normal circumstances the bytecode is primary.
I was observing that as more and more metadata is considered,
eventually its quantity (measured, say, in bytes) could approach
or even exceed that of the raw bytecode. In cases where one
would feel such a quantity of metadata is needed, it may not
always be necessary to get greased-weasel speed-of-loading
(but, see below).

I understand the mmap-and-go idea, although it doesn't
always work out even when mmap is available (for example,
prederef requires a side pointer-array to store its prederef
results). Sometimes it's mmap-mumble-go (but, see below).
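
To make "mmap-mumble-go" concrete, here is a minimal sketch in Python (chosen for brevity, not because Parrot uses it). The file layout is an assumption made purely for illustration: a flat run of 32-bit little-endian words, which is not the real PBC format. The names load_bytecode, make_side_array and fetch are hypothetical. The mapped bytes stay read-only, and the "mumble" step allocates a separate writable array with one slot per word, loosely in the spirit of the prederef side array.

```python
import mmap
import os
import struct
import tempfile

WORD = struct.Struct("<i")  # assumption: 32-bit little-endian words


def write_demo_file(words):
    """Write a throwaway file of packed words and return its path."""
    fd, path = tempfile.mkstemp(suffix=".pbc")
    with os.fdopen(fd, "wb") as f:
        f.write(b"".join(WORD.pack(w) for w in words))
    return path


def load_bytecode(path):
    """'mmap': map the file read-only; nothing is parsed or copied here."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def make_side_array(code):
    """'mumble': one empty, writable slot per word of mapped bytecode."""
    return [None] * (len(code) // WORD.size)


def fetch(code, side, index):
    """'go': decode a word on first use and cache it in the side array."""
    if side[index] is None:
        (side[index],) = WORD.unpack_from(code, index * WORD.size)
    return side[index]
```

The cost of the extra step is proportional to the size of the code (one slot per word), but the per-word work is still deferred until that word is actually reached.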

Certainly, there is nothing to prevent one from having
the linearized bytecode pregenerated in the PBC file even
when a metadata tree is also present (the tree could reference
contiguous chunks of that bytecode by offset-size pairs). If
you don't care about the tree, you don't process it. If you do
process it, you probably produce an index data structure mapping
bytecode offsets to tree nodes for the debugger. I believe
we retain high speed with this approach.
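
The index structure mentioned above could look roughly like the following Python sketch. It assumes, for simplicity, that only the leaves of the metadata tree own bytecode chunks and that those chunks do not overlap; Node, build_index and node_at are hypothetical names, and the label field stands in for whatever file-and-line annotation the tree really carries.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                # illustrative annotation, e.g. "foo.pl:12"
    offset: int               # start of this node's chunk in the bytecode
    size: int                 # length of that chunk
    children: list = field(default_factory=list)


def build_index(root):
    """Collect the leaves of the tree, sorted by chunk offset."""
    leaves = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            stack.extend(node.children)
        else:
            leaves.append(node)
    leaves.sort(key=lambda n: n.offset)
    starts = [n.offset for n in leaves]
    return starts, leaves


def node_at(index, pc):
    """Return the leaf whose chunk contains bytecode offset pc, or None."""
    starts, leaves = index
    i = bisect.bisect_right(starts, pc) - 1
    if i >= 0 and pc < leaves[i].offset + leaves[i].size:
        return leaves[i]
    return None
```

A debugger would build the index once, only if it wants the tree at all, and then each lookup is a binary search. A plain run of the bytecode never touches any of this.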

We do need to consider how the metadata makes it from the
compiler *through* IMCC to land in the PBC file. The compiler
needs to be able to produce annotated input to IMCC, and IMCC
needs to be able to retain the annotations while it makes its
code improvements and rendering (register mapping, etc.).
I'm thinking that, too, could possibly be a tree. IMCC can pick out
the chunks of IMC, generate bytecode, and further annotate the
tree with the offset and size of the generated PBC chunk. The
tree can be retained as the metadata segment in the PBC file.
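
As a rough illustration of that last step, here is a minimal Python sketch of a code generator that records, on each tree node, where that node's code landed in the output. It is not IMCC's actual design: Chunk and emit are hypothetical names, the ops list is a stand-in for real IMC instructions (plain byte values here), and the register mapping and other improvements are reduced to a comment.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    source: str               # annotation from the compiler, e.g. "foo.pl:3"
    ops: list                 # stand-in for this node's own instructions (ints 0..255)
    children: list = field(default_factory=list)
    offset: int = -1          # filled in by emit()
    size: int = 0             # filled in by emit(); includes the children's code


def emit(node, out):
    """Append the code for node and its subtree to out, annotating as we go."""
    node.offset = len(out)
    # A real backend would do register mapping and other rewrites here.
    out.extend(bytes(node.ops))
    for child in node.children:
        emit(child, out)
    node.size = len(out) - node.offset
    return out
```

After emit(root, bytearray()) returns, the bytearray holds the linearized code and every node carries its offset-size pair, so the same tree could be written out unchanged as the metadata segment.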

Regards,

-- Gregor

Juergen Boemmels <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
02/04/2003 08:15 AM

        To:     [EMAIL PROTECTED]
        cc:     Perl6 Internals <[EMAIL PROTECTED]>
        Subject:        Re: Bytecode metadata

[EMAIL PROTECTED] writes:

> Mike --
> 
> That's a lot of metadata. Sounds like maybe the metadata is primary
> and the bytecode is secondary, in which case perhaps what you
> really want is a (metadata) tree decorated with bytecode rather than
> a (bytecode) array decorated with metadata.

The bytecode is primary. This is what gets executed, and this is what
needs to be fast (both in startup time and at runtime). Some kinds of
data are necessary for the bytecode, such as the string
constants. These also need to be accessed fast (I don't know if this is
called metadata; it is more like data). The metadata is only needed in
rare cases, e.g. debugging, so it doesn't need to be as fast (but even
here speed is nice).

> Of course, the most natural candidate for the metadata would be the
> annotated (file & line, etc.) parse tree, or some approximation to it
> after compilation-related transformations.
>
> I can imagine a process that loads the tree, and linearizes the
> bytecode with the metadata consisting of backpointers to nodes of
> the tree, either in band as escaped noop-equivalent bytecode or
> out of band in an offset-pointer table.

Bytecode reading must be fast. Ideally it is mmap and start.
Tree walking for bytecode generation should be done by the compiler.

> With a suitable amount of forethought on the tree representation,
> you should be able to have good flexibility while still having enough
> standardization on how tree-emitting compilers represent typical
> debug-related metadata (file, line, etc.) that debuggers and other
> tools could be generic. 

The tree metadata can certainly be some kind of intermediate output of the
compiler (the output of the compiler front end), but normally this
should be fed into a back end which generates fast-running bytecode or
even native code.

bye
b.
-- 
Juergen Boemmels [EMAIL PROTECTED]
Fachbereich Physik                    Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern           Fax: ++49-(0)631-205-3906
PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F  23 F6 C7 2F 85 93 DD 47
