b. --

I agree that under normal circumstances the bytecode is primary. I was observing that as more and more metadata is considered, eventually its quantity (measured, say, in bytes) could approach or even exceed that of the raw bytecode. In cases where one would feel such a quantity of metadata is needed, it may not always be necessary to get greased-weasel speed-of-loading (but, see below).
I understand the mmap-and-go idea, although it doesn't always work out even when mmap is available (for example, prederef requires a side pointer-array to store its prederef results). Sometimes it's mmap-mumble-go (but, see below).

Certainly, there is nothing to prevent one from having the linearized bytecode pregenerated in the PBC file even when a metadata tree is also present (the tree could reference contiguous chunks of that bytecode by offset-size pairs). If you don't care about the tree, you don't process it. If you do process it, you probably produce an index data structure mapping bytecode offsets to tree nodes for the debugger. I believe we retain high speed with this approach.

We do need to consider how the metadata makes it from the compiler *through* IMCC to land in the PBC file. The compiler needs to be able to produce annotated input to IMCC, and IMCC needs to be able to retain the annotations while it makes its code improvements and rendering (register mapping, etc.). I'm thinking that, too, could possibly be a tree. IMCC can pick out the chunks of IMC, generate bytecode, and further annotate the tree with the offset and size of the generated PBC chunk. The tree can be retained as the metadata segment in the PBC file.

Regards,

-- Gregor


Juergen Boemmels <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
02/04/2003 08:15 AM

To: [EMAIL PROTECTED]
cc: Perl6 Internals <[EMAIL PROTECTED]>
Subject: Re: Bytecode metadata

[EMAIL PROTECTED] writes:

> Mike --
>
> That's a lot of metadata. Sounds like maybe the metadata is primary
> and the bytecode is secondary, in which case perhaps what you
> really want is a (metadata) tree decorated with bytecode rather than
> a (bytecode) array decorated with metadata.

The bytecode is primary. This is what gets executed, and this is what needs to be fast (both in startup time and runtime). Some kind of data is necessary for the bytecode, such as the string constants. These also need to be accessed fast (I don't know if this is called metadata; it is more like data). The metadata is only needed in rare cases, e.g. debugging, so it doesn't need to be as fast (but even here speed is nice).

> Of course, the most natural candidate for the metadata would be the
> annotated (file & line, etc.) parse tree, or some approximation to it
> after compilation-related transformations.
>
> I can imagine a process that loads the tree, and linearizes the
> bytecode with the metadata consisting of backpointers to nodes of
> the tree, either in band as escaped noop-equivalent bytecode or
> out of band in an offset-pointer table.

Bytecode reading must be fast. Ideally it is mmap and start. Tree walking for bytecode generation should be done by the compiler.

> With a suitable amount of forethought on the tree representation,
> you should be able to have good flexibility while still having enough
> standardization on how tree-emitting compilers represent typical
> debug-related metadata (file, line, etc.) that debuggers and other
> tools could be generic.

The tree metadata can certainly be some kind of intermediate output of the compiler (the output of the compiler front end), but normally this should be fed into a backend which generates fast-running bytecode or even native code.

bye
b.
--
Juergen Boemmels                 [EMAIL PROTECTED]
Fachbereich Physik               Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern      Fax: ++49-(0)631-205-3906
PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F 23 F6 C7 2F 85 93 DD 47
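
For illustration only, here is a rough Python sketch of the idea in Gregor's reply: a metadata tree whose nodes reference contiguous bytecode chunks by offset-size pairs, plus an index mapping bytecode offsets back to tree nodes for a debugger. Everything here is an assumption made for the sketch (the names MetaNode, build_index and lookup, the fields, and the rule that only non-overlapping chunks are recorded); it is not Parrot's or IMCC's actual PBC layout or API.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MetaNode:
    """One node of a hypothetical metadata tree (not the real PBC format)."""
    label: str
    file: str
    line: int
    # (offset, size) of the bytecode chunk generated for this node, if any.
    # Assumption: chunks recorded in the tree do not overlap each other.
    chunk: Optional[Tuple[int, int]] = None
    children: List["MetaNode"] = field(default_factory=list)


def build_index(root: MetaNode):
    """Walk the tree once and return (starts, entries) sorted by offset."""
    entries = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.chunk is not None:
            offset, size = node.chunk
            entries.append((offset, size, node))
        stack.extend(node.children)
    entries.sort(key=lambda e: e[0])
    starts = [e[0] for e in entries]
    return starts, entries


def lookup(index, pc: int) -> Optional[MetaNode]:
    """Map a bytecode offset to the tree node whose chunk contains it."""
    starts, entries = index
    i = bisect_right(starts, pc) - 1  # last chunk starting at or before pc
    if i < 0:
        return None
    offset, size, node = entries[i]
    return node if offset <= pc < offset + size else None


# Hypothetical tree: the file name, lines, offsets and sizes are made up.
root = MetaNode("sub main", "hello.pl", 1, children=[
    MetaNode("print", "hello.pl", 2, chunk=(0, 12)),
    MetaNode("end", "hello.pl", 3, chunk=(12, 4)),
])

index = build_index(root)
node = lookup(index, 5)
print(node.file, node.line)
```

A loader that does not care about the tree never calls build_index, so the plain mmap-and-run path pays nothing for it; only a debugger would build the index, once, and then answer each lookup with a binary search.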