From TODO:
  Metadata (source line number info, symbol table)

Currently the line number information in parrot is handled via special opcodes, namely setline/getline and setfile/getfile. This is a good solution when you write an interpreter in parrot and the line number information is only known at runtime. But this approach is very inefficient if you have a tight loop like this:

  $i = 0;
  while ($i < 1000) {
    $i++;
  }

With line number information enabled this would translate to something like this:

        setline 1
        set I0, 0
  LOOP: setline 2
        ge I0, 1000, DONE
        setline 3
        add I0, 1
        branch LOOP
  DONE: setline 5

This is inefficient, because there are two setlines inside the loop, and both are executed on every iteration.

A possible solution to this problem is doing it the same way a C compiler does: add an extra structure to the executable which can translate the current program counter to the source line. The advantage of this approach is that the line number is only decoded when it is needed, and only the application which uses the line number information pays a runtime cost; the disadvantage is that the line number information must be known at compile time (which I think is the common case).

This can be implemented in 2 ways:
- Create our own debugging format
- Use an already existing one

The first way might be more fun, but I think the second one would be better. IMHO we should use DWARF-2. The Mono Project does something similar.

To get this working 3 things must happen.

1.) Extending the packfile format to contain a section with debugging information.

Changing the packfile is not an easy task, because many parts of parrot depend on it. The ones I remember are packfile.c, assemble.pl and somewhere in imcc.

In principle the packfile is extensible in a backward-compatible way. At the moment there are (according to parrotbyte.pod) 3 segments (FIXUP, CONSTANT, BYTECODE) in exactly that order. This can easily be extended by just adding a 4th one, DEBUG_LINE (or .debug_line or .stabs). But doing some more extensions (e.g. call frames, language-dependent sections) by allocating numbers in a linear chain will be painful.

Another extension scheme would be to make the 4th section a directory section, in which all packfile extension sections can be looked up by name. This is still a backward-compatible solution. But why use the 4th section as the directory section? Naturally it would be the first one.
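
To make the two ideas above more concrete (a line table that is only consulted on demand, and a directory in which sections are found by name), here is a minimal Python sketch. It is not Parrot's packfile format and not DWARF-2: the names DIRECTORY, LINE_TABLE, find_segment and line_for_pc are hypothetical, and the bytecode offsets are made up for the loop example.

```python
import bisect

# Hypothetical line table for the loop example: (bytecode offset, source line).
# The offsets are invented for illustration; there is one entry per point
# where the source line changes, sorted by offset.
LINE_TABLE = [(0, 1), (3, 2), (7, 3), (12, 5)]

# Hypothetical directory segment: maps a segment name to its contents.
# The CONSTANT and BYTECODE payloads are left empty in this sketch.
DIRECTORY = {
    "CONSTANT": [],
    "BYTECODE": [],
    "DEBUG_LINE": LINE_TABLE,
}


def find_segment(directory, name):
    """Look a segment up by name; a missing segment yields None."""
    return directory.get(name)


def line_for_pc(directory, pc):
    """Translate a program counter to a source line, or None if unknown."""
    table = find_segment(directory, "DEBUG_LINE")
    if not table:
        return None
    offsets = [offset for offset, _ in table]
    # Index of the last entry whose offset is <= pc.
    index = bisect.bisect_right(offsets, pc) - 1
    if index < 0:
        return None
    return table[index][1]


if __name__ == "__main__":
    for pc in (0, 5, 10, 12):
        print(pc, line_for_pc(DIRECTORY, pc))
```

In this sketch the table is only touched when somebody asks for a line, so the loop itself would run without any setline ops, and a packfile without a DEBUG_LINE entry simply yields no line information.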
Since FIXUP is not used at the moment, making the directory the first section is not such a drastic change as it first sounds.

2.) The assembler must emit the debugging information.

Emitting the debugging information from pure assembly code is not really complicated, because the address and line number are always increasing, the address increment is defined only by the current line, and the basic blocks can be easily analyzed. But there must also be a way for higher-level languages to assign line numbers. Maybe C-like
  #line 1 "foo.c"
directives are a solution, or we create dedicated assembler macros
  .line
  .file
  (maybe) .column

3.) The debugger must read this information.

I have some ugly little code lying around that reads the line number information out of an ELF binary. I can fix this up and integrate it, but not by doing the last step first.

Bonus point.) Teach the JIT engine to translate the line number information, so that you can debug a JITed program with gdb.

Comments?
b.
-- 
Juergen Boemmels                        [EMAIL PROTECTED]
Fachbereich Physik                      Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern             Fax: ++49-(0)631-205-3906
PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F  23 F6 C7 2F 85 93 DD 47