[RFC] Debug Segment, HLL Debug Segment And Source Segment

Jonathan Worthington Tue, 20 Sep 2005 15:53:10 -0700

Hi,

The current format of the debug segment in Parrot packfiles (.pbc files), as documented in doc/parrotbyte.pod, only allows for a single source file to be named. This became insufficient some time ago since we had .include directives; it also means that there's nothing sensible that pbc_merge can do with the debug segments it finds in input files.


WHAT WE HAVE NOW

Currently, we store two things:

1) The filename of a single source file, as an additional field in the header
2) The line number in the source file for each bytecode instruction, as the segment's opcode stream


WHAT SOURCE?

The debug segment as we currently have it relates to PIR and PASM source files, not to high level language source files. Currently PIR parses a directive that looks like this:

   #line 'filename'

This is for compilers to supply the line numbers and file names of HLL source files. Currently, nothing is done with these directives after they are parsed, but the data they provide should go into a separate HLL debug segment.

As the needs of the PASM/PIR debug segments and the HLL debug segments would seem to be the same, this proposal will detail a single format that should work for both of them. If it is determined that the HLL debug segment needs something more sophisticated, this proposal still stands for the PASM/PIR debug segment.


SOURCE SEGMENTS

The source segment is currently mentioned in parrotbyte.pod; the idea would seem to be that this segment can contain source code. I suspect the intention was to store the source code of high level languages rather than PASM or PIR. I think the doc is correct in stating that this segment is currently unused. However, it will likely be used in the future, so it makes sense to consider its future existence now while re-designing the debug segment(s).


FORMAT PROPOSAL

The aims of the new format, intended for both the PASM/PIR debug segment and the HLL debug segment, are:

1) Supporting multiple input files
2) Allowing for a reference into the source segment in place of a filename.
3) Still being space-efficient on disk

The opcode stream will contain one line number per bytecode instruction. No information as to what file that line is in will be stored in this stream. (This is pretty much the same as what we have now.)

The header (after the standard stuff that every header has) will start with a count of the number of source file to bytecode position mappings that are in the header.


 0 (relative)
 +----------+----------+----------+----------+
 | number of source => bytecode mappings     |
 +----------+----------+----------+----------+

A source to bytecode position mapping simply states that the bytecode that starts from the specified offset up until the offset in the next mapping, or if there is none, up until the end of the bytecode, has its source in location X.
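
To make the "from this offset until the next mapping" rule concrete, here is a minimal Python sketch of how a debugger might find the source location for a given bytecode offset. The function name find_source, the (offset, source) tuple layout and the file names are made up for illustration; nothing here is existing Parrot code.

```python
import bisect

def find_source(mappings, bytecode_offset):
    """Return the source entry covering bytecode_offset, or None.

    `mappings` is a list of (start_offset, source) tuples sorted by
    start_offset. This in-memory layout is an assumption for the sketch.
    """
    starts = [start for start, _ in mappings]
    # bisect_right finds the first mapping that starts *after* the offset;
    # the mapping just before it is the one that covers the offset.
    i = bisect.bisect_right(starts, bytecode_offset) - 1
    if i < 0:
        return None  # offset lies before the first mapping
    return mappings[i][1]

# Hypothetical example: main.pir includes lib.pir part way through.
mappings = [(0, "main.pir"), (120, "lib.pir"), (300, "main.pir")]
```

The last mapping has no successor, so it covers everything up to the end of the bytecode, which is why no explicit end offset needs to be stored.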

A mapping always starts with the offset in the bytecode, followed by the type of the mapping.


 0 (relative)
 +----------+----------+----------+----------+
 |              bytecode offset              |
 +----------+----------+----------+----------+

 4
 +----------+----------+----------+----------+
 |               mapping type                |
 +----------+----------+----------+----------+

There are 3 mapping types.

Type 0 means there is no source available for the bytecode starting at the given offset. No further data is stored with this type of mapping; the next mapping continues immediately after it.

Type 1 means the source is available in a file. A NULL terminated string containing the filename follows.

Type 2 means the source is available in a source segment. Another integer follows, which will specify which source file in the source segment to use.
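
As a rough illustration of the header layout described above, the following Python sketch packs and unpacks the mapping count and the three mapping types. It makes several assumptions that the proposal does not fix: integers are 4-byte little-endian (the diagrams only suggest a 4-byte width), filenames are UTF-8 with a single zero byte and no padding, and the names pack_mappings and unpack_mappings are invented for this example.

```python
import struct

NO_SOURCE, SOURCE_FILE, SOURCE_SEGMENT = 0, 1, 2
WORD = "<I"  # assumption: 4-byte little-endian unsigned integer

def pack_mappings(mappings):
    """Serialise a list of (offset, type, payload) tuples."""
    out = bytearray(struct.pack(WORD, len(mappings)))  # mapping count
    for offset, mtype, payload in mappings:
        out += struct.pack(WORD, offset)
        out += struct.pack(WORD, mtype)
        if mtype == SOURCE_FILE:
            out += payload.encode("utf-8") + b"\0"  # terminated filename
        elif mtype == SOURCE_SEGMENT:
            out += struct.pack(WORD, payload)  # index into source segment
        elif mtype != NO_SOURCE:
            raise ValueError("unknown mapping type %r" % (mtype,))
    return bytes(out)

def unpack_mappings(data):
    """Inverse of pack_mappings."""
    (count,) = struct.unpack_from(WORD, data, 0)
    pos = 4
    mappings = []
    for _ in range(count):
        offset, mtype = struct.unpack_from("<II", data, pos)
        pos += 8
        if mtype == NO_SOURCE:
            payload = None  # no further data for this type
        elif mtype == SOURCE_FILE:
            end = data.index(b"\0", pos)
            payload = data[pos:end].decode("utf-8")
            pos = end + 1
        elif mtype == SOURCE_SEGMENT:
            (payload,) = struct.unpack_from(WORD, data, pos)
            pos += 4
        else:
            raise ValueError("unknown mapping type %r" % (mtype,))
        mappings.append((offset, mtype, payload))
    return mappings

# Hypothetical header: a file, a gap with no source, a source segment entry.
example = [(0, SOURCE_FILE, "foo.pir"), (40, NO_SOURCE, None),
           (64, SOURCE_SEGMENT, 2)]
blob = pack_mappings(example)
```

Because type 1 entries are variable length, a reader has to walk the mappings in order; a real implementation might pad strings to an integer boundary, which this sketch does not do.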

Note that the offsets into the bytecode must appear in ascending order; a mapping for offset 100 cannot follow a mapping for offset 200, for example.
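
The ordering rule also suggests one way pbc_merge could combine debug segments: shift each input's mapping offsets by the amount of bytecode that precedes it in the merged file. The Python sketch below is only a guess at how that might look, not a description of pbc_merge itself. It assumes the inputs are concatenated in order, that each input's bytecode length is known in the same units as the mapping offsets, and that offsets must be strictly increasing; the helper names are invented.

```python
def is_ordered(offsets):
    """True if each offset is strictly greater than the one before it."""
    return all(a < b for a, b in zip(offsets, offsets[1:]))

def merge_mappings(inputs):
    """Combine mappings from several inputs into one list.

    `inputs` is a list of (bytecode_length, mappings) pairs in merge
    order, where `mappings` is a list of (offset, source) tuples.
    """
    merged = []
    base = 0  # bytecode emitted by earlier inputs so far
    for length, mappings in inputs:
        for offset, source in mappings:
            merged.append((base + offset, source))
        base += length
    return merged

# Hypothetical inputs: a.pbc (100 units, with an include) and b.pbc (50 units).
inputs = [
    (100, [(0, "a.pir"), (60, "inc.pir")]),
    (50, [(0, "b.pir")]),
]
merged = merge_mappings(inputs)
```

If every input is itself ordered and its offsets fall within its own bytecode, the merged list stays ordered without any sorting.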


COMPATIBILITY

This change is incompatible with the current debug segment format. But that's OK, we're still in development.

Comments on this would be very welcome, even if it's as simple as "looks OK to me" or "looks terrible to me". :-)


Thanks,

Jonathan
