Re: thinking out loud: wip-rtl, ELF, pages, and mmap

Nala Ginrut Sun, 28 Apr 2013 22:47:45 -0700

On Wed, 2013-04-24 at 22:23 +0200, Andy Wingo wrote:
> Hi,
> 
> I've been working on wip-rtl recently.  The goal is to implement good
> debugging.  I'll give a bit of background and then get to my problem.
> 
> In master, ".go" files are written in the ELF format.  ELF is nice
> because it embodies common wisdom on how to structure object files, and
> this wisdom applies to Guile fairly directly.  To simplify, ELF files
> are cheap to load and useful to introspect.  The former is achieved with
> "segments", which basically correspond to mmap'd blocks of memory.  The
> latter is achieved by "sections", which describe parts of the file.  The
> table of segments is usually written at the beginning of the file, to
> make loading easier, and the table of sections is usually at the end, as
> it's not usually needed at runtime.  There are usually fewer segments
> than sections.  You can have segments in the file that are marked as not
> being loaded into memory at runtime.  Usually this is the case for
> debugging information.
> 
> OK, so that's ELF.  The conventional debugging format to use with ELF is
> DWARF, and it's pretty well thought out.  In Guile we'll probably use
> DWARF, along with some more basic metadata in .symtab sections.
>


I'm very glad to see that ;-)
And then it would be possible to debug .go files with GDB.

> I should mention that in master, the ELF files are simple wrappers over
> 2.0-style objcode.  The wip-rtl branch takes more advantage of ELF --
> for example, to allocate some constants in read-only shareable memory,
> and to statically allocate any constants that need initialization or
> relocation at runtime.  ELF also has advantages when we start to do
> native compilation: native code can go in another section, for example.
> 

It seems that compiling with RTL is faster, at least for boot-9.scm,
but I haven't actually tested it.

It would also be possible to have one or more external AOT compilers
besides the official built-in one, though maybe that's unnecessary.

>                             *   *   *
> 
> OK, so that's the thing.  I recently added support for writing .symtab
> sections, and have been looking on how to load that up at runtime, for
> example when disassembling functions.  To be complete, there are a few
> other common operations that would require loading debug information:
> 
>   * Procedure names.
>   * Line/column information, for example in backtraces.
>   * Arity information and argument names.
>   * Local variable names and live ranges (the ,locals REPL command).
>   * Generic procedure metadata.
> 

I also hope the beginning and ending line numbers of each procedure get
recorded. That's easy to do at compile time. Without them, I have to
parse the source file to find where a procedure ends before I can print
its source code in the REPL/debugger.
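
To show what I mean, here is a rough Python sketch (not Guile code). It
assumes the debug info hands back a 1-based, inclusive begin/end line
pair for a procedure; the name procedure_source is made up for this
example. With that pair, printing the source is just a slice, with no
parsing needed:

```python
def procedure_source(text, begin_line, end_line):
    """Return source lines begin_line..end_line (1-based, inclusive).

    Hypothetical helper: assumes the compiler recorded both line
    numbers for the procedure in its debug info.
    """
    lines = text.splitlines()
    return "\n".join(lines[begin_line - 1:end_line])


source = "(define (f x)\n  (+ x 1))\n\n(define (g) 2)\n"
print(procedure_source(source, 1, 2))
```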

> Anyway!  How do you avoid loading this information at runtime?
> 

IMO, we should add a strip command to guild.
Or, the other way around, add a --debug option to the compile command.
Either way, let users decide whether to keep the debug info.

> The original solution I had in mind was to put them in ELF segments that
> don't get loaded.  Then at runtime you would somehow map from an IP to
> an ELF object, and at that point you would lazily load the unloaded ELF
> sections.
> 
> But that has a few disadvantages.  One is that it's difficult to ensure
> that the lazily-loaded object is the same as the one that you originally
> loaded.  We don't keep .go file descriptors open currently, and
> debugging would be a bad reason to do so.
> 
> Another more serious is that this is a lot of work, actually.  There's a
> constant overhead of the data about what is loaded and how to load what
> isn't, and the cross-references from the debug info to the loaded info
> is tricky.
> 
> Then I realized: why am I doing all of this if the kernel has a virtual
> memory system already that does all this for me?
> 
> So I have a new plan, I think.  I'll change the linker to always emit
> sections and segments that correspond exactly in their on-disk layout
> and in their in-memory layout.  (In ELF terms: segments are contiguous,
> with p_memsz == p_filesz.)  I'll put commonly needed things at the
> beginning, and debugging info and the section table at the end.  Then
> I'll just map the whole thing with PROT_READ, and set PROT_WRITE on
> those page-aligned segments that need it.  (Obviously in the future,
> PROT_EXEC as well.)
> 

Yeah, when we have AOT ;-P
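
To check that I understand the layout rule, here is a rough Python
sketch (not the actual linker or loader). It reads the program headers
of a little-endian ELF64 image and tests the conditions you describe:
p_memsz == p_filesz for every loadable segment, the same file-to-memory
displacement for all of them, and page-aligned starts for writable
segments. The function names, the 4096-byte page size, and the toy image
builder are my own assumptions for illustration.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header (64 bytes)
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header (56 bytes)
PT_LOAD = 1
PF_W = 2


def load_segments(image):
    """Return (flags, offset, vaddr, filesz, memsz) per PT_LOAD segment."""
    fields = EHDR.unpack_from(image, 0)
    ident, phoff, phentsize, phnum = fields[0], fields[5], fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    segments = []
    for i in range(phnum):
        p_type, flags, offset, vaddr, _paddr, filesz, memsz, _align = \
            PHDR.unpack_from(image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            segments.append((flags, offset, vaddr, filesz, memsz))
    return segments


def maps_as_one_block(image, page_size=4096):
    """True if one read-only mapping of the file gives the memory layout."""
    displacement = None
    for flags, offset, vaddr, filesz, memsz in load_segments(image):
        if memsz != filesz:
            return False              # would need zero-fill past the file data
        if displacement is None:
            displacement = vaddr - offset
        elif vaddr - offset != displacement:
            return False              # segments move relative to each other
        if flags & PF_W and offset % page_size:
            return False              # cannot change protection on a partial page
    return True


def make_image(segments):
    """Build a toy header-only image from (flags, offset, vaddr, filesz, memsz)."""
    ident = b"\x7fELF\x02\x01\x01" + bytes(9)
    ehdr = EHDR.pack(ident, 0, 0, 1, 0, EHDR.size, 0, 0,
                     EHDR.size, PHDR.size, len(segments), 0, 0, 0)
    phdrs = b"".join(PHDR.pack(PT_LOAD, flags, off, vaddr, vaddr, fsz, msz, 4096)
                     for flags, off, vaddr, fsz, msz in segments)
    return ehdr + phdrs
```

An image that passes this check could be mapped once with PROT_READ,
and only the writable segments would then need their protection changed.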

> Then I'll just record a list of ELF objects that have been loaded.
> Simple bisection will map IP -> ELF, and from there we have the section
> table in memory (lazily paged in by the virtual memory system) and can
> find the symtab and other debug info.
> 
> So that's the plan.  It's a significant change, and I wondered if folks
> had some experience or reactions.
> 
> Note that we have a read()-based fallback if mmap is not available.
> This strategy also makes the read-based fallback easier.
> 
> Thoughts?
> 
> Andy
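
As for the IP -> ELF lookup, I imagine something like the following
Python sketch (again not Guile code). It assumes each loaded object
occupies one contiguous, non-overlapping address range; the class and
method names are invented for this example. The list is kept sorted by
start address so a lookup is a plain bisection:

```python
import bisect


class LoadedObjects:
    """Sorted record of loaded images, searchable by address."""

    def __init__(self):
        self._starts = []   # sorted start addresses
        self._entries = []  # parallel list of (start, end, name)

    def register(self, start, size, name):
        # Insert at the position that keeps both lists sorted by start.
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._entries.insert(i, (start, start + size, name))

    def find(self, ip):
        # Rightmost object whose start is <= ip, if any.
        i = bisect.bisect_right(self._starts, ip) - 1
        if i < 0:
            return None
        start, end, name = self._entries[i]
        return name if ip < end else None


objects = LoadedObjects()
objects.register(0x5000, 0x1000, "b.go")
objects.register(0x1000, 0x2000, "a.go")
print(objects.find(0x1800))
```

From the object found this way, the section table and symtab are
already in the mapping, so nothing has to be reopened from disk.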
