on the implementation of top-level and cross-module inlining

Andy Wingo Wed, 16 Mar 2016 04:15:00 -0700

As a follow-on to yesterday's note about the semantics of top-level and
cross-module inlining, here's a note about the implementation.


There are three major pieces of an implementation of top-level and
cross-module inlining, as far as I can see.

  1. Recording inlinable top-level definitions

  2. Reifying those inlinable definitions to a compiled file, and being
     able to load them up again, to facilitate cross-module inlining

  3. Actually doing the inlining.

Let's start with the last one first, and say that we are going to use
peval.  Whenever peval visits a toplevel reference, whether in operator
position or for its value, peval will go to look for an inlinable
definition.  If it finds one, it starts a new counter if needed (to
limit code growth and inliner effort) and visits the inlinable
definition in the context of its use.  That way we take advantage of all
of the peval facilities, we add synergy (wooo) to the existing online
lexical reductions that peval can do, and we avoid writing another
inliner.  Coolio.

This choice means that we need to represent inlinable definitions as
Tree-IL, because peval operates on Tree-IL.  OK!  I think this is a good
choice for another reason too: when compiling a module A which uses a
module B, the set of inlinable definitions offered by B should not
depend on whether B is already compiled or not.  That is, B could be
interpreted or compiled.  If it is compiled, we will need to be able to
fetch the inlinable definitions from the object file in some way.  But
if it's interpreted, what do we do?

That gets us to point (1).  My proposal would be that psyntax records
top-level definitions in a map of name -> Tree-IL, stored in the module
as a field.  Peval would revisit the expression before adding it
to ensure that it is actually inlinable (following the definitions from
yesterday's mail).  I am not sure yet how to deal with the
order-of-evaluation concerns from the end of yesterday's mail.  Psyntax
would also mark assigned top-level bindings, making them ineligible for
inlining.

That would seem to be sufficient for (1), both for interpreted modules
and for top-level definitions within the current compilation unit.
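
As a rough illustration of the bookkeeping described for (1), here is a
minimal Python model.  It is only a sketch: the class, its methods, the
tuple representation of terms and the is_inlinable predicate are all
hypothetical stand-ins, not Guile APIs, and the real inlinability check
would be peval's job.

```python
# Hypothetical model of the per-module table; none of these names exist
# in Guile.  Terms are plain tuples standing in for Tree-IL.
class ModuleInlinables:
    def __init__(self):
        self.definitions = {}   # name -> Tree-IL-like term
        self.assigned = set()   # names that are the target of an assignment

    def record_definition(self, name, term, is_inlinable):
        # Record only if the binding is not assigned and the check
        # (a stand-in for revisiting the term with peval) accepts it.
        if name not in self.assigned and is_inlinable(term):
            self.definitions[name] = term

    def mark_assigned(self, name):
        # An assigned binding is ineligible, even if recorded earlier.
        self.assigned.add(name)
        self.definitions.pop(name, None)

    def lookup(self, name):
        return self.definitions.get(name)


def is_constant_or_lambda(term):
    # Placeholder predicate; the real rules are in the semantics note.
    return term[0] in ("const", "lambda")


mod = ModuleInlinables()
mod.record_definition("answer", ("const", 42), is_constant_or_lambda)
mod.record_definition("counter", ("const", 0), is_constant_or_lambda)
mod.mark_assigned("counter")     # e.g. the expander saw an assignment
mod.record_definition("counter", ("const", 1), is_constant_or_lambda)
```

In this model "answer" stays available for inlining, while "counter" is
dropped as soon as it is marked assigned and cannot be re-recorded.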

However, we need to also get definitions into a compiled file, in such a
way that adds no overhead to the time or space used when later loading
that compiled file.  The best option would be a separate ELF section
that would only be loaded up when a user of a module wants to check
if there is an available inlinable definition.  However, this is
complicated in many ways; you would need to pass this information
through the compilation pipeline and it gets tricky.  You could reify
the Tree-IL directly into the ELF image, but it needs relocation to tie
all the pointers together at run-time and you would need to be careful
to not cause this relocation to happen on a "normal" load where you
don't need the compile-time information.  Note also that there are many
constants in a Tree-IL term and you would want to de-duplicate them.
Also, some passes still mutate Tree-IL, I think; perhaps that's a
concern for the ELF strategy.

All of this leads me to think that maybe an ELF section is the wrong
thing.  Maybe we should just define a little bytecode for serializing
Tree-IL, and write it out as a Scheme bytevector literal.  In that way
the data stays compact and ends up in the read-only data segment.  Then
we could munge the program being compiled to append an
"(set-module-compiled-inlinable-definitions! (current-module)
#vu8(...))" invocation and we'd be done.  There would be a special
routine available in the compiler to "interpret" that bytecode to search
for specific inlinable definitions.  Having to muck about and
pattern-match the imperative module API is pretty nasty, but OK I think;
the broader solution for that is elsewhere.  (We need a better story for
how to map modules to files anyway in order to be able to link multiple
modules from separate .scm files into a single compiled .so file.)

The bytecode would probably need some constant deduplication facilities
-- probably a constant table at the end.  Dunno.
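
To make the "little bytecode plus constant table at the end" idea a bit
more concrete, here is a small Python sketch.  Everything about it is an
assumption made for illustration: the three term kinds, the opcode
numbers, the 4-byte header holding the table offset, and the restriction
to integer and string constants.  It is not a proposed format.

```python
import struct

OP_CONST, OP_TOPLEVEL_REF, OP_CALL = 0, 1, 2   # made-up opcodes

def serialize(term):
    """Encode a term as: 4-byte table offset, code bytes, constant table."""
    constants, index, code = [], {}, bytearray()

    def intern(value):
        # Each distinct constant is stored once; code refers to it by index.
        key = (type(value), value)
        if key not in index:
            index[key] = len(constants)
            constants.append(value)
        return index[key]

    def emit(t):
        if t[0] == "const":
            code.extend(struct.pack(">BH", OP_CONST, intern(t[1])))
        elif t[0] == "toplevel-ref":
            code.extend(struct.pack(">BH", OP_TOPLEVEL_REF, intern(t[1])))
        elif t[0] == "call":
            code.extend(struct.pack(">BB", OP_CALL, len(t[2])))
            emit(t[1])
            for arg in t[2]:
                emit(arg)
        else:
            raise ValueError(f"unknown term kind: {t[0]!r}")

    emit(term)
    table = bytearray(struct.pack(">H", len(constants)))
    for value in constants:
        if isinstance(value, int):
            table.extend(struct.pack(">Bq", 0, value))        # tag 0: integer
        else:
            data = value.encode("utf-8")
            table.extend(struct.pack(">BH", 1, len(data)) + data)  # tag 1: string
    return struct.pack(">I", 4 + len(code)) + bytes(code) + bytes(table)

def deserialize(blob):
    """Decode the constant table first, then rebuild the term from the code."""
    (table_at,) = struct.unpack_from(">I", blob, 0)
    (count,) = struct.unpack_from(">H", blob, table_at)
    pos, constants = table_at + 2, []
    for _ in range(count):
        if blob[pos] == 0:
            constants.append(struct.unpack_from(">q", blob, pos + 1)[0])
            pos += 9
        else:
            (n,) = struct.unpack_from(">H", blob, pos + 1)
            constants.append(blob[pos + 3:pos + 3 + n].decode("utf-8"))
            pos += 3 + n

    def read(at):
        op = blob[at]
        if op == OP_CALL:
            nargs = blob[at + 1]
            fn, at = read(at + 2)
            args = []
            for _ in range(nargs):
                arg, at = read(at)
                args.append(arg)
            return ("call", fn, args), at
        (i,) = struct.unpack_from(">H", blob, at + 1)
        kind = "const" if op == OP_CONST else "toplevel-ref"
        return (kind, constants[i]), at + 3

    return read(4)[0]

term = ("call", ("toplevel-ref", "+"),
        [("const", 1000000), ("const", 1000000), ("const", "hi")])
blob = serialize(term)
```

Here the repeated constant 1000000 is written to the table only once,
and the resulting bytes are the kind of thing that could be emitted as a
bytevector literal.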

To allow for new definitions in a module, probably the search for an
inlinable definition should go first to the map of Tree-IL values, to
catch anything that was compiled/interpreted at run-time, and then to
the compiled bytevector.
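
The lookup order could be sketched like this in Python.  The function
names and the toy "name=value" blob format are hypothetical; the blob
search merely stands in for the special routine that would interpret the
serialized definitions.

```python
def find_inlinable(name, runtime_definitions, compiled_blob, search_blob):
    """Return a definition for `name`, or None if nothing is available."""
    # Definitions recorded at run-time are consulted first, so anything
    # compiled or interpreted after the file was built is seen.
    if name in runtime_definitions:
        return runtime_definitions[name]
    # Only fall back to the serialized data when the map has no entry.
    if compiled_blob is not None:
        return search_blob(compiled_blob, name)
    return None


def search_text_blob(blob, name):
    # Toy stand-in: the blob is newline-separated "name=value" text.
    for line in blob.decode("utf-8").splitlines():
        key, _, value = line.partition("=")
        if key == name:
            return ("const", int(value))
    return None


runtime = {"width": ("const", 80)}        # defined at run-time
blob = b"width=72\nheight=24"             # what the compiled file carried
```

With these inputs a lookup of "width" yields the run-time definition,
"height" comes from the blob, and an unknown name yields nothing.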

One final note.  Peval will propagate all constant literals that it
can.  This is fine because within a compilation unit, constant
literals will get deduplicated as needed.  But across compilation units
this is not the case, at least not until we have the static linking
hacks I mentioned previously.  So, there should be some relatively
strict size limits on inlining non-immediate values.
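
One way to picture such a limit, as a Python sketch: the size measure,
the default limit of 16 and the treatment of Python integers, booleans
and None as "immediate" are all assumptions made for illustration, not
a description of what Guile does.

```python
def constant_size(value):
    """Crude size estimate; immediates cost nothing to copy."""
    if value is None or isinstance(value, (bool, int)):
        return 0                      # assumed immediate in this model
    if isinstance(value, str):
        return 1 + len(value)
    if isinstance(value, (list, tuple)):
        return 1 + sum(constant_size(v) for v in value)
    raise TypeError(f"unsupported constant: {value!r}")


def may_inline_constant(value, same_unit, limit=16):
    # Within one compilation unit duplicates get merged, so no limit is
    # applied; across units each copy is paid for, so keep it small.
    return same_unit or constant_size(value) <= limit


small = ["a", "b"]        # size 1 + 2 + 2 = 5
large = "x" * 100         # size 101
```

Under these assumptions the small list may cross a module boundary,
while the long string is only propagated within its own unit.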

OK, that's the end of my brain-dump :)  Happy hacking,

Andy
