On Wed, Oct 2, 2013 at 6:32 PM, David Malcolm <dmalc...@redhat.com> wrote:
> This is very much a proof-of-concept/work-in-progress at this stage, but
> attached is a patch to GCC which aims to provide an embeddable
> JIT-compilation API, using GCC as the backend: libgccjit.so.
>
> This shared library can then be dynamically-linked into bytecode
> interpreters and other such programs that want to generate machine code
> "on the fly" at run-time.
>
> The idea is that GCC is configured with a special --enable-host-shared
> option, which leads to it being built as position-independent code. You
> would configure it with host==target, given that the generated machine
> code will be executed within the same process (the whole point of JIT).
>
> libgccjit.so is built against libbackend.a. To the rest of GCC, it
> looks like a "frontend" (in the "gcc/jit" subdir), but the parsing hook
> just runs a callback provided by client code. You can see a diagram of
> how it all fits together within the patch (see gcc/jit/notes.txt). The
> jit "frontend" requires --enable-host-shared, so it is off by default,
> so you need to configure with:
>   --enable-host-shared --enable-languages=jit
> to get the jit (and see caveats below).
>
> The "main" function is in the client code. It uses a pure C API to call
> into libgccjit.so, registering a code creation hook:
>
>   gcc_jit_context *ctxt;
>   gcc_jit_result *result;
>
>   ctxt = gcc_jit_context_acquire ();
>
>   gcc_jit_context_set_code_factory (ctxt,
>     some_code_making_callback, user_data);
>
>   /* This actually calls into GCC and runs the build, all
>      in a mutex for now, getting back a result object.  */
>   result = gcc_jit_context_compile (ctxt);
>   /* result is actually a wrapper around a DSO */
>
>   /* Now that we have result, we're done with ctxt: */
>   gcc_jit_context_release (ctxt);
>
>   /* Look up a generated function by name, getting a void* back
>      from the result object (pointing to the machine code), and
>      cast it to the appropriate type for the function: */
>   some_fn_type some_fn = (some_fn_type)gcc_jit_result_get_code (result,
>     "some_fn");
>
>   /* We can now call the machine code: */
>   int val = some_fn (3, 4);
>
>   /* Presumably we'd call it more than once.
>      Once we're done with the code, this unloads the built DSO: */
>   gcc_jit_result_release (result);
>
> There are some major kludges in there, but it does work: it can
> successfully build code in-process 1000 times in a row [1], albeit with
> a slow memory leak, with all optimization turned off. Upon turning on
> optimizations I run into crashes owing to not properly purging all state
> within the compiler - so this is a great motivation for doing more
> state-cleanup work. I've also hacked timevars to run in "cumulative"
> mode, accumulating all timings across all iterations.
>
> The library API hides GCC's internals, and tries to be much more
> typesafe than GCC's, giving something rather like Andrew MacLeod's
> proposed changes - client code does not see "tree", instead dealing with
> types, rvalues, lvalues, jump labels, etc. It is pure C, given the
> horror stories I have heard about people dealing with C++ ABIs. FWIW I
> also have the beginnings of Python bindings for the library (doing the
> interface as pure C makes language-bindings easier), though that would
> probably live in a separate repository (so not part of this patch).
>
> The API deliberately uses C terminology, given that it's likely that the
> user will want to be plugging the JIT-generated code into a C/C++
> program (or library).
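
To illustrate the point about a pure C API that hides internals yet stays type-safe, here is a minimal sketch using distinct opaque handle types. Every name in it (toy_rvalue, toy_lvalue, toy_assign and so on) is hypothetical and chosen only for illustration; none of it is taken from the patch's libgccjit.h, and a real library would keep the struct definitions in a private source file.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical handles, NOT the patch's API.  In a real library only
   these incomplete typedefs would appear in the public header.  */
typedef struct toy_rvalue toy_rvalue;
typedef struct toy_lvalue toy_lvalue;

/* Private definitions; shown here only so the sketch is self-contained.  */
struct toy_rvalue { int value; };
struct toy_lvalue { int storage; };

/* Create an rvalue holding an integer constant (NULL on allocation failure).  */
toy_rvalue *
toy_rvalue_from_int (int value)
{
  toy_rvalue *rv = malloc (sizeof *rv);
  if (rv)
    rv->value = value;
  return rv;
}

/* Create an assignable location, initially zero (NULL on allocation failure).  */
toy_lvalue *
toy_lvalue_new (void)
{
  return calloc (1, sizeof (toy_lvalue));
}

/* Assignment takes an lvalue on the left and an rvalue on the right.
   Swapping the arguments is a pointer-type mismatch that the C
   compiler diagnoses, which is the type safety a single generic
   handle type would not give.  */
void
toy_assign (toy_lvalue *dst, const toy_rvalue *src)
{
  assert (dst != NULL && src != NULL);
  dst->storage = src->value;
}

/* Read back the value currently stored in an lvalue.  */
int
toy_lvalue_read (const toy_lvalue *lv)
{
  assert (lv != NULL);
  return lv->storage;
}
```

A caller would create one handle of each kind, pass them to toy_assign in that order, and release both with free; because the handles are plain C pointers, bindings for other languages only need to carry them around opaquely.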
>
> I've been able to successfully use this API to add JIT-compilation to a
> toy bytecode interpreter:
>   https://github.com/davidmalcolm/jittest
> (where regvm.cc uses this API to compile a bytecode function into
> machine code).
>
> There's a DejaGnu-based test suite, which I can invoke via:
>   make check-parallel-jit RUNTESTFLAGS=""
> (potentially with some -v verbosity options in RUNTESTFLAGS), giving
>   # of expected passes 144
> and no failures on this box.
>
> Various caveats:
>   * Currently it only supports a small subset of C-like code.
>   * The API is still in flux: I'm not convinced by the label-placement
>     approach; I suspect having an explicit "block" type may be easier for
>     users to deal with.
>   * The patch is against r202664, which is a little out-of-date
>     (2013-09-17), but I'm interested in feedback rather than perfection at
>     this stage.
>   * I'm running into configure/Makefile issues with
>     --enable-host-shared, where CFLAGS contains -fPIC, but only on
>     invocations of leaf Makefiles, not on recursive "make" - so it works if
>     you cd into $builddir/gcc and make (and so on for libcpp etc), but not
>     from the top-level builddir. Hence building the thing is currently
>     unreliable (but again, I'm interested in feedback rather than
>     perfection). Help with configure/Makefiles would be appreciated!
>   * There are some grotesque kludges in internal-api.c, especially in
>     how we go from .s assembler files to a DSO (grep for "gross hack" ;) )
>   * There are some changes to the rest of GCC that are needed by the JIT
>     code. Some of this is state removal. Some of the changes are gross,
>     some are probably reasonable.
>   * Only tested so far on Fedora and RHEL x86_64 boxes.
>
> Hopefully this is of interest to other GCC people.
>
> Shall I get this into a "jit" branch? I greatly prefer git to svn, so
> I'd probably do:
>   http://gcc.gnu.org/wiki/GitMirror#Git-only_branches
> assuming that this allows a sane path to (I hope) eventual merger.
>
> Thoughts?

Neat. Thinking further ahead, it might be better to leave '_jit_' out of
the API names -- the APIs can be used by any frontend, including
alternate ones for C/C++. The APIs could also be used by other
consumers, such as a bitcode writer.

thanks,

David

> Dave
>
> Current Changelog.jit follows inline:
> /
>   * configure.ac: Add --enable-host-shared
>   * configure: Regenerate.
>
> gcc/
>   * Makefile.in (LIBIBERTY): Use pic build of libiberty.a if
>   configured with --enable-host-shared.
>   (BUILD_LIBIBERTY): Likewise.
>   * cgraph.c (cgraph_c_finalize): New.
>   * cgraph.h (symtab_c_finalize): New declaration.
>   (cgraph_c_finalize): Likewise.
>   (cgraphunit_c_finalize): Likewise.
>   (cgraphbuild_c_finalize): Likewise.
>   (ipa_c_finalize): Likewise.
>   (predict_c_finalize): Likewise.
>   (varpool_c_finalize): Likewise.
>   * cgraphbuild.c (cgraphbuild_c_finalize): New.
>   * cgraphunit.c (first_analyzed): Move from analyze_functions
>   to file-scope.
>   (first_analyzed_var): Likewise.
>   (analyze_functions): Move static variables into file-scope.
>   (cgraphunit_c_finalize): New.
>   * configure.ac: Add --enable-host-shared, adding -fPIC.
>   * configure: Regenerate.
>   * dwarf2out.c (dwarf2out_c_finalize): New.
>   * dwarf2out.h (dwarf2out_c_finalize): Declare.
>   * ggc-page.c (init_ggc): Make idempotent.
>   * ipa-pure-const.c (function_insertion_hook_holder): Move to be
>   a field of class pass_ipa_pure_const.
>   (node_duplication_hook_holder): Likewise.
>   (node_removal_hook_holder): Likewise.
>   (register_hooks): Convert to method...
>   (pass_ipa_pure_const::register_hooks): ...here, converting
>   static variable init_p into...
>   (pass_ipa_pure_const::init_p): ...new field.
>   (pure_const_generate_summary): Update invocation of
>   register_hooks to invoke as a method of current_pass.
>   (pure_const_read_summary): Likewise.
>   (propagate): Convert to...
>   (pass_ipa_pure_const::execute): ...method.
>   * ipa.c (ipa_c_finalize): New.
>   * main.c (main): Update usage of toplev_main.
>   * params.c (global_init_params): Make idempotent.
>   * passes.c (execute_ipa_summary_passes): Set current_pass.
>   * predict.c (predict_c_finalize): New.
>   * stringpool.c (init_stringpool): Clean up if we're called more
>   than once.
>   * symtab.c (symtab_c_finalize): New.
>   * timevar.c (timevar_init): Ignore repeated calls.
>   * timevar.def (TV_CLIENT_CALLBACK): Add.
>   (TV_ASSEMBLE): Add.
>   (TV_LINK): Add.
>   (TV_LOAD): Add.
>   * toplev.c (do_compile): Add parameter (const toplev_options *);
>   use it to avoid starting/stopping/reporting timevar TV_TOTAL
>   for the case where toplev_main does not encompass all timevars.
>   (toplev_main): Add parameter (const toplev_options *); pass it
>   to do_compile.
>   (toplev_finalize): New.
>   * toplev.h (struct toplev_options): New.
>   (toplev_main): Add parameter (const toplev_options *).
>   (toplev_finalize): New.
>   * varpool.c (varpool_c_finalize): New.
>
> gcc/jit/
>   * Make-lang.in: New.
>   * TODO.rst: New.
>   * config-lang.in: New.
>   * dummy-frontend.c: New.
>   * internal-api.c: New.
>   * internal-api.h: New.
>   * libgccjit.c: New.
>   * libgccjit.h: New.
>   * libgccjit.map: New.
>   * notes.txt: New.
>
> gcc/testsuite/
>   * jit.dg: New subdirectory
>   * jit.dg/harness.h: New.
>   * jit.dg/jit.exp: New.
>   * jit.dg/test-accessing-struct.c: New.
>   * jit.dg/test-calling-external-function.c: New.
>   * jit.dg/test-dot-product.c: New.
>   * jit.dg/test-factorial.c: New.
>   * jit.dg/test-failure.c: New.
>   * jit.dg/test-fibonacci.c: New.
>   * jit.dg/test-hello-world.c: New.
>   * jit.dg/test-string-literal.c: New.
>   * jit.dg/test-sum-of-squares.c: New.
>
> libbacktrace/
>   * configure.ac: Add --enable-host-shared.
>   * configure: Regenerate.
>
> libcpp/
>   * configure.ac: Add --enable-host-shared.
>   * configure: Regenerate.
>
> libdecnumber/
>   * configure.ac: Add --enable-host-shared.
>   * configure: Regenerate.
>
> libiberty/
>   * configure.ac: If --enable-host-shared, use -fPIC.
>   * configure: Regenerate.
>
> zlib/
>   * configure.ac: Add --enable-host-shared.
>   * configure: Regenerate.
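
Several changelog entries above ("Make idempotent", "Ignore repeated calls", the new *_c_finalize functions) are about letting the compiler run more than once in the same process. The sketch below shows one common way to get that behaviour, assuming a subsystem with file-scope state; the names toy_init, toy_finalize and toy_compile_once are hypothetical and the code is not taken from the patch, whose actual implementation is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a compiler subsystem with global state.  */
static bool toy_initialized = false;
static int *toy_table = NULL;
static int toy_init_count = 0;  /* how many times real setup ran */

/* Idempotent init: calls after the first are ignored until
   toy_finalize resets the subsystem.  */
void
toy_init (void)
{
  if (toy_initialized)
    return;
  toy_table = calloc (16, sizeof *toy_table);
  assert (toy_table != NULL);
  toy_initialized = true;
  toy_init_count++;
}

/* Purge all state so the next in-process run starts clean.  */
void
toy_finalize (void)
{
  free (toy_table);
  toy_table = NULL;
  toy_initialized = false;
}

/* One simulated in-process "compile": init, touch state, finalize.  */
int
toy_compile_once (int input)
{
  int result;

  toy_init ();
  toy_init ();              /* harmless repeated call */
  toy_table[0] += input;    /* would accumulate if state leaked across runs */
  result = toy_table[0];
  toy_finalize ();
  return result;
}
```

Calling toy_compile_once repeatedly returns just the input each time, because toy_finalize discards the table between runs; dropping the finalize step would leave the old table in place for the next run, which is the kind of leftover state the message blames for the crashes with optimization enabled.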