On Tue, 2018-03-20 at 14:02 +0100, Richard Biener wrote:
> On Mon, Mar 19, 2018 at 9:55 PM, Richard Biener
> <richard.guent...@gmail.com> wrote:
> > On March 19, 2018 8:09:32 PM GMT+01:00, Sebastiaan Peters
> > <sebaspe97...@hotmail.com> wrote:
> > > > The goal should be to extend TU wise parallelism via make to
> > > > function wise parallelism within GCC.
> > >
> > > Could you please elaborate more on this?
> >
> > In the abstract sense you'd view the compilation process separated
> > into N stages, each function being processed by each. You'd assign
> > a thread to each stage and move the work items (the functions)
> > across the set of threads honoring constraints such as an IPA stage
> > needing all functions completed the previous stage. That allows you
> > to easier model the constraints due to shared state (like no pass
> > operating on two functions at the same time) compared to a model
> > where you assign a thread to each function.
> >
> > You'll figure that the easiest point in the pipeline to try this
> > 'pipelining' is after IPA has completed and until RTL is generated.
> >
> > Ideally the pipelining would start as early as the front ends
> > finished parsing a function and ideally we'd have multiple
> > functions in the RTL pipeline.
> >
> > The main obstacles will be the global state in the compiler of
> > which there is the least during the GIMPLE passes (mostly cfun and
> > current_function_decl plus globals in the individual passes which
> > is easiest dealt with by not allowing a single pass to run at the
> > same time in multiple threads). TLS can be used for some of the
> > global state plus of course some global data structures need
> > locking.
> >
> Oh, and just to mention - there are a few things that may block
> adoption in the end, like whether builds are still reproducible (we
> allocate things like DECL_UID from global pools and doing that
> somewhat randomly because of threading might - but not must - change
> code generation). Or that some diagnostics will appear in
> non-deterministic order, or that dump files are messed up (both
> issues could be solved by code dealing with the issue, like buffering
> and doing a replay in program order). I guess reproducibility is
> important when it comes down to debugging code-generation issues -
> I'd prefer to debug gcc when it doesn't run threaded but if that
> doesn't reproduce an issue that's bad.
>
> So the most important "milestone" of this project is to identify such
> issues and document them somewhere.

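To illustrate the "buffering and doing a replay in program order" idea
from the quoted text, here is a minimal sketch. It is not GCC code:
DiagnosticBuffer, emit, replay and compile_functions_in_parallel are
hypothetical names invented for this example, and it assumes each
function can be identified by its index in program order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical buffer, not a GCC data structure: worker threads record
// diagnostics keyed by the function's position in program order.
class DiagnosticBuffer {
public:
  // May be called from any worker thread.
  void emit(std::size_t function_index, std::string message) {
    std::lock_guard<std::mutex> guard(mutex_);
    pending_[function_index].push_back(std::move(message));
  }

  // Called after all workers have finished. std::map iterates in key
  // order, so messages come back in program order regardless of which
  // thread finished first.
  std::vector<std::string> replay() {
    std::lock_guard<std::mutex> guard(mutex_);
    std::vector<std::string> ordered;
    for (auto &entry : pending_)
      for (auto &message : entry.second)
        ordered.push_back(std::move(message));
    pending_.clear();
    return ordered;
  }

private:
  std::mutex mutex_;
  std::map<std::size_t, std::vector<std::string>> pending_;
};

// Hypothetical driver: starts one thread per function, deliberately in
// reverse order, to show that the replay order does not depend on the
// order in which work was started or completed.
std::vector<std::string> compile_functions_in_parallel(std::size_t count) {
  DiagnosticBuffer buffer;
  std::vector<std::thread> workers;
  for (std::size_t i = count; i-- > 0;)
    workers.emplace_back([&buffer, i] {
      buffer.emit(i, "warning in function " + std::to_string(i));
    });
  for (auto &worker : workers)
    worker.join();
  std::vector<std::string> ordered = buffer.replay();
  assert(ordered.size() == count);  // one message per function here
  return ordered;
}
```

The same approach would presumably extend to dump files, at the cost of
holding the output in memory until the replay.
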
One issue would be the garbage-collector: there are plenty of places in
GCC that have hidden assumptions that "a collection can't happen here"
(where we have temporaries that reference GC-managed objects, but which
aren't tracked by GC-roots).

I had some patches for that back in 2014 that I think I managed to drop
on the floor (sorry):
  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01300.html
  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01340.html
  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01510.html

The GC's allocator is used almost everywhere, and is probably not
thread-safe yet.

FWIW I gave a talk at Cauldron 2013 about global state in GCC. Beware:
it's five years out-of-date, but maybe is still relevant in places?
  https://dmalcolm.fedorapeople.org/gcc/global-state/
  https://gcc.gnu.org/ml/gcc/2013-05/msg00015.html

(I tackled this for libgccjit by instead introducing a mutex, a "big
compiler lock", jit_mutex in gcc/jit/jit-playback.c, held by whichever
thread is calling into the rest of the compiler sources).

Hope this is helpful
Dave

[...]
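
For readers unfamiliar with the "big compiler lock" approach mentioned
above, a minimal sketch of the general pattern follows. It does not
reproduce libgccjit's actual code: g_compiler_mutex,
g_functions_compiled, compile_one_function and run_clients are
hypothetical names, and the counter merely stands in for the compiler's
non-thread-safe global state.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for global compiler state that is not
// thread-safe on its own.
static int g_functions_compiled = 0;

// Hypothetical "big compiler lock": one mutex guarding every entry
// into the non-thread-safe code.
static std::mutex g_compiler_mutex;

// Hypothetical entry point. The lock is held for the whole call, so
// only one thread at a time touches the global state.
void compile_one_function() {
  std::lock_guard<std::mutex> guard(g_compiler_mutex);
  ++g_functions_compiled;
}

// Runs several client threads that all call into the "compiler" and
// returns the final value of the shared counter.
int run_clients(int thread_count, int calls_per_thread) {
  assert(thread_count >= 0 && calls_per_thread >= 0);
  g_functions_compiled = 0;
  std::vector<std::thread> clients;
  for (int t = 0; t < thread_count; ++t)
    clients.emplace_back([calls_per_thread] {
      for (int i = 0; i < calls_per_thread; ++i)
        compile_one_function();
    });
  for (auto &client : clients)
    client.join();
  return g_functions_compiled;  // safe to read: all clients have joined
}
```

The trade-off is that such a lock makes concurrent callers safe but
serializes them, so it gives no speedup within a single compilation,
which is what the pipelining proposal is after.
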