On Tue, Mar 20, 2018 at 3:49 PM, David Malcolm <dmalc...@redhat.com> wrote:
> On Tue, 2018-03-20 at 14:02 +0100, Richard Biener wrote:
>> On Mon, Mar 19, 2018 at 9:55 PM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>> > On March 19, 2018 8:09:32 PM GMT+01:00, Sebastiaan Peters
>> > <sebaspe97...@hotmail.com> wrote:
>> > > > The goal should be to extend TU wise parallelism via make to
>> > > > function wise parallelism within GCC.
>> > >
>> > > Could you please elaborate more on this?
>> >
>> > In the abstract sense you'd view the compilation process separated
>> > into N stages, each function being processed by each. You'd assign
>> > a thread to each stage and move the work items (the functions)
>> > across the set of threads honoring constraints such as an IPA stage
>> > needing all functions completed the previous stage. That allows you
>> > to easier model the constraints due to shared state (like no pass
>> > operating on two functions at the same time) compared to a model
>> > where you assign a thread to each function.
>> >
>> > You'll figure that the easiest point in the pipeline to try this
>> > 'pipelining' is after IPA has completed and until RTL is generated.
>> >
>> > Ideally the pipelining would start as early as the front ends
>> > finished parsing a function and ideally we'd have multiple
>> > functions in the RTL pipeline.
>> >
>> > The main obstacles will be the global state in the compiler of
>> > which there is the least during the GIMPLE passes (mostly cfun and
>> > current_function_decl plus globals in the individual passes which
>> > is easiest dealt with by not allowing a single pass to run at the
>> > same time in multiple threads). TLS can be used for some of the
>> > global state plus of course some global data structures need
>> > locking.
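
[Illustration, not part of the original messages.] The stage-per-thread
model described above can be sketched roughly as follows. This is a toy
under stated assumptions, not GCC code: FunctionItem, StageQueue and
run_pipeline are hypothetical names, each "pass" only records its name
on the work item, and the IPA barrier (a stage needing all functions to
have finished the previous one) is not modelled. The property it shows
is that one thread owns each stage, so no stage ever works on two
functions at once.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical work item standing in for one function being compiled.
struct FunctionItem {
    std::string name;
    std::vector<std::string> log;  // names of the stages applied, in order
};

// Minimal unbounded blocking queue connecting two adjacent stages.
class StageQueue {
public:
    void push(FunctionItem item) {
        {
            std::lock_guard<std::mutex> guard(m_);
            assert(!closed_);  // pushing after close() would be a bug
            q_.push(std::move(item));
        }
        cv_.notify_one();
    }
    void close() {
        {
            std::lock_guard<std::mutex> guard(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }
    // Blocks until an item is available; returns false once the queue
    // has been closed and drained.
    bool pop(FunctionItem& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<FunctionItem> q_;
    bool closed_ = false;
};

// Runs every function through every stage, with one thread per stage.
std::vector<FunctionItem> run_pipeline(
        std::vector<FunctionItem> functions,
        const std::vector<std::string>& stage_names) {
    const std::size_t n = stage_names.size();
    std::vector<StageQueue> queues(n + 1);  // queues[s] feeds stage s
    std::vector<std::thread> workers;
    for (std::size_t s = 0; s < n; ++s) {
        workers.emplace_back([&queues, &stage_names, s] {
            FunctionItem item;
            while (queues[s].pop(item)) {
                item.log.push_back(stage_names[s]);  // the "pass" itself
                queues[s + 1].push(std::move(item));
            }
            queues[s + 1].close();  // propagate end-of-input downstream
        });
    }
    for (auto& f : functions) queues[0].push(std::move(f));
    queues[0].close();

    std::vector<FunctionItem> done;
    FunctionItem item;
    while (queues[n].pop(item)) done.push_back(std::move(item));
    for (auto& w : workers) w.join();
    return done;
}
```

Because every queue is FIFO and every stage has exactly one worker,
functions leave the pipeline in the order they entered it. That
ordering would no longer come for free once a stage is given more than
one thread.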
>>
>> Oh, and just to mention - there are a few things that may block
>> adoption in the end like whether builds are still reproducible (we
>> allocate things like DECL_UID from global pools and doing that
>> somewhat randomly because of threading might - but not must - change
>> code generation). Or that some diagnostics will appear in
>> non-deterministic order, or that dump files are messed up (both
>> issues could be solved by code dealing with the issue, like buffering
>> and doing a re-play in program order). I guess reproducibility is
>> important when it comes down to debugging code-generation issues -
>> I'd prefer to debug gcc when it doesn't run threaded but if that
>> doesn't reproduce an issue that's bad.
>>
>> So the most important "milestone" of this project is to identify such
>> issues and document them somewhere.
>
> One issue would be the garbage-collector: there are plenty of places in
> GCC that have hidden assumptions that "a collection can't happen here"
> (where we have temporaries that reference GC-managed objects, but which
> aren't tracked by GC-roots).
>
> I had some patches for that back in 2014 that I think I managed to drop
> on the floor (sorry):
>   https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01300.html
>   https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01340.html
>   https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01510.html
>
> The GC's allocator is used almost everywhere, and is probably not
> thread-safe yet.

Yes. There's also global tree modification like chaining new pointer
types into TYPE_POINTER_TO and friends so some helpers in tree.c need
to be guarded as well.

> FWIW I gave a talk at Cauldron 2013 about global state in GCC. Beware:
> it's five years out-of-date, but maybe is still relevant in places?
>   https://dmalcolm.fedorapeople.org/gcc/global-state/
>   https://gcc.gnu.org/ml/gcc/2013-05/msg00015.html
> (I tackled this for libgccjit by instead introducing a mutex, a "big
> compiler lock", jit_mutex in gcc/jit/jit-playback.c, held by whichever
> thread is calling into the rest of the compiler sources).
>
> Hope this is helpful
> Dave
>
> [...]
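
[Illustration, not part of the original messages.] The "big compiler
lock" idea mentioned above can be sketched as below. This is not the
libgccjit implementation; it only shows the general pattern of one
mutex held for the whole of every call into code that touches
unsynchronized global state. All names (big_compiler_lock, next_uid,
allocate_uid_unlocked, compile_function, compile_from_threads) are
hypothetical, and the global counter merely stands in for something
like a global UID pool.

```cpp
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for unsynchronized compiler-wide state, in the
// spirit of a global UID pool.
static int next_uid = 0;

// Hypothetical "big compiler lock": one mutex for all global state.
static std::mutex big_compiler_lock;

// Pretend compiler internals: not safe to call concurrently.
static int allocate_uid_unlocked() { return next_uid++; }

// Entry point: holds the lock for the whole call, so at most one thread
// is ever inside the "compiler" at a time.
int compile_function() {
    std::lock_guard<std::mutex> guard(big_compiler_lock);
    return allocate_uid_unlocked();
}

// Calls the entry point from several threads and collects every UID.
std::set<int> compile_from_threads(int n_threads, int per_thread) {
    std::vector<std::vector<int>> results(n_threads);
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&results, t, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                results[t].push_back(compile_function());
        });
    }
    for (auto& w : workers) w.join();

    std::set<int> all;
    for (const auto& r : results) all.insert(r.begin(), r.end());
    return all;
}
```

With the lock in place every UID is handed out exactly once, but which
thread receives which UID still varies from run to run. The lock buys
safety without any parallelism inside the compiler, and it does not by
itself address the reproducibility concern raised earlier in the
thread.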