On Sun, 21 Jul 2019, Giuliano Belinassi wrote:

> Hi all,
>
> Here is my second evaluation report, together with a simple program that
> I was able to compile with my parallel version of GCC. Keep in mind that
> I still have lots of concurrent issues inside the compiler and therefore
> my branch will fail to compile pretty much anything else.
>
> To reproduce my current branch, use the following steps:
>
> 1-) Clone https://gitlab.com/flusp/gcc/tree/giulianob_parallel
>
> 2-) Edit gcc/graphunit.c's variable `num_threads` to 1.
>
> 3-) Compile with --disable-bootstrap --enable-languages=c
>
> 4-) make
>
> 5-) Edit gcc/graphunit.c's variable `num_threads` to 2, for instance.
>
> 6-) make install DESTDIR="somewhere_else_that_doesnt_break_your_gcc"
>
> 7-) compile the program using -O2
>
> I am attaching my report in markdown format, which you can convert to pdf
> using `pandoc` if you find it difficult to read in the current format.
>
> I am also open to suggestions. Please do not hesitate to comment :)

Thanks for the report and it's great that you are making progress!

I suggest you add a --param (edit params.def) so one can choose
num_threads on the command-line instead of needing to recompile GCC.
Just keep the default "safe" so that the GCC build itself will still work.

For most of the allocators I think that in the end we want to keep
most of them global but have either per-thread freelists or a freelist
implementation that can work (allocate and free) without locking,
employing some RCU scheme. Not introducing per-thread state is
probably leaner on the implementation. It would of course mean
taking a lock when the freelist needs to be re-filled from the main
pool but that's hopefully not common. I don't know an RCU allocator
freelist implementation to copy/learn from, but experimenting with
such before going the per-thread freelist route might be interesting.
Maybe not all allocators need to be treated equally either.

Your memory-block issue is likely that you added

  {
    if (!instance)
      instance = XNEW (memory_block_pool);

but as misleading as it is, XNEW doesn't invoke C++ new but just
malloc, so the allocated structure isn't initialized since its
constructor isn't invoked. Just use

      instance = new memory_block_pool;

With that I get helgrind to run (without complaining!) on your
testcase. I also get to compile gimple-match.c with two threads
for more than one minute before crashing on some EVRP global
state (somehow I knew the passes' global state would be quite a
distraction...).

I hope the project will be motivation to clean up the way we
handle pass-specific global state.

Thanks again,
Richard.
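
As an illustration of the XNEW point above, here is a minimal standalone
sketch. It assumes only that XNEW amounts to a cast around a malloc-style
allocation; `demo_pool` and `DEMO_XNEW` are hypothetical stand-ins invented
for this example, not GCC's memory_block_pool or libiberty's macro. The
constructor bumps a counter so the difference can be observed without
reading uninitialized memory.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for a pool type with a non-trivial constructor.
struct demo_pool
{
  static int constructed;   // number of constructor invocations so far
  void *freelist;
  demo_pool () : freelist (nullptr) { ++constructed; }
};

int demo_pool::constructed = 0;

// Stand-in for an XNEW-style macro: raw storage plus a cast, nothing more.
#define DEMO_XNEW(T) ((T *) std::malloc (sizeof (T)))

// Allocate malloc-style and report how many constructors ran.
int constructions_after_malloc ()
{
  demo_pool::constructed = 0;
  demo_pool *raw = DEMO_XNEW (demo_pool);  // storage only, no constructor call
  int seen = demo_pool::constructed;
  std::free (raw);                         // never touched as an object
  return seen;
}

// Allocate with operator new and report how many constructors ran.
int constructions_after_new ()
{
  demo_pool::constructed = 0;
  demo_pool *obj = new demo_pool;          // allocates and runs the constructor
  int seen = demo_pool::constructed;
  delete obj;
  return seen;
}
```

Calling both functions from a small main should show that the malloc-style
path never runs the constructor, so any members it would have set up (the
freelist pointer here) are left indeterminate, while the `new` path runs it
exactly once.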