On Sun, 21 Jul 2019, Giuliano Belinassi wrote:

> Hi all,
> 
> Here is my second evaluation report, together with a simple program that
> I was able to compile with my parallel version of GCC. Keep in mind that
> I still have lots of concurrency issues inside the compiler and therefore
> my branch will fail to compile pretty much anything else.
> 
> To reproduce my current branch, use the following steps:
> 
> 1-) Clone https://gitlab.com/flusp/gcc/tree/giulianob_parallel
> 
> 2-) Edit gcc/cgraphunit.c's variable `num_threads` to 1.
> 
> 3-) Compile with --disable-bootstrap --enable-languages=c
> 
> 4-) make
> 
> 5-) Edit gcc/cgraphunit.c's variable `num_threads` to 2, for instance.
> 
> 6-) make install DESTDIR="somewhere_else_that_doesnt_break_your_gcc"
> 
> 7-) compile the program using -O2
> 
> I am attaching my report in markdown format, which you can convert to pdf
> using `pandoc` if you find it difficult to read in the current format.
> 
> I am also open to suggestions. Please do not hesitate to comment :)

Thanks for the report and it's great that you are making progress!

I suggest you add a --param (edit params.def) so one can choose
num_threads on the command-line instead of needing to recompile GCC.
Just keep the default "safe" so that the GCC build itself will still work.
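
A minimal sketch of what such an entry could look like, assuming the
DEFPARAM form used by the other entries in params.def (enumerator,
option name, help text, default, minimum, maximum).  The enumerator
PARAM_NUM_THREADS, the option name "num-threads" and the bounds are
hypothetical, chosen only for illustration.  This is a fragment for
params.def plus one use site, not a standalone program:

```c
/* In gcc/params.def.  Hypothetical entry; the default of 1 keeps the
   compiler single-threaded unless the user asks otherwise.  */
DEFPARAM (PARAM_NUM_THREADS,
	  "num-threads",
	  "Number of threads used to compile a translation unit.",
	  1, 1, 64)

/* At the use site, replacing the hard-coded variable.  */
int num_threads = PARAM_VALUE (PARAM_NUM_THREADS);
```

Under these assumptions step 7 would become something like
`gcc -O2 --param num-threads=2 test.c`, with no rebuild of GCC needed
to change the thread count.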

For most of the allocators I think that in the end we want to
keep most of them global but have either per-thread freelists
or a freelist implementation that can work (allocate and free)
without locking, employing some RCU scheme.  Not introducing
per-thread state is probably leaner on the implementation.
It would of course mean taking a lock when the freelist needs to
be re-filled from the main pool but that's hopefully not common.
I don't know of an RCU allocator freelist implementation to copy/learn
from, but experimenting with one before going the per-thread freelist
route might be interesting.  Maybe not all allocators need to be treated
equal either.
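
For comparison, the per-thread freelist variant described above can be
sketched as follows.  This is an illustration under stated assumptions,
not code from GCC: the names global_pool, pool_alloc, pool_free and
run_workers, the 64-byte block size and the batch of 32 are all
hypothetical.  The pool stays global; each thread keeps a private
freelist and only takes the lock when that list runs dry.

```cpp
#include <cstddef>
#include <mutex>
#include <new>
#include <thread>
#include <vector>

// A free block stores the link to the next free block in its own storage.
struct block { block *next; };

// Global backing store.  Only refill () takes the lock.
class global_pool
{
public:
  explicit global_pool (std::size_t block_size) : m_block_size (block_size) {}
  ~global_pool () { for (char *c : m_chunks) delete[] c; }

  // Slow path: carve COUNT blocks out of a fresh chunk and return them
  // as a linked list that the calling thread then owns.
  block *refill (std::size_t count)
  {
    std::lock_guard<std::mutex> guard (m_lock);
    char *chunk = new char[m_block_size * count];
    m_chunks.push_back (chunk);
    block *head = nullptr;
    for (std::size_t i = 0; i < count; i++)
      head = new (chunk + i * m_block_size) block{head};
    ++m_refills;
    return head;
  }

  // Number of times the locked slow path was taken.
  std::size_t refills ()
  {
    std::lock_guard<std::mutex> guard (m_lock);
    return m_refills;
  }

private:
  std::size_t m_block_size;
  std::mutex m_lock;
  std::vector<char *> m_chunks;
  std::size_t m_refills = 0;
};

static global_pool pool (64);
static thread_local block *local_free = nullptr;

// Fast path: pop from the calling thread's freelist without locking.
void *pool_alloc ()
{
  if (!local_free)
    local_free = pool.refill (32);
  block *b = local_free;
  local_free = b->next;
  return b;
}

// Fast path: push onto the calling thread's freelist without locking.
void pool_free (void *p)
{
  local_free = new (p) block{local_free};
}

// Usage sketch: every worker hits the lock once, on its first allocation.
void run_workers (unsigned n_threads, unsigned n_iters)
{
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < n_threads; t++)
    workers.emplace_back ([n_iters] {
      for (unsigned i = 0; i < n_iters; i++)
        pool_free (pool_alloc ());
    });
  for (std::thread &w : workers)
    w.join ();
}
```

The cost of this design is the per-thread state itself: blocks left on
a thread's list when it exits are stranded until the pool is destroyed,
and a block freed by a different thread than the one that allocated it
simply migrates to the freeing thread's list.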

Your memory-block issue is likely that you added

{
  if (!instance)
    instance = XNEW (memory_block_pool);

but as misleading as it is, XNEW doesn't invoke C++ new but
just malloc so the allocated structure isn't initialized
since its constructor isn't invoked.  Just use

    instance = new memory_block_pool;

with that I get helgrind to run (without complaining!) on your
testcase.  I also get to compile gimple-match.c with two threads
for more than one minute before crashing on some EVRP global
state (somehow I knew the passes global state would be quite a
distraction...).
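
The XNEW-versus-new difference can be reproduced in isolation.  The
sketch below is self-contained and makes two assumptions: XNEW is
redefined locally on top of plain malloc (libiberty's version casts
the result of xmalloc instead), and toy_pool is a hypothetical
stand-in, not the real memory_block_pool.  A counter records whether
the constructor ran, so the example never reads uninitialized memory.

```cpp
#include <cassert>
#include <cstdlib>

// Local stand-in for libiberty's XNEW: allocate raw storage and cast.
#define XNEW(T) ((T *) std::malloc (sizeof (T)))

// Incremented by every toy_pool constructor call.
static int constructions = 0;

// Hypothetical type whose invariants are set up by its constructor.
struct toy_pool
{
  void *freelist;
  toy_pool () : freelist (nullptr) { ++constructions; }
};

// Returns how many constructors ran when allocating through XNEW.
int count_constructions_xnew ()
{
  constructions = 0;
  toy_pool *p = XNEW (toy_pool);   // raw storage only, members are garbage
  int n = constructions;
  std::free (p);
  return n;
}

// Returns how many constructors ran when allocating through operator new.
int count_constructions_new ()
{
  constructions = 0;
  toy_pool *p = new toy_pool;      // allocates and runs the constructor
  int n = constructions;
  assert (p->freelist == nullptr); // safe: the constructor initialized it
  delete p;
  return n;
}
```

Here count_constructions_xnew () returns 0 and
count_constructions_new () returns 1.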

I hope the project will be motivation to clean up the way we
handle pass-specific global state.

Thanks again,
Richard.
