On Apr 9, 2013, at 22:19, Segher Boessenkool <seg...@kernel.crashing.org> wrote:

> Some numbers, 16-core 64-thread POWER7, c,c++,fortran bootstrap:
> -j6:  real    57m32.245s
> -j60: real    38m18.583s

Yes, these confirm mine. It doesn't make sense to look at more
parallelization before we address the serial bottlenecks.
The -j6 parallelism level is about where current laptops
are. Having a big machine doesn't help as much as having fewer,
but faster, cores.
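
A rough sketch of how serial the build is, using only the two timings
quoted above. It assumes a simple Amdahl-style model, T(n) = serial +
parallel / n, which is crude here because POWER7 SMT threads are not
full cores. The function names are made up for illustration.

```python
def to_seconds(minutes, seconds):
    """Convert a 'real XmY.YYYs' timing to seconds."""
    return 60.0 * minutes + seconds


def fit_serial_parallel(t_a, jobs_a, t_b, jobs_b):
    """Solve T(n) = serial + parallel / n from two (time, jobs) measurements.

    Assumption: perfect scaling of the parallel part, no SMT or I/O effects.
    """
    parallel = (t_a - t_b) / (1.0 / jobs_a - 1.0 / jobs_b)
    serial = t_b - parallel / jobs_b
    return serial, parallel


# Timings from the quoted POWER7 bootstrap.
t_j6 = to_seconds(57, 32.245)
t_j60 = to_seconds(38, 18.583)

serial, parallel = fit_serial_parallel(t_j6, 6, t_j60, 60)

print(f"speedup -j6 -> -j60: {t_j6 / t_j60:.2f}x")
print(f"estimated serial part:   {serial / 60:.1f} min")
print(f"estimated parallel work: {parallel / 60:.1f} core-min")
```

If that model holds even approximately, roughly 36 of the 38 minutes
at -j60 are serial, so adding more jobs cannot buy much.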

We should be able to do far better. I don't know how the Power7
threads compare in terms of CPU throughput, but going from -j6 to
-j48 on our 48-core AMD system should easily yield a 6x speedup,
as all are full cores. Instead, we see limited improvements similar
to yours, even though we get almost perfect scaling in many test
suite runs that are dominated by compilations.

The two obvious issues:
  1. large sequential chains of compiling/running genattrtab followed
     by compiling insn-attrtab.c and linking the compiler
  2. repeated serial configure steps

For 1. we need to somehow split the file up into smaller chunks.
For 2. we need to have efficient caching.

Neither is easy...

  -Geert
