On Apr 9, 2013, at 22:19, Segher Boessenkool <seg...@kernel.crashing.org> wrote:
> Some numbers, 16-core 64-thread POWER7, c,c++,fortran bootstrap:
> -j6:  real 57m32.245s
> -j60: real 38m18.583s

Yes, these confirm mine. It doesn't make sense to look at more
parallelization before we address the serial bottlenecks.
The -j6 parallelism level is about where current laptops are.
Having a big machine doesn't do as much as having fewer, but faster cores.

We should be able to do far better. I don't know how the Power7
threads compare in terms of CPU throughput, but going from -j6 to
-j48 on our 48-core AMD system should easily yield a 6x speedup,
as all are full cores. Instead we get limited improvements similar
to yours, even though we get almost perfect scaling in many test
suite runs that are dominated by compilations.

The two obvious issues:
  1. large sequential chains of compiling/running genattrtab,
     followed by compiling insn-attrtab.c and linking the compiler
  2. repeated serial configure steps

For 1. we need to somehow split the file up into smaller chunks.
For 2. we need to have efficient caching.

Neither is easy...

  -Geert
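As a rough illustration of the serial-bottleneck point, the two quoted POWER7 timings can be fitted to a simple Amdahl's-law model, T(n) = serial + parallel / n. This is only a sketch under strong assumptions: it treats -jN as N-way perfect parallelism on the parallel part, ignores the difference between SMT threads and full cores, and uses just two data points. The function names are made up for the example.

```python
def to_seconds(minutes, seconds):
    """Convert the 'real XmY.YYYs' format from time(1) into seconds."""
    return minutes * 60 + seconds


def fit_amdahl(n1, t1, n2, t2):
    """Solve T(n) = serial + parallel / n from two (jobs, time) measurements.

    Assumption: the parallel part scales perfectly with the -j level.
    """
    parallel = (t1 - t2) / (1.0 / n1 - 1.0 / n2)
    serial = t1 - parallel / n1
    return serial, parallel


# Timings quoted above: -j6 and -j60 on the 16-core 64-thread POWER7.
t6 = to_seconds(57, 32.245)
t60 = to_seconds(38, 18.583)

serial, parallel = fit_amdahl(6, t6, 60, t60)

print("estimated serial part:   %.1f min" % (serial / 60))
print("estimated parallel part: %.1f min of single-job work" % (parallel / 60))
print("-j6 to -j60 speedup:     %.2fx" % (t6 / t60))
```

Under this model the serial part comes out at roughly 36 minutes, which would be a floor on the bootstrap time no matter how high the -j level goes. That is consistent with the argument above that the sequential chains need fixing first.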