Benoît Jacob <[EMAIL PROTECTED]> writes:

> I'm developing a Free C++ template library (1) in which it is very important
> that certain loops get unrolled, but at the same time I can't unroll them by
> hand, because they depend on template parameters.
>
> My problem is that G++ 4.1.1 (Gentoo) doesn't unroll these loops.
>
> I have written a standalone simple program showing this problem; I attach it
> (toto.cpp) and I also paste it below. This program does a loop if UNROLL is
> not defined, and does the same thing but with the loop unrolled by hand if
> UNROLL is defined. So one would expect that with g++ -O3, the speed would be
> the same in both cases. Alas, it's not:

When I try it, gcc does unroll the loops. It completely unrolls the
inner loop, but only partially unrolls the outer loop. The reason it
doesn't completely unroll the outer loop is simply that gcc doesn't
attempt to completely unroll loops which contain inner loops.

This could probably be fixed: we could probably completely unroll a
loop if all its inner loops were completely unrolled. I encourage you
to file a bug report. See http://gcc.gnu.org/bugs.html.

Ian
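
To make the loop shape concrete, here is a minimal sketch. It is not the
poster's toto.cpp, which is not reproduced here; the names sum_loop,
sum_unrolled, RowSum and MatSum are hypothetical. sum_loop has the nested
form discussed above: an inner loop with a constant trip count inside an
outer loop. sum_unrolled shows one possible workaround, assuming the trip
counts are template parameters: template recursion writes the unrolling
into the source, so it no longer depends on the loop unroller (though
whether the resulting calls are inlined is still up to the optimizer).

```cpp
// Nested loops whose trip counts depend on a template parameter N.
// The outer loop contains an inner loop, which is the case described above.
template <int N>
int sum_loop(const int (&m)[N][N]) {
    int s = 0;
    for (int i = 0; i < N; ++i)       // outer loop
        for (int j = 0; j < N; ++j)   // inner loop, constant trip count
            s += m[i][j];
    return s;
}

// Workaround sketch: unroll by template recursion instead of a loop.
// RowSum<I, N> adds row[I] and recurses on I + 1 until I == N.
template <int I, int N>
struct RowSum {
    static int run(const int (&row)[N]) {
        return row[I] + RowSum<I + 1, N>::run(row);
    }
};

// Recursion terminator: nothing left to add in this row.
template <int N>
struct RowSum<N, N> {
    static int run(const int (&)[N]) { return 0; }
};

// MatSum<I, N> adds the sum of row I and recurses on I + 1 until I == N.
template <int I, int N>
struct MatSum {
    static int run(const int (&m)[N][N]) {
        return RowSum<0, N>::run(m[I]) + MatSum<I + 1, N>::run(m);
    }
};

// Recursion terminator: no rows left.
template <int N>
struct MatSum<N, N> {
    static int run(const int (&)[N][N]) { return 0; }
};

// Same result as sum_loop, with no run-time loop in the source.
template <int N>
int sum_unrolled(const int (&m)[N][N]) {
    return MatSum<0, N>::run(m);
}
```

Both functions deduce N from the array argument, so a 3x3 array can be
passed to either one directly. Comparing the assembly from g++ -O3 -S for
the two versions is one way to see what the compiler actually did.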