Disclaimer: I am not a compiler developer.

On Wednesday 13 December 2006 12:44, Benoît Jacob wrote:
> I'm developing a Free C++ template library (1) in which it is very important
> that certain loops get unrolled, but at the same time I can't unroll them by
> hand, because they depend on template parameters.
>
> My problem is that G++ 4.1.1 (Gentoo) doesn't unroll these loops.
>
> I have written a standalone simple program showing this problem; I attach it
> (toto.cpp) and I also paste it below. This program does a loop if UNROLL is
> not defined, and does the same thing but with the loop unrolled by hand if
> UNROLL is defined. So one would expect that with g++ -O3, the speed would be
> the same in both cases. Alas, it's not:
>
> g++ -DUNROLL -O3 toto.cpp -o toto   ---> toto runs in 0.3 seconds
> g++ -O3 toto.cpp -o toto            ---> toto runs in 1.9 seconds
>
> So what can I do? Is that a bug in g++?

C++ doesn't specify that a compiler shall unroll loops, so this cannot be
classified as a "real" bug.

# g++ -c -O3 toto.cpp -o toto.o
# g++ -DUNROLL -O3 toto.cpp -o toto_unroll.o -c
# size toto.o toto_unroll.o
   text    data     bss     dec     hex filename
    525       8       1     534     216 toto.o
    359       8       1     368     170 toto_unroll.o

How can a C++ compiler know that you are willing to trade so much text size
for performance? I usually find myself on the opposite side: I use -Os, but
gcc still eats more space in the name of speed in certain situations.

Re code: I would use memset + just a single, non-nested for() loop anyway...
You C++ people tend to overtax the compiler with optimizations. Is it really
necessary to do

	(i == j) * factor

when

	(i == j) ? factor : 0

is easier for the compiler to grok?

> If yes, any hope to see it fixed soon?
>
> Cheers,
> Benoit
>
> (1) : Eigen, see http://eigen.tuxfamily.org

"Eigen is a lightweight C++ template library for vector and matrix math,
a.k.a. linear algebra."

A template lib for vector and matrix math sounds like a performance disaster
in the making, at least to me. However, maybe you are a truly smart guy and
can do miracles.

Cheers,
--
vda
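
P.S. To make the "memset + single loop" remark concrete, here is a rough
sketch. toto.cpp is not reproduced in this message, so I am only assuming
that its loop fills an N x N matrix with factor on the diagonal and zero
elsewhere. Matrix, set_scaled_identity_nested, set_scaled_identity_flat and
check_equivalent are made-up names for illustration, not anything taken from
Eigen or from the attachment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the matrix type in toto.cpp (an assumption):
// an N x N array of doubles stored in row-major order.
template <std::size_t N>
struct Matrix {
    double data[N * N];
};

// Nested-loop version, written with the ternary form
// rather than (i == j) * factor.
template <std::size_t N>
void set_scaled_identity_nested(Matrix<N>& m, double factor) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            m.data[i * N + j] = (i == j) ? factor : 0.0;
}

// memset plus one non-nested loop over the diagonal.
template <std::size_t N>
void set_scaled_identity_flat(Matrix<N>& m, double factor) {
    // Assumes all-zero bytes represent 0.0, which holds for IEEE 754 doubles.
    std::memset(m.data, 0, sizeof m.data);
    for (std::size_t i = 0; i < N; ++i)
        m.data[i * (N + 1)] = factor;  // index of element (i, i)
}

// Sanity check: both versions must produce the same matrix.
template <std::size_t N>
void check_equivalent(double factor) {
    Matrix<N> a, b;
    set_scaled_identity_nested(a, factor);
    set_scaled_identity_flat(b, factor);
    for (std::size_t k = 0; k < N * N; ++k)
        assert(a.data[k] == b.data[k]);
}
```

Calling check_equivalent<4>(3.0) exercises both versions on a 4 x 4 matrix.
Whether the flat version is actually faster on a given gcc is something to
measure, not something this sketch proves; it only writes N elements after
the memset instead of testing i == j for all N * N of them.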