On Wednesday 13 December 2006 23:09, Denis Vlasenko wrote:
> C++ doesn't specify that compiler shall unroll loops, so it cannot be
> classified as "real" bug.

OK, but then, even if I explicitly ask gcc to unroll loops with -funroll-loops, it still doesn't unroll them completely and is still as slow. See bug report here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30201

> Re code: I would use memset + just a single

No, in this example the numbers are double, but in my template library the type is a "typename T" and I can make no assumption as to the bit representation of static_cast<T>(0).

> loop anyway... you C++ people tend to overtax compiler with
> optimizations. Is it really necessary to do (i == j) * factor
> when (i == j) ? factor : 0 is easier for compiler to grok?

Of course I tried it. It's even slower. Doesn't help the compiler unroll the loop, and now there's a branch at each iteration.

> Template lib for vector and matrix math sounds like a performance
> disaster in the making, at least for me. However, maybe you are
> truly smart guy and can do miracles.

I don't understand why you say that. At the language specification level, templates come with no inherent speed overhead. All of the template stuff is unfolded at compile time, none of it remains visible in the binary, so it shouldn't make the binary slower.

Benoit
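
For readers without the bug report at hand, here is a minimal sketch of the loop variants discussed above. The `Matrix` struct and the function names are hypothetical, made up for illustration; they are not taken from the bug report or from the template library. The sketch only shows the three ways of writing the assignment and makes no claim about which of them gcc unrolls or how fast each one is.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size square matrix, stored row-major (illustration only).
template <typename T, std::size_t N>
struct Matrix {
    T data[N * N];
};

// Variant 1: multiply by the comparison result, (i == j) * factor.
// Assumes T can be constructed from bool, as double can.
template <typename T, std::size_t N>
void set_scaled_identity_mul(Matrix<T, N>& m, T factor) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            m.data[i * N + j] = static_cast<T>(i == j) * factor;
}

// Variant 2: the ternary form, (i == j) ? factor : 0.
template <typename T, std::size_t N>
void set_scaled_identity_select(Matrix<T, N>& m, T factor) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            m.data[i * N + j] = (i == j) ? factor : static_cast<T>(0);
}

// Variant 3: a generic stand-in for "memset + a single loop".
// std::fill assigns static_cast<T>(0) element by element, so it does not
// rely on T's zero having an all-zero bit pattern the way memset would.
template <typename T, std::size_t N>
void set_scaled_identity_fill(Matrix<T, N>& m, T factor) {
    std::fill(m.data, m.data + N * N, static_cast<T>(0));
    for (std::size_t i = 0; i < N; ++i)
        m.data[i * N + i] = factor;
}

// Small usage example: all three variants produce the same 3x3 matrix.
inline void check_variants_agree() {
    Matrix<double, 3> a, b, c;
    set_scaled_identity_mul(a, 2.5);
    set_scaled_identity_select(b, 2.5);
    set_scaled_identity_fill(c, 2.5);
    for (std::size_t k = 0; k < 9; ++k) {
        assert(a.data[k] == b.data[k]);
        assert(a.data[k] == c.data[k]);
    }
}
```

Since N is a template parameter, the trip counts are known at compile time in all three variants; whether the compiler then unrolls them fully is exactly the question raised in the bug report.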