http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43247
--- Comment #10 from Thiago Macieira <thiago at kde dot org> 2010-12-22 10:35:23 UTC ---
This is still not fixed. I can reproduce now with a different testcase, in
4.5.1. However, this time, the same code works fine in 4.4.

The reason is again accessing an array out-of-bounds for elements that we know
to be there. Pay attention to the way operator== is implemented in the
following code.

If I compile it with -O1, it prints "true" as it should. If I compile it with
-O2, it prints "false".

If I compile it with -O1 -finline-small-functions -finline -findirect-inlining
-fstrict-overflow and compare the disassembly with -O2 and a suitable list of
-fno-*, the code is exactly identical, except for some instructions that should
perform the copy of half of m1's data into m3. So in the end the comparison
fails due to comparing to garbage.

=== code ===
#include <stdio.h>

template <int N, int M, typename T>
class QGenericMatrix
{
public:
    QGenericMatrix();
    QGenericMatrix(const QGenericMatrix<N, M, T>& other);
    explicit QGenericMatrix(const T *values);

    bool operator==(const QGenericMatrix<N, M, T>& other) const;

private:
    T m[N][M];    // Column-major order to match OpenGL.

    QGenericMatrix(int) {}    // Construct without initializing identity matrix
};

template <int N, int M, typename T>
QGenericMatrix<N, M, T>::QGenericMatrix(const QGenericMatrix<N, M, T>& other)
{
    for (int col = 0; col < N; ++col)
        for (int row = 0; row < M; ++row)
            m[col][row] = other.m[col][row];
}

template <int N, int M, typename T>
QGenericMatrix<N, M, T>::QGenericMatrix(const T *values)
{
    for (int col = 0; col < N; ++col)
        for (int row = 0; row < M; ++row)
            m[col][row] = values[row * N + col];
}

template <int N, int M, typename T>
bool QGenericMatrix<N, M, T>::operator==(const QGenericMatrix<N, M, T>& other) const
{
    for (int index = 0; index < N * M; ++index) {
        if (m[0][index] != other.m[0][index])
            return false;
    }
    return true;
}

typedef double qreal;
typedef QGenericMatrix<2, 2, qreal> QMatrix2x2;

int main(int , char**)
{
    qreal m1Data[] = {0.0, 0.0, 0.0, 0.0};
    QMatrix2x2 m1(m1Data);
    QMatrix2x2 m3 = m1;
    puts((m1 == m3) ? "true" : "false");
}
=== code ===

common args:
  -fno-exceptions -fno-rtti -fverbose-asm -march=core2 -mfpmath=sse
  (though x87 math also shows the same problem)

prints "true" with:
  -O1 -finline-small-functions -finline -findirect-inlining -fstrict-overflow

prints "false" with:
  -O2 -fno-align-functions -fno-align-jumps -fno-align-labels
  -fno-caller-saves -fno-tree-switch-conversion -fno-tree-vrp
  -fno-crossjumping -fno-cse-follow-jumps -fno-expensive-optimizations
  -fno-gcse -fno-ipa-cp -fno-ipa-sra -fno-optimize-register-move
  -fno-optimize-sibling-calls -fno-peephole2 -fno-regmove
  -fno-reorder-blocks -fno-reorder-functions -fno-rerun-cse-after-loop
  -fno-schedule-insns2 -fno-strict-aliasing -fno-strict-aliasing
  -fno-thread-jumps -fno-tree-builtin-call-dce -fno-tree-pre
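
For comparison, the same check can be written so that neither subscript leaves its declared dimension. In the testcase, m[0][index] runs index up to N * M - 1, which is past the M elements of m[0] even though the storage behind it exists. The sketch below is not part of the original testcase: Matrix, equalNested and equalFlat are hypothetical names chosen for illustration. It only shows the in-bounds form of the access pattern and makes no claim about how GCC treats either form.

```cpp
#include <cassert>

// Hypothetical stand-in for the report's QGenericMatrix. The names Matrix,
// equalNested and equalFlat are illustrative and do not appear in the testcase.
template <int N, int M, typename T>
struct Matrix {
    T m[N][M];
};

// Compare element by element, keeping each subscript inside its own dimension.
template <int N, int M, typename T>
bool equalNested(const Matrix<N, M, T>& a, const Matrix<N, M, T>& b)
{
    for (int col = 0; col < N; ++col)
        for (int row = 0; row < M; ++row)
            if (a.m[col][row] != b.m[col][row])
                return false;
    return true;
}

// Same single-loop shape as the report's operator==, but the flat index is
// split into (col, row) so that neither subscript exceeds its bound.
template <int N, int M, typename T>
bool equalFlat(const Matrix<N, M, T>& a, const Matrix<N, M, T>& b)
{
    for (int index = 0; index < N * M; ++index) {
        const int col = index / M;
        const int row = index % M;
        assert(col < N && row < M);
        if (a.m[col][row] != b.m[col][row])
            return false;
    }
    return true;
}
```

Either function could stand in for the body of operator== in the testcase. Checking whether the -O1 and -O2 results still differ with such a variant would help separate the effect of the out-of-bounds subscript from the missing copy into m3.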