------- Comment #2 from rguenth at gcc dot gnu dot org 2007-10-01 09:46 -------

With a more recent 4.3 version I get

        4.2     4.3
-O2     6.1s    5.1s
-O3     4.1s    5.1s

Both -O3 variants have one function not inlined in the main loop.  For 4.2 it is

  EiEval<EiMatrixProduct<EiMatrix<double, 3, 3>, EiMatrix<double, 3, 3> > > operator*<double, EiMatrix<double, 3, 3>, EiMatrix<double, 3, 3> >(EiObject<double, EiMatrix<double, 3, 3> > const&, EiObject<double, EiMatrix<double, 3, 3> > const&)

for 4.3 it is

  EiMatrix<double, 3, 3>& EiObject<double, EiMatrix<double, 3, 3> >::operator=<EiSum<EiMatrix<double, 3, 3>, EiScalarProduct<EiSum<EiMatrix<double, 3, 3>, EiMatrix<double, 3, 3> > > > >(EiObject<double, EiSum<EiMatrix<double, 3, 3>, EiScalarProduct<EiSum<EiMatrix<double, 3, 3>, EiMatrix<double, 3, 3> > > > > const&)

but even complete inlining does not improve the numbers much.  There's also nothing obvious in the asm - we simply have too few registers and load from/store to memory very often.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604