Dear all, === problem description ===
Mixed T - complex<T> algebra is horribly inefficient in svn-trunk. This is demonstrated by the attached testcase 'test.cpp'. *** gcc-4.4.1: $ g++ -O2 test.cpp && time echo 5000000000 | ./a.out ... user 0m8.817s $ g++ -DUSE_MY_CMULT -O2 test.cpp && time echo 5000000000 | ./a.out ... user 0m8.825s *** gcc 4.5.0 20100217 (revision 156834): $ g++ -O2 test.cpp && time echo 5000000000 | ./a.out ... user 0m28.574s # Aaaaaaargh!!! $ g++ -DUSE_MY_CMULT -O2 test.cpp && time echo 5000000000 | ./a.out ... user 0m8.817s === analysis === The libstdc++ implementation of std::complex<T>::operator*=(T val) does a '/= val' on its '__complex__ T' data representation (_GLIBCXX_USE_C99_COMPLEX is enabled on my OpenSUSE GNU/Linux box). In present gcc, 'val' is promoted to a (degenerate) complex type. Next, in 'expand_complex_multiplication' (tree-complex.c) this leads to the emission of a libcall (to __muldc3), since -fcx-limited-range is not in effect, so no reduction is allowed. Indeed, when the straightforward custom operator*= is active in the testcase above (-DUSE_MY_CMULT), we get back the performance that we wish: this version simply multiplies the scalar components of the complex number separately. Would a <complex> patch that implements scalar multiplication (and division) of complex specialisations this way be accepted? The factor 3 performance loss is unacceptable for any serious numerical work in C++. === Discussion, caveats, related issues === * such patch would not address the C99-issue of complex*scalar; but at least math-oriented C++ users will be happy. * If this change is implemented, PR28408 is a C(99)-only issue. All signs are then correct, the inf/nan issues have apparently been previously fixed. * From the C++0x working draft (26.4.6, item 7): template<class T> complex<T> operator*(const complex<T>& lhs, const complex<T>& rhs); template<class T> complex<T> operator*(const complex<T>& lhs, const T& rhs); template<class T> complex<T> operator*(const T& lhs, const complex<T>& rhs); Returns: complex<T>(lhs) *= rhs. Note the third overload: of course this should be implemented efficiently as 'return complex<T>(rhs) *= lhs', which is what libstdc++ does --- fortunately. This is however *not* the same as the return value which is specified by the standard (think signed zeros). Is this a libstdc++ bug, or an editorial oversight in N3000=09-0190. In the last case, is someone on this list in the position to raise this issue? -- Summary: mixed complex<T> multiplication horribly inefficient Product: gcc Version: 4.5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: jan at epgmod dot phys dot tue dot nl GCC host triplet: x86_64-unknown-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43108