[Bug rtl-optimization/33928] [4.3/4.4/4.5/4.6/4.7 Regression] 30% performance slowdown in floating-point code caused by r118475

lucier at math dot purdue.edu Sat, 02 Apr 2011 09:58:51 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33928


--- Comment #121 from lucier at math dot purdue.edu 2011-04-02 16:58:16 UTC ---
I'm inclined to close this as "Fixed" for 4.6.0.

I've taken the file mentioned in the previous comment and followed the
instructions in the readme.  The times for a forward FFT of 2^{25} complex
doubles on a 2.4 GHz Intel Core i5 on x86_64-apple-darwin10.7.0 are as follows:

With the usual compiler options of

-O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing
-fwrapv -fomit-frame-pointer -fPIC -fno-common -mieee-fp

4.5.2:

    2433 ms cpu time (2427 user, 6 system)

4.6.0:

    2158 ms cpu time (2154 user, 4 system)

Adding -fschedule-insns -march=native to the above:

4.5.2:

    2067 ms cpu time (2060 user, 7 system)

4.6.0:

    2016 ms cpu time (2012 user, 4 system)

The assembly for the main loop looks much better.
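
For reference, the relative improvement implied by the CPU times above can be
worked out with a few lines of Python.  This is only a sketch: the helper name
percent_faster is hypothetical, and the inputs are the total CPU times quoted
above, not new measurements.

```python
def percent_faster(old_ms: float, new_ms: float) -> float:
    """Return the reduction in run time from old_ms to new_ms, in percent."""
    return (old_ms - new_ms) / old_ms * 100.0


if __name__ == "__main__":
    # Total CPU times (ms) quoted above: (4.5.2, 4.6.0) for each option set.
    timings = {
        "usual options": (2433, 2158),
        "plus -fschedule-insns -march=native": (2067, 2016),
    }
    for label, (old_ms, new_ms) in timings.items():
        print(f"{label}: 4.6.0 is {percent_faster(old_ms, new_ms):.1f}% faster")
    # usual options: 4.6.0 is 11.3% faster
    # plus -fschedule-insns -march=native: 4.6.0 is 2.5% faster
```

So 4.6.0 is roughly 11% faster than 4.5.2 with the usual options, and about
2.5% faster once -fschedule-insns -march=native are added.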

[Bug rtl-optimization/33928] [4.3/4.4/4.5/4.6/4.7 Regression] 30% performance slowdown in floating-point code caused by r118475
