------- Comment #49 from lucier at math dot purdue dot edu 2009-04-23 15:58 -------
With 4.4.0 and with mainline this code now runs in 280 ms, compared with 156 ms
with 4.2.4.
Since 280/156 = 1.794871794871795, a 79% slowdown, I changed the subject line
(the slowdown is now only partially caused by r118475).
I guess I'll post the assembly code generated by 4.4.0 in the next attachment.
Timings (best of three runs) for the last
(time (direct-fft-recursive-4 a table))
from
gsi/gsi -e '(define a (time (expt 3 10000000)))(define b (time (* a a)))'
With gcc-4.1.2:
188 ms cpu time (188 user, 0 system)
With gcc-4.2.4:
156 ms cpu time (152 user, 4 system)
With gcc-4.3.3:
180 ms cpu time (180 user, 0 system)
With gcc-4.4.0:
280 ms cpu time (280 user, 0 system)
With 4.5.0 20090423 (experimental) [trunk revision 146634]:
280 ms cpu time (280 user, 0 system)
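For reference, the relative slowdowns implied by the timings above can be recomputed with a short script. This is only a sketch: the numbers are the cpu times listed above, gcc-4.2.4 is taken as the baseline because it is the fastest, and the names timings_ms and slowdown_percent are made up for illustration.

```python
# cpu times in ms, copied from the timings listed above.
# The dictionary and function names are illustrative, not part of any tool.
timings_ms = {
    "gcc-4.1.2": 188,
    "gcc-4.2.4": 156,
    "gcc-4.3.3": 180,
    "gcc-4.4.0": 280,
    "trunk r146634": 280,
}

# Assumption: gcc-4.2.4 (the fastest result) is the baseline.
baseline = timings_ms["gcc-4.2.4"]


def slowdown_percent(ms, base=baseline):
    """Percentage increase in run time relative to the baseline."""
    return 100.0 * (ms - base) / base


for compiler, ms in timings_ms.items():
    print(f"{compiler:>14}: {ms} ms, {slowdown_percent(ms):+.1f}% vs gcc-4.2.4")
```

With these inputs the gcc-4.4.0 and trunk entries come out at roughly 79%, which is where the figure in the new subject line comes from.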
--
lucier at math dot purdue dot edu changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.3/4.4/4.5 Regression] 30%|[4.3/4.4/4.5 Regression] 79%
                   |performance slowdown in     |performance slowdown in
                   |floating-point code caused  |floating-point code
                   |by r118475                  |partially caused by r118475
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33928