http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49089
Summary: Regression on CFP2006 on Bulldozer From Splitting AVX 32-byte Unaligned Loads
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: [email protected]
ReportedBy: [email protected]
The regression is caused by the following patch, which splits AVX 32-byte
unaligned loads and stores:
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01839.html
Here is the performance impact on a Bulldozer system (percent change;
positive values are improvements):
                 store-split   load-split
410.bwaves              0.48        -0.48
416.gamess              0.55         0.00
433.milc                1.76        -3.96
434.zeusmp              3.48        -3.48
435.gromacs             0.51         1.54
436.cactusADM          -0.72        -0.72
437.leslie3d           10.33        -0.94
444.namd                1.03         0.00
447.dealII              0.70        -1.41
450.soplex              0.79         0.40
453.povray             -0.50        -0.50
454.calculix            5.07        -1.84
459.GemsFDTD            4.33        -6.25
465.tonto               1.27         0.00
470.lbm                -0.86         1.44
481.wrf                 1.35        -3.59
482.sphinx3             0.00        -2.11
geomean                 1.71        -1.31
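
The geomean row appears consistent with reading each entry as a percent
change and taking the geometric mean of the ratios 1 + p/100. That reading
is an assumption; the report does not say how the row was computed. A
minimal Python sketch for rechecking it (the names RESULTS and
geomean_percent are made up for illustration):

```python
import math

# Percent changes copied from the table: (store-split, load-split).
RESULTS = {
    "410.bwaves": (0.48, -0.48),
    "416.gamess": (0.55, 0.00),
    "433.milc": (1.76, -3.96),
    "434.zeusmp": (3.48, -3.48),
    "435.gromacs": (0.51, 1.54),
    "436.cactusADM": (-0.72, -0.72),
    "437.leslie3d": (10.33, -0.94),
    "444.namd": (1.03, 0.00),
    "447.dealII": (0.70, -1.41),
    "450.soplex": (0.79, 0.40),
    "453.povray": (-0.50, -0.50),
    "454.calculix": (5.07, -1.84),
    "459.GemsFDTD": (4.33, -6.25),
    "465.tonto": (1.27, 0.00),
    "470.lbm": (-0.86, 1.44),
    "481.wrf": (1.35, -3.59),
    "482.sphinx3": (0.00, -2.11),
}


def geomean_percent(changes):
    """Geometric mean of percent changes, taken over the ratios 1 + p/100.

    Assumption: each table entry is a percent change relative to a baseline.
    """
    changes = list(changes)
    log_sum = sum(math.log1p(p / 100.0) for p in changes)
    return math.expm1(log_sum / len(changes)) * 100.0


store_geomean = geomean_percent(v[0] for v in RESULTS.values())
load_geomean = geomean_percent(v[1] for v in RESULTS.values())

print(f"store-split geomean: {store_geomean:.2f}")
print(f"load-split geomean:  {load_geomean:.2f}")
```

Under that assumption, both results should land close to the geomean row
in the table.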
While splitting unaligned stores helps, Bulldozer does not seem to benefit
from splitting unaligned loads.