http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49365
--- Comment #5 from Changpeng Fang <changpeng.fang at amd dot com> 2011-06-14 22:22:11 UTC ---
It seems there is a prefetch generation bug on Bulldozer. With -O3 -ffast-math -funroll-loops -fpeel-loops -march=bdver1 -fprefetch-loop-arrays, I got a normal timing of 795s. However, when "--param min-insn-to-prefetch-ratio=9" is added, the timing becomes 2853s. This may be a different bug, in the opposite direction to the one seen on amdfam10.

I also want to mention here that software prefetching was actually enabled at -O3 and higher for Bulldozer when Honza cleaned up the code in i386.c:
http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00573.html
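
For anyone who wants to look at the generated prefetches outside the full benchmark, a small streaming loop like the one below may be a convenient starting point. This is only a sketch: the function name scale_add and the kernel itself are hypothetical and are not taken from the benchmark timed above, and it is not known whether a loop this small triggers the same slowdown.

```c
#include <stddef.h>

/* Hypothetical test kernel (not from the benchmark in this report):
   a simple streaming loop over two arrays, the kind of access pattern
   that -fprefetch-loop-arrays considers for software prefetching. */
void scale_add(double *dst, const double *src, double k, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] += k * src[i];
}
```

Compiling this to assembly with -S, once with the flags quoted above and once with "--param min-insn-to-prefetch-ratio=9" added, and then comparing the prefetch instructions in the two outputs should show whether the parameter changes what is emitted for the loop.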