>A more likely source of performance degradation is that loop unrolling
>is enabled when profiling, and loop unrolling is almost always a bad
>pessimization on 32 bits x86 targets.

To clarify, I was compiling with -funroll-loops and -fpeel-loops enabled
in both cases, so unrolling alone does not explain the difference. The FDO
slowdown in my case was caused by some loop-invariant code: in the non-FDO
build the loop optimizer pass hoisted it out of the loop, while in the FDO
build it stayed inside the loop.

I'm running on powerpc-linux.

Pete
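
For readers unfamiliar with the term, here is a minimal sketch in C of what hoisting loop-invariant code means. It is not the code from the benchmark discussed above; the function names and the `scale * offset` expression are hypothetical and chosen only for illustration. The first function recomputes a value that never changes inside the loop, and the second shows the form the loop optimizer is expected to produce when it hoists that computation.

```c
#include <stddef.h>

/* Hypothetical example: scale * offset has the same value on every
   iteration, so it is loop invariant. */
long sum_scaled(const int *a, size_t n, int scale, int offset)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        int k = scale * offset;      /* invariant, but computed inside the loop */
        total += (long)a[i] * k;
    }
    return total;
}

/* Hand-hoisted equivalent: the invariant is computed once, before the loop.
   This is roughly what loop-invariant code motion does automatically. */
long sum_scaled_hoisted(const int *a, size_t n, int scale, int offset)
{
    int k = scale * offset;          /* computed once */
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (long)a[i] * k;
    return total;
}
```

Both functions return the same result; only the amount of work done per iteration differs. Comparing the generated assembly of the FDO and non-FDO builds is one way to check whether the invariant was actually moved out of the loop.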