Hello,

> >A more likely source of performance degradation is that loop unrolling
> >is enabled when profiling, and loop unrolling is almost always a bad
> >pessimization on 32 bits x86 targets.
>
> To clarify, I was compiling with -funroll-loops and -fpeel-loops
> enabled in both cases.
>
> The FDO slowdown in my case was caused by the presence of some loop
> invariant code that was getting removed from the loop by the loop
> optimizer pass in the non-FDO case.

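For illustration only, here is a minimal sketch of the kind of loop invariant code being discussed. It is a hypothetical example, not the code from the original report: the function names, the arguments and the computation are all assumptions made up for this message. In sum_naive the product scale * offset does not depend on the loop counter, so an invariant motion pass is free to compute it once before the loop, which is what sum_hoisted does by hand.

```c
#include <stddef.h>

/* Hypothetical example: scale * offset is loop invariant because it does
 * not depend on i.  Unless the compiler hoists it, the multiplication is
 * repeated on every iteration. */
long sum_naive(const int *a, size_t n, int scale, int offset)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        int k = scale * offset;      /* invariant: same value every time */
        total += (long)a[i] * k;
    }
    return total;
}

/* The same computation with the invariant moved out of the loop by hand.
 * This is the shape an invariant motion pass would be expected to
 * produce from sum_naive. */
long sum_hoisted(const int *a, size_t n, int scale, int offset)
{
    long total = 0;
    int k = scale * offset;          /* computed once, before the loop */
    for (size_t i = 0; i < n; i++)
        total += (long)a[i] * k;
    return total;
}
```

Both functions return the same result; only the amount of work done inside the loop differs. One way to check whether the hoisting happened in a given build is to compare the assembly output (gcc -S) of sum_naive with and without the profile data.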
you may try adding the -fmove-loop-invariants flag, which enables the
new invariant motion pass.

Zdenek