> Do you have specific testcase? It would be interesting to see if new
> optimizer can catch up at least on kill-loop branch.

Here is a simplified version of what I observed. In the non-FDO case,
the loop invariant load of the constant 32 is removed from the loop.
When FDO is enabled, the load remains in the loop.

    float farray[100];

    int main (int argc, char *argv[])
    {
      int m;

      for( m = 0; m < 100; m++ )
        {
          farray[m] = 32;
        }
    }

I'm compiling it as follows using a version of gcc built from mainline
yesterday.

Non-FDO:

    gcc -O3 -funroll-loops -fpeel-loops -o test test.c

FDO:

    gcc -O3 -funroll-loops -fpeel-loops -fprofile-generate -o test test.c
    ./test
    gcc -O3 -funroll-loops -fpeel-loops -fprofile-use -o test test.c

Pete
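
For reference, the difference described above can be pictured at the source level roughly as follows. This is only an illustrative sketch: the real transformation happens inside the compiler, not on the C source, and the names `fill_in_loop`, `fill_hoisted`, `fills_agree`, and the two arrays are hypothetical helpers that are not part of the testcase. The first function mirrors the testcase loop, where the constant is conceptually produced on every iteration; the second shows the same loop with the constant hoisted by hand, which is what the non-FDO build is reported to achieve.

```c
#include <assert.h>

#define N 100

/* Hypothetical arrays for the illustration; the testcase uses farray. */
static float in_loop[N];
static float hoisted[N];

/* Mirrors the testcase: the constant 32 appears inside the loop body,
   so a naive translation would produce it on every iteration. */
static void fill_in_loop(void)
{
    for (int m = 0; m < N; m++)
        in_loop[m] = 32;
}

/* Hand-hoisted form: the loop-invariant value is set up once before
   the loop, and the body only performs the store. */
static void fill_hoisted(void)
{
    const float c = 32.0f;

    for (int m = 0; m < N; m++)
        hoisted[m] = c;
}

/* Runs both variants and returns 1 if every element matches, else 0. */
static int fills_agree(void)
{
    fill_in_loop();
    fill_hoisted();

    for (int m = 0; m < N; m++)
        if (in_loop[m] != hoisted[m])
            return 0;
    return 1;
}
```

Both forms store the same values, so the only possible difference is in the generated code for the loop body. Comparing the assembly produced with `-S` for the non-FDO and FDO command lines above is one way to check whether the load stays inside the loop.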