[component might be wrong] The appended test case is significantly faster (~5%) with -Os -funroll-all-loops than with -O2 -funroll-all-loops in gcc 4.4 (gcc version 4.4.0 20080829, i.e. shortly after the IRA merge) on a Core2 (Merom).
In earlier gcc versions the two are about the same performance. The -Os result is an improvement over all earlier versions (good!), but -O2 should get it too. I tried -fno-tree-pre, as was suggested, and it made no difference.

-- 
           Summary: -Os significantly faster than -O2 on test case
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: andi-gcc at firstfloor dot org
  GCC host triplet: x86_64-linux
GCC target triplet: x86-64-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37312