PR 36599

Jack Howarth Sun, 22 Jun 2008 17:54:08 -0700

Richard,
     There is a performance regression in the induct Polyhedron benchmark
when it is compiled by gfortran with -ffast-math -O3. It was introduced in
gcc 4.3 and isn't present in gcc 4.2.4.


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36599

According to the SUSE Polyhedron benchmark servers, the bottleneck
in the induct benchmark was eliminated between 2008-04-27 and
2008-04-28 in gcc trunk. My guess is that your changes...

r134730 | rguenth | 2008-04-27 12:27:08 -0400 (Sun, 27 Apr 2008) | 42 lines

2008-04-27  Richard Guenther  <[EMAIL PROTECTED]>

        PR tree-optimization/18754
        PR tree-optimization/34223
        * tree-pass.h (pass_complete_unrolli): Declare.
        * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Print
        loop size before and after unconditionally of UL_NO_GROWTH in effect.
        Rewrite loop into loop closed SSA form if it is not already.
        (tree_unroll_loops_completely): Re-structure to iterate over
        innermost loops with intermediate CFG cleanups.
        Unroll outermost loops only if requested or the code does not grow
        doing so.
        * tree-ssa-loop.c (gate_tree_vectorize): Don't shortcut if no
        loops are available.
        (tree_vectorize): Instead do so here.
        (tree_complete_unroll): Also unroll outermost loops.
        (tree_complete_unroll_inner): New function.
        (gate_tree_complete_unroll_inner): Likewise.
        (pass_complete_unrolli): New pass.
        * tree-ssa-loop-manip.c (find_uses_to_rename_use): Only record
        uses outside of the loop.
        (tree_duplicate_loop_to_header_edge): Only verify loop-closed SSA
        form if it is available.  
        * tree-flow.h (tree_unroll_loops_completely): Add extra parameter.
        * passes.c (init_optimization_passes): Schedule complete inner
        loop unrolling pass before the first CCP pass after final inlining.

        * gcc.dg/tree-ssa/loop-36.c: New testcase.
        * gcc.dg/tree-ssa/loop-37.c: Likewise.
        * gcc.dg/vect/vect-118.c: Likewise.
        * gcc.dg/Wunreachable-8.c: XFAIL bogus warning.
        * gcc.dg/vect/vect-66.c: Increase loop trip count.
        * gcc.dg/vect/no-section-anchors-vect-66.c: Likewise.
        * gcc.dg/vect/no-section-anchors-vect-69.c: Likewise.
        * gcc.dg/vect/vect-76.c: Likewise.
        * gcc.dg/vect/vect-outer-6.c: Likewise.
        * gcc.dg/vect/vect-outer-1.c: Likewise.
        * gcc.dg/vect/vect-outer-1a.c: Likewise.
        * gcc.dg/vect/vect-11a.c: Likewise.
        * gcc.dg/vect/vect-shift-1.c: Likewise.
        * gcc.target/i386/vectorize1.c: Likewise.


...solved this performance regression. Is it possible that these changes could
be backported to gcc 4.3.2 to eliminate the performance regression in gcc 4.3
compared to gcc 4.2?
             Jack
