Re: [PATCH] Fix unroll-and-jam (PR tree-optimization/87360)

Richard Biener Wed, 05 Dec 2018 01:28:34 -0800

On Wed, 5 Dec 2018, Jakub Jelinek wrote:

> Hi!
> 
> The following testcases ICE, because tree_loop_unroll_and_jam optimizes one
> loop and on another one after it fails to analyze data dependencies and
> returns.  The end effect of that is that neither the code at the end of the
> function to do final cleanups, nor TODO_cleanup_cfg is done, and the latter
> means that we don't reset some number of initialization stuff that keeps
> referencing already removed stmts.
> 
> The following patch fixes that by replacing return false; with continue;,
> so that it will just ignore the loop which couldn't be optimized and will
> try to optimize other further loops and at the end if at least one loop is
> optimized, return TODO_cleanup_cfg.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.

Thanks,
Richard.

> 2018-12-04  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/87360
>       * gimple-loop-jam.c (tree_loop_unroll_and_jam): On failure to analyze
>       data dependencies, don't return false, just continue.  Formatting
>       fixes.
>       (merge_loop_tree, bb_prevents_fusion_p, unroll_jam_possible_p,
>       fuse_loops): Formatting fixes.
> 
>       * g++.dg/opt/pr87360.C: New test.
>       * gfortran.dg/pr87360.f90: New test.
> 
> --- gcc/gimple-loop-jam.c.jj  2018-09-04 19:48:19.238613844 +0200
> +++ gcc/gimple-loop-jam.c     2018-12-04 18:53:45.983850575 +0100
> @@ -118,7 +118,7 @@ merge_loop_tree (struct loop *loop, stru
>    for (i = 0; i < n; i++)
>      {
>        /* If the block was direct child of OLD loop it's now part
> -         of LOOP.  If it was outside OLD, then it moved into LOOP
> +      of LOOP.  If it was outside OLD, then it moved into LOOP
>        as well.  This avoids changing the loop father for BBs
>        in inner loops of OLD.  */
>        if (bbs[i]->loop_father == old
> @@ -167,7 +167,7 @@ bb_prevents_fusion_p (basic_block bb)
>         * stores or unknown side-effects prevent fusion
>         * loads don't
>         * computations into SSA names: these aren't problematic.  Their
> -         result will be unused on the exit edges of the first N-1 copies
> +      result will be unused on the exit edges of the first N-1 copies
>        (those aren't taken after unrolling).  If they are used on the
>        other edge (the one leading to the outer latch block) they are
>        loop-carried (on the outer loop) and the Nth copy of BB will
> @@ -282,12 +282,12 @@ unroll_jam_possible_p (struct loop *oute
>        if (!simple_iv (loop, loop, op, &iv, true))
>       return false;
>        /* The inductions must be regular, loop invariant step and initial
> -         value.  */
> +      value.  */
>        if (!expr_invariant_in_loop_p (outer, iv.step)
>         || !expr_invariant_in_loop_p (outer, iv.base))
>       return false;
>        /* XXX With more effort we could also be able to deal with inductions
> -         where the initial value is loop variant but a simple IV in the
> +      where the initial value is loop variant but a simple IV in the
>        outer loop.  The initial value for the second body would be
>        the original initial value plus iv.base.step.  The next value
>        for the fused loop would be the original next value of the first
> @@ -322,7 +322,7 @@ fuse_loops (struct loop *loop)
>        gcc_assert (EDGE_COUNT (next->header->preds) == 1);
>  
>        /* The PHI nodes of the second body (single-argument now)
> -         need adjustments to use the right values: either directly
> +      need adjustments to use the right values: either directly
>        the value of the corresponding PHI in the first copy or
>        the one leaving the first body which unrolling did for us.
>  
> @@ -449,13 +449,13 @@ tree_loop_unroll_and_jam (void)
>        dependences.create (10);
>        datarefs.create (10);
>        if (!compute_data_dependences_for_loop (outer, true, &loop_nest,
> -                                            &datarefs, &dependences))
> +                                           &datarefs, &dependences))
>       {
>         if (dump_file && (dump_flags & TDF_DETAILS))
>           fprintf (dump_file, "Cannot analyze data dependencies\n");
>         free_data_refs (datarefs);
>         free_dependence_relations (dependences);
> -       return false;
> +       continue;
>       }
>        if (!datarefs.length ())
>       continue;
> @@ -490,7 +490,7 @@ tree_loop_unroll_and_jam (void)
>                                    &removed))
>           {
>             /* Couldn't get the distance vector.  For two reads that's
> -              harmless (we assume we should unroll).  For at least
> +              harmless (we assume we should unroll).  For at least
>                one write this means we can't check the dependence direction
>                and hence can't determine safety.  */
>  
> @@ -503,7 +503,7 @@ tree_loop_unroll_and_jam (void)
>       }
>  
>        /* We regard a user-specified minimum percentage of zero as a request
> -         to ignore all profitability concerns and apply the transformation
> +      to ignore all profitability concerns and apply the transformation
>        always.  */
>        if (!PARAM_VALUE (PARAM_UNROLL_JAM_MIN_PERCENT))
>       profit_unroll = 2;
> --- gcc/testsuite/g++.dg/opt/pr87360.C.jj     2018-12-04 18:50:57.236590929 
> +0100
> +++ gcc/testsuite/g++.dg/opt/pr87360.C        2018-12-04 18:50:20.236191816 
> +0100
> @@ -0,0 +1,27 @@
> +// PR tree-optimization/87360
> +// { dg-do compile { target size32plus } }
> +// { dg-options "-O3 -fno-tree-dce --param unroll-jam-min-percent=0" }
> +
> +void abort (void);
> +
> +void foo (int N)
> +{
> +  int i, j;
> +  int x[1000][1000];
> +
> +  for (i = 0; i < N; i++)
> +    for (j = 0; j < N; j++)
> +      x[i][j] = i + j + 3;
> +
> +  for (i = 0; i < N; i++)
> +    for (j = 0; j < N; j++)
> +      if (x[i][j] != i + j + 3)
> +     abort ();
> +}
> +
> +int main(void)
> +{
> +  foo (1000);
> +
> +  return 0;
> +}
> --- gcc/testsuite/gfortran.dg/pr87360.f90.jj  2018-12-04 18:52:34.321014295 
> +0100
> +++ gcc/testsuite/gfortran.dg/pr87360.f90     2018-12-04 18:52:16.700300433 
> +0100
> @@ -0,0 +1,5 @@
> +! PR tree-optimization/87360
> +! { dg-do compile }
> +! { dg-options "-fno-tree-dce -O3 --param max-completely-peeled-insns=0" }
> +
> +include 'function_optimize_2.f90'
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix unroll-and-jam (PR tree-optimization/87360)

Reply via email to