It would be helpful to have the patch causing the issue to look at the IL.
But as Micha said, there needs to be a perfect loop nest for interchange
to work.

Richard.

Absolutely!  I'm attaching the reduced testcase, as well as the patch.

The problematic thread shows up in the thread2 dump:

Checking profitability of path (backwards): bb:3 (4 insns) bb:9 (0 insns) bb:5
  Control statement insns: 2
  Overall: 2 insns
Registering FSM jump thread: (5, 9) incoming edge; (9, 3) (3, 8) nocopy; (3, 8)

Thanks.
Aldy
commit 1bf3f76a5ff075396b5b9f5f88d6b18649dac2ce
Author: Aldy Hernandez <al...@redhat.com>
Date:   Sun Sep 5 16:54:00 2021 +0200

    Take into account global ranges when folding statements in path solver.
    
    The path solver used by the backwards threader can refine a folded
    range by taking into account any global range known.  This patch
    intersects the folded range with whatever global information we have.
    
    gcc/ChangeLog:
    
            * gimple-range-path.cc (path_range_query::internal_range_of_expr):
            Intersect with global ranges.
    
    > FAIL: gcc.dg/tree-ssa/loop-interchange-9.c scan-tree-dump-times linterchange "Loop_pair<outer:., inner:.> is interchanged" 1
    > FAIL: gcc.dg/tree-ssa/cunroll-15.c scan-tree-dump optimized "return 1;"

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index a4fa3b296ff..c616b65756f 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -127,6 +127,9 @@ path_range_query::internal_range_of_expr (irange &r, tree name, gimple *stmt)
   basic_block bb = stmt ? gimple_bb (stmt) : exit_bb ();
   if (stmt && range_defined_in_block (r, name, bb))
     {
+      if (TREE_CODE (name) == SSA_NAME)
+	r.intersect (gimple_range_global (name));
+
       set_cache (r, name);
       return true;
     }
/* { dg-do run } */
/* { dg-options "-O2 -floop-interchange -fdump-tree-linterchange-details" } */
/* { dg-require-effective-target size20plus } */
/* { dg-skip-if "too big data segment" { visium-*-* } } */

#define M 256
int a[M][M], b[M][M], c[M], d[M];
void __attribute__((noinline))
simple_reduc_1 (int n)
{
  for (int j = 0; j < n; j++)
    {
      int sum = c[j];
      for (int i = 0; i < n; i++)
	sum = sum + a[i][j]*b[i][j];

      c[j] = sum;
    }
}
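
For readers following along, here is a hand-written sketch of what interchanging the nest in simple_reduc_1 amounts to. It is not the code the linterchange pass emits, only an illustration under stated assumptions. The names reduc_original, reduc_interchanged, interchange_matches and c_ref are made up for this example and are not in the testcase. Note that the original nest is not perfect as written: the load and store of c[j] sit between the two loops. Any jump thread that adds further control flow inside the nest plausibly makes it harder for the pass to recover a perfect nest.

```c
#include <string.h>

#define M 256
static int a[M][M], b[M][M], c[M], c_ref[M];

/* Same shape as simple_reduc_1 in the testcase: the load and store of
   the accumulator sit outside the inner loop, so the nest is imperfect
   as written.  Writes to c_ref so the two versions can be compared.  */
static void
reduc_original (int n)
{
  for (int j = 0; j < n; j++)
    {
      int sum = c_ref[j];
      for (int i = 0; i < n; i++)
        sum = sum + a[i][j] * b[i][j];

      c_ref[j] = sum;
    }
}

/* Hand-interchanged form: i is now the outer loop, so the inner loop
   walks a[i][j] and b[i][j] with unit stride.  The reduction is kept
   in memory (c[j]) so that the nest is perfect.  */
static void
reduc_interchanged (int n)
{
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      c[j] = c[j] + a[i][j] * b[i][j];
}

/* Hypothetical helper: fill the arrays with small values (no signed
   overflow is possible), run both versions, and return 1 if they
   produce identical results.  Requires 0 <= n <= M.  */
static int
interchange_matches (int n)
{
  for (int i = 0; i < M; i++)
    {
      c[i] = c_ref[i] = i;
      for (int j = 0; j < M; j++)
        {
          a[i][j] = (i + j) % 7;
          b[i][j] = (i * 3 + j) % 5;
        }
    }

  reduc_original (n);
  reduc_interchanged (n);
  return memcmp (c, c_ref, sizeof c) == 0;
}
```

Calling interchange_matches (M) from a small main should return 1, since integer addition is associative as long as nothing overflows. The unit-stride inner loop in the second version is the usual reason the interchange is considered profitable here.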
