https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102160

            Bug ID: 102160
           Summary: Too many runtime alias checks when vectorizing
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

The following is reduced from (or rather "inspired by") 507.cactuBSSN_r
ML_BSSN_Advect_Body where, when one works around other issues by editing the
source, the vectorizer intends to create > 8000 runtime alias checks (and
refuses).

void foo (double *a, double *b, int off, int n, int m)
{
  for (int j = 0; j < m; ++j)
    for (int i = 0; i < n; ++i)
      a[j*n+i] = b[j*n+i] + b[(j+1)*n+i] + b[(j-1)*n+i];
}

This small example iterates over a 2d array in a linearized way
(and in a way that, as written, does not actually guarantee that each
a[j*n + i] is only written once, that is, that the 2 dimensions do not
"overlap").

The interesting bit is that the kernel offsets the accesses in the outer loop
iteration direction and thus when analyzing the refs in the innermost loop
we have three unknown non-constant offsets to b[] and we will create three
runtime alias checks that fail to merge (obviously).

We need to do better by formulating the alias checks with respect to the
outermost [interesting] iteration where we should be able to merge the
checks into one, obviously making it less precise by computing the access
extent of the whole loop nest.

As an additional benefit, the runtime alias check can be hoisted and thus
versioning applied to the outer loop.  That might even already work
magically.
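
To illustrate the idea, here is a hand-written sketch of what a single merged
check, hoisted out of the loop nest, could look like at the source level.  It
is not what the vectorizer emits or how GCC implements versioning; the names
foo_versioned, foo_noalias, foo_generic and ranges_overlap are hypothetical,
and the unused 'off' parameter is dropped.  The sketch assumes the whole nest
writes a[0 .. m*n) and reads b[-n .. (m+1)*n), which is the less precise,
whole-nest access extent mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if [p, p + plen) and [q, q + qlen),
   both measured in doubles, share at least one byte.  */
static int ranges_overlap (const double *p, size_t plen,
                           const double *q, size_t qlen)
{
  uintptr_t ps = (uintptr_t) p, pe = ps + plen * sizeof (double);
  uintptr_t qs = (uintptr_t) q, qe = qs + qlen * sizeof (double);
  return ps < qe && qs < pe;
}

/* Version that may assume a[] and b[] do not alias (vectorizable).  */
static void foo_noalias (double *restrict a, const double *restrict b,
                         ptrdiff_t n, ptrdiff_t m)
{
  for (ptrdiff_t j = 0; j < m; ++j)
    for (ptrdiff_t i = 0; i < n; ++i)
      a[j*n+i] = b[j*n+i] + b[(j+1)*n+i] + b[(j-1)*n+i];
}

/* Fallback with the original semantics when a[] and b[] may overlap.  */
static void foo_generic (double *a, const double *b,
                         ptrdiff_t n, ptrdiff_t m)
{
  for (ptrdiff_t j = 0; j < m; ++j)
    for (ptrdiff_t i = 0; i < n; ++i)
      a[j*n+i] = b[j*n+i] + b[(j+1)*n+i] + b[(j-1)*n+i];
}

/* One alias check for the whole nest, done before the outer loop,
   instead of three checks per inner loop.  */
void foo_versioned (double *a, double *b, int n, int m)
{
  if (n <= 0 || m <= 0)
    return;
  size_t a_len = (size_t) m * (size_t) n;        /* a[0 .. m*n)      */
  size_t b_len = ((size_t) m + 2) * (size_t) n;  /* b[-n .. (m+1)*n) */
  if (!ranges_overlap (a, a_len, b - n, b_len))
    foo_noalias (a, b, n, m);
  else
    foo_generic (a, b, n, m);
}
```

The check is conservative: it may pick the fallback even when no individual
pair of accesses conflicts, which is the loss of precision that comes with
merging the three checks into one.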
