https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90796

--- Comment #7 from Michael Matz <matz at gcc dot gnu.org> ---
Created attachment 46675
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46675&action=edit
potential patch

Actually I was barking up the wrong tree.  It's nothing as simple as the CFG
manipulation for loop fusion going wrong (e.g. dropping some of the last
iterations).  It's really a problem in the dependence analysis.  See the
extensive comment in the patch.

The fun thing is, there's a difference between these two loop nests:

   for (i) for (j) a[i][0] = f(a[i+1][0]);
   for (i) for (j) b[i][j] = f(a[i+1][j]);

Even though the distance vector for the read/write in the single statement
is (-1,0) for both loops, unroll-and-jam is valid for the second but not
for the first.
