[Bug tree-optimization/68892] New: [6 Regression] Excessive dead loads produced by BB vectorization

rguenth at gcc dot gnu.org Mon, 14 Dec 2015 03:43:44 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68892


            Bug ID: 68892
           Summary: [6 Regression] Excessive dead loads produced by BB
                    vectorization
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

double a[1024][1024];
double b[1024];

void foo(void)
{
  b[0] = a[0][0];
  b[1] = a[1][0];
  b[2] = a[2][0];
  b[3] = a[3][0];
}

is vectorized using

t.c:10:1: note: Load permutation 0 1024 2048 3072
t.c:10:1: note: Final SLP tree for instance:
t.c:10:1: note: node
t.c:10:1: note:         stmt 0 b[0] = _2;
t.c:10:1: note:         stmt 1 b[1] = _4;
t.c:10:1: note:         stmt 2 b[2] = _6;
t.c:10:1: note:         stmt 3 b[3] = _8;
t.c:10:1: note: node
t.c:10:1: note:         stmt 0 _2 = a[0][0];
t.c:10:1: note:         stmt 1 _4 = a[1][0];
t.c:10:1: note:         stmt 2 _6 = a[2][0];
t.c:10:1: note:         stmt 3 _8 = a[3][0];

where our "stupid" load permutation support first loads all vectors of
the group (of size 3073) and then permutes them, using only 4 of those vectors.
For vectors with more than two elements the "need more than two vectors"
part of load permutation support "fixes" this, but for two elements nothing
prevents this stupidity (it's all dead code, but IVOPTs for example can take
ages processing the dead loads):

  <bb 2>:
  _2 = a[0][0];
  _4 = a[1][0];
  _6 = a[2][0];
  vect__2.5_10 = MEM[(double *)&a];
  _11 = &a[0][0] + 16;
  vect__2.6_12 = MEM[(double *)_11];
  _13 = _11 + 16;
  vect__2.7_14 = MEM[(double *)_13];
  _15 = _13 + 16;
  vect__2.8_16 = MEM[(double *)_15];
...
  _3079 = _3077 + 16;
  vect__2.1540_3080 = MEM[(double *)_3079];
  _3081 = _3079 + 16;
  vect__2.1541_3082 = MEM[(double *)_3081];
  _3083 = _3081 + 18446744073709551608;
  vect__2.1542_3084 = VEC_PERM_EXPR <vect__2.5_10, vect__2.517_1034, { 0, 2 }>;
  vect__2.1543_3085 = VEC_PERM_EXPR <vect__2.1029_2058, vect__2.1541_3082, { 0, 2 }>;
  _8 = a[3][0];
  MEM[(double *)&b] = vect__2.1542_3084;
  _3087 = &b[0] + 16;
  MEM[(double *)_3087] = vect__2.1543_3085;
  return;
