https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84935

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Seems it actually is vectorized, probably just using DImode vectors for
2xSImode,
and dom doesn't handle vector stores followed by scalar loads.  Before
store-merging the dump is:
  MEM[(int *)&a] = { 0, 1 };
  MEM[(int *)&a + 8B] = { 4, 9 };
  MEM[(int *)&a + 16B] = { 16, 25 };
  MEM[(int *)&a + 24B] = { 36, 49 };
  MEM[(int *)&a + 32B] = { 64, 81 };
  _6 = a[0];
  _28 = a[1];
  res_29 = _6 + _28;
  _35 = a[2];
  res_36 = res_29 + _35;
  _42 = a[3];
  res_43 = res_36 + _42;
  _49 = a[4];
  res_50 = res_43 + _49;
  _56 = a[5];
  res_57 = res_50 + _56;
  _63 = a[6];
  res_64 = res_57 + _63;
  _70 = a[7];
  res_71 = res_64 + _70;
  _77 = a[8];
  res_78 = res_71 + _77;
  _2 = a[9];
  res_11 = _2 + res_78;
  a ={v} {CLOBBER};
  return res_11;
and nothing really changes till *.optimized in it.

Reply via email to