https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121049

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think the issue is that we do

  _79 = _78 > { 0, 1, 2, 3, 4, 5, 6, 7 };
  vect__12.20_57 = .MASK_LOAD (vectp_mon_lengths.19_51, 256B, _79, { 0, 0, 0, 0, 0, 0, 0, 0 });
  vect_patt_18.21_58 = WIDEN_MULT_EVEN_EXPR <vect__12.20_57, { 2, 2, 2, 2, 2, 2, 2, 2 }>;
  vect_patt_18.21_59 = WIDEN_MULT_ODD_EXPR <vect__12.20_57, { 2, 2, 2, 2, 2, 2, 2, 2 }>;
  _63 = VIEW_CONVERT_EXPR<vector(4) <signed-boolean:1>>(_79);
  vect_value_4.23_64 = .COND_ADD (_63, vect_patt_18.21_58, _60, _60);
  _65 = VIEW_CONVERT_EXPR<unsigned char>(_79);
  _66 = _65 >> 4;
  _67 = VIEW_CONVERT_EXPR<vector(4) <signed-boolean:1>>(_66);
  vect_value_4.23_68 = .COND_ADD (_67, vect_patt_18.21_59, vect_value_4.23_64, vect_value_4.23_64);

so we use an even/odd widen mult for the reduction - which is ultimately OK,
but not when we do loop masking, since we then mask the wrong elements.
We'd need a lo/hi widen mult, or alternatively do an even/odd extract of the
loop mask instead of taking the lo/hi halves when distributing it.
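To make the mismatch concrete, here is a small Python sketch (not GCC code; all
names are hypothetical) modeling 8 lanes reduced via even/odd widening products,
as in the dump above. Splitting the loop mask even/odd matches the products;
splitting it lo/hi (as the `_65 >> 4` sequence does) pairs the odd product with
mask bits 4..7 and gives a wrong reduction:

```python
def widen_even(v):          # lanes 0, 2, 4, 6 (WIDEN_MULT_EVEN-style selection)
    return v[0::2]

def widen_odd(v):           # lanes 1, 3, 5, 7 (WIDEN_MULT_ODD-style selection)
    return v[1::2]

vec = list(range(8))
mask = [i < 5 for i in range(8)]    # loop mask: first 5 lanes active

# Matching split: even/odd extract of the loop mask.
mask_even, mask_odd = mask[0::2], mask[1::2]

# Mismatched split: lo/hi halves of the loop mask.
mask_lo, mask_hi = mask[:4], mask[4:]

# Reference: sum of 2*v over the active lanes.
ref = sum(2 * v for v, m in zip(vec, mask) if m)

def masked_sum(prod, m):            # a .COND_ADD-style masked accumulation
    return sum(p for p, b in zip(prod, m) if b)

even_prod = [2 * x for x in widen_even(vec)]
odd_prod = [2 * x for x in widen_odd(vec)]

good = masked_sum(even_prod, mask_even) + masked_sum(odd_prod, mask_odd)
bad = masked_sum(even_prod, mask_lo) + masked_sum(odd_prod, mask_hi)

print(good == ref, bad == ref)      # → True False
```

With the lo/hi split, the even product (lanes 0,2,4,6) is gated by mask bits
0..3 and the odd product (lanes 1,3,5,7) by mask bits 4..7, so lane 6's
contribution is wrongly included and lane 1's mask bit never reaches its lane.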
