[Bug tree-optimization/113592] missed partial sum optimization in vectorizer

rguenth at gcc dot gnu.org via Gcc-bugs Thu, 25 Jan 2024 01:57:58 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113592


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
For the original testcase, the vectorizer generates

  # vect_sum_20.8_49 = PHI <vect_sum_16.21_75(6), { 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 }(9)>
...
  vect__9.20_68 = vect__5.12_55 * vect__8.16_61;
  vect__9.20_69 = vect__5.12_56 * vect__8.17_63;
  vect__9.20_70 = vect__5.12_57 * vect__8.18_65;
  vect__9.20_71 = vect__5.12_58 * vect__8.19_67;
  _9 = _5 * _8;
  vect_sum_16.21_72 = vect__9.20_68 + vect_sum_20.8_49;
  vect_sum_16.21_73 = vect__9.20_69 + vect_sum_16.21_72;
  vect_sum_16.21_74 = vect__9.20_70 + vect_sum_16.21_73;
  vect_sum_16.21_75 = vect__9.20_71 + vect_sum_16.21_74;
  sum_16 = _9 + sum_20;

The chained adds come from the optimization that reduces the number of
reduction IVs (alternatively, we could keep them independent with 4 IVs
and do the reducing in the epilogue).  The aim is to reduce register
pressure.
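
As a rough scalar illustration of the two shapes discussed above (not the
original testcase, which is not quoted in this comment), the sketch below
assumes a simple dot-product reduction.  The function names dot_chained and
dot_split are hypothetical.  The first variant funnels every add through one
accumulator, like the single reduction PHI in the dump; the second keeps four
independent accumulators and combines them only after the loop, at the cost
of more live registers.

```c
#include <stddef.h>

/* Hypothetical example: one accumulator, so every add depends on the
   result of the previous add (a single serial dependency chain). */
double dot_chained(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical example: four independent accumulators, reduced to one
   value after the main loop.  The four chains do not depend on each
   other inside the loop, but four values stay live instead of one. */
double dot_split(const double *a, const double *b, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: each accumulator only depends on its own previous value. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }

    /* Final reduction of the partial sums, done once after the loop. */
    double sum = (s0 + s1) + (s2 + s3);

    /* Scalar tail for the remaining 0..3 elements. */
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

Note that the two variants add the products in a different order, so for
general floating-point inputs they may round differently; a compiler can only
switch between them when reassociation is permitted.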

But this also shows that, if the issue isn't the multiple IVs, it could
be handled by reassoc + FMA forming, given that the vectorizer itself
doesn't produce FMAs here.
