https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117323

--- Comment #4 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
Another missed optimization is that GCC fails to recognize a MAX_EXPR for
sum1, which makes the vectorizer generate a lot of pack/unpack code:

  prephitmp_66 = (int) _8;
  # DEBUG a => NULL
  # DEBUG b => NULL
  # DEBUG a => NULL
  # DEBUG b => NULL
  # DEBUG INLINE_ENTRY max
  _35 = (unsigned int) prephitmp_65;
  _9 = (unsigned int) _8;
  _10 = _35 * _9;
  _72 = (int) _10;
  _74 = _72 / 128;
  _76 = (char) _74;
  _42 = prephitmp_66 > 0;
  prephitmp_77 = _42 ? _76 : 0;
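
For readers less used to GIMPLE, the dump above can be read back as C roughly
as follows. This is only a sketch: the original test case is not quoted in
this comment, so the function names, the parameter names, and the assumption
that _8 is a signed char and prephitmp_65 an int are all hypothetical. The
second function shows the clamp written as an explicit max(x, 0) before the
multiply, which is presumably the shape a recognized MAX_EXPR would have.

```c
/* Mirrors the dump: compute the scaled product first, then select on x > 0.
   All names here are hypothetical; only the operation order follows the
   GIMPLE above. */
static signed char scale_select(int w, signed char x)
{
    unsigned int prod = (unsigned int)w * (unsigned int)x; /* _10 = _35 * _9 */
    int q = (int)prod / 128;                               /* _74 = _72 / 128 */
    signed char r = (signed char)q;                        /* _76 = (char) _74 */
    return x > 0 ? r : 0;                /* prephitmp_77 = _42 ? _76 : 0 */
}

/* Same result with the clamp expressed up front as max(x, 0). When x <= 0
   the product is 0, so the final select is no longer needed. */
static signed char scale_max(int w, signed char x)
{
    int m = x > 0 ? x : 0;                                 /* max(x, 0) */
    unsigned int prod = (unsigned int)w * (unsigned int)m;
    return (signed char)((int)prod / 128);
}
```

The multiply is done in unsigned int, as in the dump, so that wraparound is
well defined. The out-of-range conversions back to int and signed char are
implementation-defined in C; GCC reduces them modulo 2^N.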

Yes, swapping the operand order generates much better code for x86, but makes
arm generate worse code.
