On Fri, 4 Nov 2022, Hongyu Wang via Gcc-patches wrote:
> This is a follow-up patch for PR98167.
>
> The sequence
>
>   c1 = VEC_PERM_EXPR (a, a, mask)
>   c2 = VEC_PERM_EXPR (b, b, mask)
>   c3 = c1 op c2
>
> can be optimized to
>
>   c = a op b
>   c3 = VEC_PERM_EXPR (c, c, mask)
>
> for all integer vector operations, and for float operations with a
> full permutation.

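A scalar model in C may make the claimed equivalence concrete. This is only a sketch under stated assumptions, not the code from the patch: the names perm_lanes, add_lanes and forms_agree are hypothetical, the vectors have 4 lanes, "op" is taken to be wrapping unsigned addition, and every mask index is assumed to be below the lane count (the real VEC_PERM_EXPR indexes the concatenation of its two inputs).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LANES 4

/* Scalar model of VEC_PERM_EXPR (v, v, mask): out[i] = v[mask[i]].
   Assumption: every mask index is < LANES.  */
void perm_lanes(const uint32_t v[LANES], const unsigned mask[LANES],
                uint32_t out[LANES])
{
    for (int i = 0; i < LANES; i++) {
        assert(mask[i] < LANES);
        out[i] = v[mask[i]];
    }
}

/* Lane-wise "op"; unsigned addition is used so overflow wraps.  */
void add_lanes(const uint32_t x[LANES], const uint32_t y[LANES],
               uint32_t out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = x[i] + y[i];
}

/* Returns true if the original and the rewritten sequence give the
   same c3 for the given inputs and mask.  */
bool forms_agree(const uint32_t a[LANES], const uint32_t b[LANES],
                 const unsigned mask[LANES])
{
    uint32_t c1[LANES], c2[LANES], c3_before[LANES];
    uint32_t c[LANES], c3_after[LANES];

    /* Original: permute both inputs, then operate.  */
    perm_lanes(a, mask, c1);
    perm_lanes(b, mask, c2);
    add_lanes(c1, c2, c3_before);

    /* Rewritten: operate once, then permute the result.  */
    add_lanes(a, b, c);
    perm_lanes(c, mask, c3_after);

    for (int i = 0; i < LANES; i++)
        if (c3_before[i] != c3_after[i])
            return false;
    return true;
}
```

In this model the two forms agree for any mask, including one that repeats lanes, because lane i of either result is a[mask[i]] op b[mask[i]]. The values alone therefore do not distinguish full permutations from arbitrary shuffles.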
Hello,
I assume the "full permutation" condition is to avoid performing some
extra operations that would raise exception flags. If so, are there
conditions (-fno-trapping-math?) where the transformation would be safe
with arbitrary shuffles?
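
To illustrate the concern, here is a small C sketch that counts the lanes on which the rewritten form performs "op" although the original form never reads them. It rests on assumptions of mine: "full permutation" is read as "every source lane is selected exactly once", the helper names mask_is_full_permutation and extra_lanes_computed are hypothetical, vectors have 4 lanes, and mask indices are below the lane count.

```c
#include <assert.h>
#include <stdbool.h>

#define LANES 4

/* True if the mask selects every source lane exactly once.
   Assumption: this is what "full permutation" means in the patch.  */
bool mask_is_full_permutation(const unsigned mask[LANES])
{
    bool seen[LANES] = { false };

    for (int i = 0; i < LANES; i++) {
        assert(mask[i] < LANES);
        if (seen[mask[i]])
            return false;       /* lane selected twice */
        seen[mask[i]] = true;
    }
    return true;                /* LANES distinct indices cover all lanes */
}

/* Number of source lanes that c = a op b operates on in the rewritten
   form but that the original c1 op c2 never touches.  */
int extra_lanes_computed(const unsigned mask[LANES])
{
    bool used[LANES] = { false };
    int extra = 0;

    for (int i = 0; i < LANES; i++) {
        assert(mask[i] < LANES);
        used[mask[i]] = true;
    }
    for (int i = 0; i < LANES; i++)
        if (!used[i])
            extra++;
    return extra;
}
```

For the mask {0, 0, 1, 1}, lanes 2 and 3 are operated on only after the rewrite, so a floating-point "op" on those lanes could raise an exception flag that the original sequence would not.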
--
Marc Glisse