On 7/26/23 07:27, Richard Biener via Gcc-patches wrote:
The following patch makes sure to elide a redundant permute that
can be merged with existing splats represented as load permutations
as we now do for non-grouped SLP loads. This is the last piece
needed to fix this PR; the main fix was already done by
r14-2117-gdd86a5a69cbda4.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/106081
* tree-vect-slp.cc (vect_optimize_slp_pass::start_choosing_layouts):
Assign layout -1 to splats.
* gcc.dg/vect/pr106081.c: New testcase.
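
To illustrate why a permute stacked on a splat is redundant, here is a
minimal standalone sketch in C. It is not GCC code: the helper names
(compose_permutes, is_splat_perm) are hypothetical, and the reading that
layout -1 leaves a splat's layout unconstrained is an assumption drawn
from the ChangeLog entry above. The point it shows is that composing any
lane permutation with a splat load permutation yields the same splat, so
the extra permute can be folded into the load.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of tree-vect-slp.cc.
   Returns 1 if every lane of PERM selects the same source element,
   i.e. the load permutation describes a splat.  */
static int is_splat_perm(const size_t *perm, size_t lanes)
{
    for (size_t i = 1; i < lanes; i++)
        if (perm[i] != perm[0])
            return 0;
    return 1;
}

/* Hypothetical helper: merge a layout permute applied on top of a load
   permutation.  Lane i of the result reads lane LAYOUT_PERM[i] of the
   loaded vector, which itself reads element LOAD_PERM[...] of memory.  */
static void compose_permutes(const size_t *load_perm,
                             const size_t *layout_perm,
                             size_t *merged, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        assert(layout_perm[i] < lanes);  /* lane index must be in range */
        merged[i] = load_perm[layout_perm[i]];
    }
}
```

With a splat load permutation such as { 2, 2, 2, 2 } and a lane-reversing
layout { 3, 2, 1, 0 }, the merged permutation is again { 2, 2, 2, 2 }, so
no layout choice for the splat can force a separate permute.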
:-) Glad to see how easy this ended up being after the work you put
into pushing permutes around a couple years ago.
jeff