https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116583
--- Comment #14 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsand...@gcc.gnu.org>:

https://gcc.gnu.org/g:8157f3f2d211bfbf53fbf8dd209b47ce583f4142

commit r15-4114-g8157f3f2d211bfbf53fbf8dd209b47ce583f4142
Author: Richard Sandiford <richard.sandif...@arm.com>
Date:   Mon Oct 7 13:03:04 2024 +0100

    vect: Support more VLA SLP permutations [PR116583]

    This is the main patch for PR116583.  Previously, we only supported VLA
    SLP permutations for which the output and inputs have the same number
    of lanes, and for which that number of lanes divides the number of
    vector elements.  The patch extends this to handle:

    (1) "packs" of a single 2N-vector input into an N-vector output
    (2) "unpacks" of N-vector inputs into an XN-vector output

    Hopefully the comments in the code explain the approach.

    The contents of the:

      for (unsigned i = 0; i < ncopies; ++i)

    loop do not change; the patch simply adds an outer loop around it.

    The patch removes the XFAIL in slp-13.c and also improves the SVE
    vect.exp results with vect-force-slp=1.  I haven't added new tests
    specifically for this, since presumably the existing ones will
    cover it once the SLP switch is flipped.

    gcc/
            PR tree-optimization/116583
            * tree-vect-slp.cc (vectorizable_slp_permutation_1): Handle
            variable-length pack and unpack permutations.

    gcc/testsuite/
            PR tree-optimization/116583
            * gcc.dg/vect/slp-13.c: Remove xfail for vect_variable_length.
            * gcc.dg/vect/slp-13-big-array.c: Likewise.
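
For readers unfamiliar with SLP lane permutations, the following is a rough scalar-level illustration of a permutation whose input and output groups have different lane counts. It is a minimal sketch only and is not taken from tree-vect-slp.cc: the helper apply_lane_perm and the two index tables are hypothetical names, and the specific lane selections ({ 0, 2 } for a pack-like case, { 0, 0, 1, 1 } for an unpack-like case) are assumed shapes chosen for illustration, not necessarily the ones the patch recognises.

```c
#include <stddef.h>

/* Hypothetical scalar model, not GCC code.  Apply a single-input lane
   permutation group by group: output lane J of group G copies input
   lane PERM[J] of input group G.  */
static void
apply_lane_perm (const int *in, size_t in_lanes,
                 int *out, size_t out_lanes,
                 const size_t *perm, size_t ngroups)
{
  for (size_t g = 0; g < ngroups; ++g)
    for (size_t j = 0; j < out_lanes; ++j)
      out[g * out_lanes + j] = in[g * in_lanes + perm[j]];
}

/* Assumed pack-like shape: 4 input lanes per group, 2 output lanes.  */
static const size_t pack_perm[2] = { 0, 2 };

/* Assumed unpack-like shape: 2 input lanes per group, 4 output lanes.  */
static const size_t unpack_perm[4] = { 0, 0, 1, 1 };
```

With these tables, apply_lane_perm (in, 4, out, 2, pack_perm, ngroups) halves the number of lanes per group, while apply_lane_perm (in, 2, out, 4, unpack_perm, ngroups) doubles it. For variable-length vectors the number of groups per vector is not known at compile time, which is presumably why the lane-count ratio matters for whether a fixed permute pattern can be built.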