https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113485
--- Comment #8 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsand...@gcc.gnu.org>:

https://gcc.gnu.org/g:f251bbfec9174169510b2dec14b9bf763e7b77af

commit r14-8420-gf251bbfec9174169510b2dec14b9bf763e7b77af
Author: Richard Sandiford <richard.sandif...@arm.com>
Date:   Thu Jan 25 12:03:17 2024 +0000

    aarch64: Avoid paradoxical subregs in UXTL split [PR113485]

    g:74e3e839ab2d36841320 handled the UXTL{,2}-ZIP[12] optimisation
    in split1.  The UXTL input is a 64-bit vector of N-bit elements
    and the result is a 128-bit vector of 2N-bit elements.  The
    corresponding ZIP1 operates on 128-bit vectors of N-bit elements.

    This meant that the ZIP1 input had to be a 128-bit paradoxical subreg
    of the 64-bit UXTL input.  In the PRs, it wasn't possible to generate
    this subreg because the inputs were already subregs of a x[234]
    structure of 64-bit vectors.

    I don't think the same thing can happen for UXTL2->ZIP2 because
    UXTL2 input is a 128-bit vector rather than a 64-bit vector.

    It isn't really necessary for ZIP1 to take 128-bit inputs,
    since the upper 64 bits are ignored.  This patch therefore adds
    a pattern for 64-bit -> 128-bit ZIP1s.

    In principle, we should probably use this form for all ZIP1s.
    But in practice, that creates an awkward special case, and
    would be quite invasive for stage 4.

    gcc/
            PR target/113485
            * config/aarch64/aarch64-simd.md (aarch64_zip1<mode>_low): New
            pattern.
            (<optab><Vnarrowq><mode>2): Use it instead of generating a
            paradoxical subreg for the input.

    gcc/testsuite/
            PR target/113485
            * gcc.target/aarch64/pr113485.c: New test.
            * gcc.target/aarch64/pr113573.c: Likewise.
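
For readers unfamiliar with the UXTL/ZIP1 equivalence the commit message relies on, here is a minimal scalar sketch in portable C. It is not GCC's implementation and does not use NEON intrinsics. It assumes N = 8 and little-endian lane numbering, and every function name in it (model_uxtl_u8, model_zip1_u8, uxtl_matches_zip1) is hypothetical. It illustrates two points from the message: interleaving with a zero vector gives the zero-extended result, and ZIP1 never reads the upper 64 bits of its inputs.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of UXTL for N = 8: zero-extend the eight 8-bit lanes of a
   64-bit vector into the eight 16-bit lanes of a 128-bit vector.  */
static void
model_uxtl_u8 (const uint8_t in[8], uint16_t out[8])
{
  for (int i = 0; i < 8; i++)
    out[i] = in[i];
}

/* Scalar model of ZIP1 on 128-bit vectors of 8-bit lanes: interleave the
   low halves of A and B.  Only lanes 0..7 of each input are read, so the
   upper 64 bits of both inputs are ignored.  */
static void
model_zip1_u8 (const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
  for (int i = 0; i < 8; i++)
    {
      out[2 * i] = a[i];
      out[2 * i + 1] = b[i];
    }
}

/* Return 1 if ZIP1 (wide, zero), read back as little-endian 16-bit lanes,
   equals UXTL (low).  WIDE holds LOW in its lower 64 bits and UPPER_FILL
   in every byte of its upper 64 bits, standing in for the undefined upper
   half of a paradoxical subreg.  */
static int
uxtl_matches_zip1 (const uint8_t low[8], uint8_t upper_fill)
{
  uint8_t wide[16];
  uint8_t zero[16] = { 0 };
  uint8_t zipped[16];
  uint16_t extended[8];

  memcpy (wide, low, 8);
  memset (wide + 8, upper_fill, 8);

  model_uxtl_u8 (low, extended);
  model_zip1_u8 (wide, zero, zipped);

  for (int i = 0; i < 8; i++)
    {
      /* Assemble the 16-bit lane explicitly so that the check does not
         depend on the host's byte order.  */
      uint16_t lane = (uint16_t) (zipped[2 * i] | (zipped[2 * i + 1] << 8));
      if (lane != extended[i])
        return 0;
    }
  return 1;
}
```

Calling uxtl_matches_zip1 with the same low half and different upper_fill values should give the same answer. That is the property that lets a ZIP1 pattern take a 64-bit input directly instead of a 128-bit paradoxical subreg.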