https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117093
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> --- The master branch has been updated by Jennifer Schmitz <jschm...@gcc.gnu.org>: https://gcc.gnu.org/g:c83e2d47574fd9a21f257e0f0d7e350c3f1b0618 commit r15-5324-gc83e2d47574fd9a21f257e0f0d7e350c3f1b0618 Author: Jennifer Schmitz <jschm...@nvidia.com> Date: Mon Nov 4 07:56:09 2024 -0800 match.pd: Fold vec_perm with view_convert This patch improves the codegen for the following test case: uint64x2_t foo (uint64x2_t r) { uint32x4_t a = vreinterpretq_u32_u64 (r); uint32_t t; t = a[0]; a[0] = a[1]; a[1] = t; t = a[2]; a[2] = a[3]; a[3] = t; return vreinterpretq_u64_u32 (a); } from (-O1): foo: mov v31.16b, v0.16b ins v0.s[0], v0.s[1] ins v0.s[1], v31.s[0] ins v0.s[2], v31.s[3] ins v0.s[3], v31.s[2] ret to: foo: rev64 v0.4s, v0.4s ret This is achieved by extending the following match.pd pattern to account for type differences between @0 and @1 due to view converts. /* Simplify vector inserts of other vector extracts to a permute. */ (simplify (bit_insert @0 (BIT_FIELD_REF@2 @1 @rsize @rpos) @ipos) The patch was bootstrapped and regtested on aarch64-linux-gnu and x86_64-linux-gnu, no regression. OK for mainline? Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com> Co-authored-by: Richard Biener <rguent...@suse.de> gcc/ PR tree-optimization/117093 * match.pd: Extend (bit_insert @0 (BIT_FIELD_REF@2 @1 @rsize @rpos) @ipos) to allow type differences between @0 and @1 due to view converts. gcc/testsuite/ PR tree-optimization/117093 * gcc.dg/tree-ssa/pr117093.c: New test.