https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117048
--- Comment #10 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkac...@gcc.gnu.org>:

https://gcc.gnu.org/g:1c46a541c6957e8b0eee339d4cff46e951a5ad4e

commit r15-4889-g1c46a541c6957e8b0eee339d4cff46e951a5ad4e
Author: Kyrylo Tkachov <ktkac...@nvidia.com>
Date:   Mon Nov 4 07:25:16 2024 -0800

    PR 117048: simplify-rtx: Simplify (X << C1) [+,^] (X >> C2) into ROTATE

    This is, in effect, a reapplication of
    de2bc6a7367aca2eecc925ebb64cfb86998d89f3 fixing the compile-time hog in
    var-tracking due to calling simplify_rtx on the two arms of the rotation
    before detecting the ROTATE.
    That is not necessary.

    simplify-rtx can transform (X << C1) | (X >> C2) into ROTATE (X, C1) when
    C1 + C2 == mode-width.  But the transformation is also valid for PLUS and
    XOR.  Indeed GIMPLE can also do the fold.  Let's teach RTL to do it too.

    The motivating testcase for this is in AArch64 intrinsics:

    uint64x2_t G2(uint64x2_t a, uint64x2_t b) {
        uint64x2_t c = veorq_u64(a, b);
        return veorq_u64(vaddq_u64(c, c), vshrq_n_u64(c, 63));
    }

    which I was hoping to fold to a single XAR (a ROTATE+XOR instruction) but
    GCC was failing to detect the rotate operation for two reasons:
    1) The combination of the two arms of the expression is done under XOR
    rather than IOR that simplify-rtx currently supports.
    2) The ASHIFT operation is actually a (PLUS X X) operation and thus is not
    detected as the LHS of the two arms we require.

    The patch fixes both issues.  The analysis of the two arms of the rotation
    expression is factored out into a common helper simplify_rotate_op which
    is then used in the PLUS, XOR, IOR cases in simplify_binary_operation_1.

    The check-assembly testcase for this is added in the following patch
    because it needs some extra AArch64 backend work, but I've added
    self-tests in this patch to validate the transformation.

    Bootstrapped and tested on aarch64-none-linux-gnu

    Signed-off-by: Kyrylo Tkachov <ktac...@nvidia.com>

            PR target/117048
            * simplify-rtx.cc (extract_ashift_operands_p): Define.
            (simplify_rotate_op): Likewise.
            (simplify_context::simplify_binary_operation_1): Use the above in
            the PLUS, IOR, XOR cases.
            (test_vector_rotate): Define.
            (test_vector_ops): Use the above.
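
For readers who want to convince themselves of the identity the commit relies on, here is a minimal scalar sketch in plain C. It is not GCC code and does not touch simplify-rtx; the helper names (rotl64, via_plus, via_xor, testcase_shape, check_rotate_identities) are made up for illustration. The reasoning it exercises: when C1 + C2 equals the width, X << C1 and X >> C2 have no set bits in common, so IOR, PLUS and XOR all give the same result, and X + X is the same value as X << 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference rotate-left on 64-bit values, written with IOR.
   c must be in the range 1..63 so neither shift is undefined.  */
static uint64_t rotl64(uint64_t x, unsigned c)
{
    return (x << c) | (x >> (64 - c));
}

/* Same two arms combined with PLUS instead of IOR.  */
static uint64_t via_plus(uint64_t x, unsigned c)
{
    return (x << c) + (x >> (64 - c));
}

/* Same two arms combined with XOR instead of IOR.  */
static uint64_t via_xor(uint64_t x, unsigned c)
{
    return (x << c) ^ (x >> (64 - c));
}

/* Scalar analogue of the motivating testcase: the left shift by 1
   is spelled as x + x, and the arms are joined with XOR.  */
static uint64_t testcase_shape(uint64_t x)
{
    return (x + x) ^ (x >> 63);
}

/* Returns 1 if all three forms agree with rotl64 on a few sample
   values and every shift count 1..63, otherwise 0.  */
static int check_rotate_identities(void)
{
    static const uint64_t samples[] = {
        0, 1, 0x8000000000000000ULL, 0x0123456789abcdefULL, UINT64_MAX
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint64_t x = samples[i];
        for (unsigned c = 1; c < 64; c++) {
            if (via_plus(x, c) != rotl64(x, c)) return 0;
            if (via_xor(x, c) != rotl64(x, c)) return 0;
        }
        if (testcase_shape(x) != rotl64(x, 1)) return 0;
    }
    return 1;
}
```

Calling check_rotate_identities() from a small main and checking that it returns 1 is enough to exercise the sketch. It only samples a handful of values, so it illustrates the identity rather than proving it.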