https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112902
--- Comment #3 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <ja...@gcc.gnu.org>:

https://gcc.gnu.org/g:f32e49add80cb3a22969b12034509d326aa69c5d

commit r14-6307-gf32e49add80cb3a22969b12034509d326aa69c5d
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Fri Dec 8 09:03:18 2023 +0100

    lower-bitint: Avoid merging non-mergeable stmt with cast and mergeable stmt [PR112902]

    Before bitint lowering, the IL for the first function has:
      b.0_1 = b;
      _2 = -b.0_1;
      _3 = (unsigned _BitInt(512)) _2;
      a.1_4 = a;
      a.2_5 = (unsigned _BitInt(512)) a.1_4;
      _6 = _3 * a.2_5;

    Now, gimple_lower_bitint has an optimization (when not -O0) that avoids
    assigning underlying VAR_DECLs to certain SSA_NAMEs which can be lowered
    in a single loop (or in straight-line code) rather than in multiple
    loops.  So, e.g. the multiplication above uses handle_operand_addr,
    which can deal with INTEGER_CST arguments, loads and also casts, so it
    is fine not to assign an underlying VAR_DECL to the SSA_NAMEs a.1_4 and
    a.2_5, as the multiplication can handle them fine.

    The more problematic case is the other multiplication operand.  It is
    again the result of a (in this case narrowing) cast, so it is fine not
    to assign a VAR_DECL to _3.  Normally we can merge the load (b.0_1)
    with the negation (_2) and even with the following cast (_3).  If _3
    were used in a mergeable operation like addition, subtraction,
    negation, &|^ or equality comparison, all of b.0_1, _2 and _3 could be
    left without underlying VAR_DECLs.

    The problem is that the current code does that even when the cast is
    used by a non-mergeable operation, and handle_operand_addr certainly
    can't handle mergeable operations feeding the rhs1 of the cast: for
    multiplication we don't emit any loop in which they could appear, and
    for other operations like shifts or non-equality comparisons we do emit
    loops, but either in the reverse direction or with unpredictable
    indexes (for shifts).

    So, in order to lower the above correctly, we need an underlying
    VAR_DECL for either _2 or _3.  If we choose _2, the load and negation
    are done in one loop and the extension is handled as part of the
    multiplication; if we choose _3, the load, negation and cast are done
    in one loop and the multiplication just uses the underlying VAR_DECL
    computed by that loop.  It is far easier to do this for _3, which is
    what the following patch implements.  The code for most of this already
    existed, but it was used for widening casts only (optimize unless the
    cast rhs1 is not an SSA_NAME, or is an SSA_NAME defined in some other
    bb, or has more than one use, etc.).  The patch falls through into that
    code for narrowing or same-precision casts as well, unless the cast is
    used in a mergeable operation.

    2023-12-08  Jakub Jelinek  <ja...@redhat.com>

            PR tree-optimization/112902
            * gimple-lower-bitint.cc (gimple_lower_bitint): For a narrowing
            or same precision cast don't set SSA_NAME_VERSION in m_names
            only if use_stmt is mergeable_op or fall through into the check
            that use is a store or rhs1 is not mergeable or other reasons
            prevent merging.

            * gcc.dg/bitint-52.c: New test.
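For context, a minimal C sketch of source code that would produce IL of the shape quoted above. This is an illustrative reconstruction, not the actual gcc.dg/bitint-52.c test (whose contents are not quoted in this comment); the variable names, the function name foo and the _BitInt(575) width of b are assumptions, chosen only so that the cast of -b is narrowing and the cast of a is a same-precision sign change:

  /* Illustrative sketch only.  File-scope variables so that b.0_1 = b and
     a.1_4 = a above are loads.  */
  _BitInt(575) b;   /* wider than 512 bits, so the cast of -b below narrows */
  _BitInt(512) a;   /* signed, so the cast to unsigned is same precision */

  unsigned _BitInt(512)
  foo (void)
  {
    /* -b is a mergeable negation; its result is narrowed to unsigned
       _BitInt(512) and then feeds the multiplication, which is lowered via
       handle_operand_addr and is not mergeable -- the case the patch fixes
       by giving the cast result an underlying VAR_DECL.  */
    return (unsigned _BitInt(512)) -b * (unsigned _BitInt(512)) a;
  }

To hit the code path the commit describes, such a sketch would need to be compiled above -O0 (the optimization in gimple_lower_bitint is only done when not at -O0) on a target that supports _BitInt(512), e.g. x86_64.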