Ping (should hopefully fix one P1) Richard Sandiford <[email protected]> writes: > Artemiy Volkov <[email protected]> writes: >>> Bootstrapped & regression-tested on x86_64-linux-gnu and >>> powerpc64le-linux-gnu. Could you give it a spin on aarch64 too? >> >> No issues on aarch64-linux-gnu either, so AFAIC this should just go in >> as-is. > > OK, great! Thanks for the testing. > > OK to install? I've reattached the patch just in case.
>From 8fce3764039073296af11cd0dce9b482d6c844e4 Mon Sep 17 00:00:00 2001 From: Richard Sandiford <[email protected]> Date: Mon, 23 Feb 2026 16:14:10 +0000 Subject: [PATCH] rtl-ssa: Ensure live-out uses before redefinitions [PR123786] This patch fixes cases in which: (1) a register is live in to an EBB; (2) the register is live out of at least one BB in the EBB; and (3) the register is redefined by a later BB in the same EBB. We were supposed to create live-out uses for (2), so that the redefinition in (3) cannot be moved up into the live range of (1). The patch does this by collecting all definitions in second and subsequence BBs of an EBB. It then creates degenerate phis for those registers that do not naturally need phis. For speed and simplicity, the patch does not check for (2). If a register is live in to the EBB, then it must be used somewhere, either in the EBB itself or in a successor outside of the EBB. A degenerate phi would eventually be needed in either case. This requires moving append_bb earlier, so that add_phi_nodes can iterate over the BBs in an EBB. live_out_value contained an on-the-fly optimisation to remove redundant phis. That was a mistake. live_out_value can be called multiple times for the same quantity. Replacing a phi on-the-fly messes up bookkeeping for second and subsequent calls. The live_out_value optimisation was mostly geared towards memory. As an experiment, I added an assert for when the optimisation applied to registers. It only fired once in an x86_64-linux-gnu bootstrap & regression test, in gcc.dg/tree-prof/split-1.c. That's a very poor (but unsurprising) return. And the optimisation will still be done eventually anyway, during the phi simplification phase. Doing it on the fly was just supposed to allow the phi's memory to be reused. The patch therefore moves the optimisation into add_phi_nodes and restricts it to memory (for which it does make a difference). gcc/ PR rtl-optimization/123786 * rtl-ssa/functions.h (function_info::live_out_value): Delete. (function_info::create_degenerate_phi): New overload. * rtl-ssa/blocks.cc (all_uses_are_live_out_uses): Delete. (function_info::live_out_value): Likewise. (function_info::replace_phi): Keep live-out uses if they are followed by a definition in the same EBB. (function_info::create_degenerate_phi): New overload, extracted from create_reg_use. (function_info::add_phi_nodes): Ensure that there is a phi for every live input that is redefined by a second or subsequent block in the EBB. Record that such phis need live-out uses. (function_info::record_block_live_out): Use look_through_degenerate_phi rather than live_out_value when setting phi inputs. Remove use of live_out_value for live-out uses. Inline the old handling of bb_mem_live_out. (function_info::start_block): Move append_bb call to... (function_info::create_ebbs): ...here. * rtl-ssa/insns.cc (function_info::create_reg_use): Use the new create_degenerate_phi overload. gcc/testsuite/ PR rtl-optimization/123786 * gcc.target/aarch64/pr123786.c: New test. Co-authored-by: Artemiy Volkov <[email protected]> --- gcc/rtl-ssa/blocks.cc | 99 ++++++++++++--------- gcc/rtl-ssa/functions.h | 2 +- gcc/rtl-ssa/insns.cc | 14 ++- gcc/testsuite/gcc.target/aarch64/pr123786.c | 38 ++++++++ 4 files changed, 101 insertions(+), 52 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/pr123786.c diff --git a/gcc/rtl-ssa/blocks.cc b/gcc/rtl-ssa/blocks.cc index d445fcc2cf2..21ea9aa0452 100644 --- a/gcc/rtl-ssa/blocks.cc +++ b/gcc/rtl-ssa/blocks.cc @@ -327,38 +327,6 @@ function_info::add_live_out_use (bb_info *bb, set_info *def) add_use (use); } -// Return true if all nondebug uses of DEF are live-out uses. -static bool -all_uses_are_live_out_uses (set_info *def) -{ - for (use_info *use : def->all_uses ()) - if (!use->is_in_debug_insn () && !use->is_live_out_use ()) - return false; - return true; -} - -// SET, if nonnull, is a definition of something that is live out from BB. -// Return the live-out value itself. -set_info * -function_info::live_out_value (bb_info *bb, set_info *set) -{ - // Degenerate phis only exist to provide a definition for uses in the - // same EBB. The live-out value is the same as the live-in value. - if (auto *phi = safe_dyn_cast<phi_info *> (set)) - if (phi->is_degenerate ()) - { - set = phi->input_value (0); - - // Remove the phi if it turned out to be useless. This is - // mainly useful for memory, because we don't know ahead of time - // whether a block will use memory or not. - if (bb == bb->ebb ()->last_bb () && all_uses_are_live_out_uses (phi)) - replace_phi (phi, set); - } - - return set; -} - // Make USE's definition available at USE, if it isn't already. Assume that // the caller has properly used make_use_available to check that this is // possible. @@ -454,7 +422,11 @@ function_info::replace_phi (phi_info *phi, set_info *new_value) if (new_value) for (use_info *use : phi->nondebug_insn_uses ()) - if (!use->is_live_out_use ()) + // Live-out uses act as a movement barrier for later definitions + // in the same EBB. + if (!use->is_live_out_use () + || (phi->next_def () + && phi->next_def ()->ebb () == phi->ebb ())) { // We need to keep the phi around for its local uses. // Turn it into a degenerate phi, if it isn't already. @@ -543,6 +515,17 @@ function_info::create_phi (ebb_info *ebb, resource_info resource, return phi; } +// Called while building an extended basic block's SSA form using BI. +// Create and return a degenerate phi whose single input is VALUE. +phi_info * +function_info::create_degenerate_phi (build_info &bi, set_info *value) +{ + access_info *inputs[] = { look_through_degenerate_phi (value) }; + auto *phi = create_phi (bi.current_ebb, value->resource (), inputs, 1); + bi.record_reg_def (phi); + return phi; +} + // Create and return a degenerate phi for EBB whose input comes from DEF. // This is used in cases where DEF is known to be available on entry to // EBB but was not previously used within it. If DEF is for a register, @@ -822,6 +805,7 @@ void function_info::add_phi_nodes (build_info &bi) { ebb_info *ebb = bi.current_ebb; + bb_info *bb = ebb->first_bb (); basic_block cfg_bb = ebb->first_bb ()->cfg_bb (); // Create the register phis for this EBB. @@ -846,6 +830,26 @@ function_info::add_phi_nodes (build_info &bi) bitmap_copy (bi.ebb_def_regs, &phis.regs); + if (bb->next_bb ()) + { + // If a live input is redefined by later blocks in the same EBB, we need + // to add live-out uses to prevent the later definition from being moved + // into earlier blocks. Create degenerate phis that can be used for + // that purpose, if phis are not naturally needed. + auto_bitmap extra_regs; + for (auto *bb2 : ebb->bbs ()) + if (bb2 != bb) + bitmap_ior_into (extra_regs, &DF_LR_BB_INFO (bb2->cfg_bb ())->def); + bitmap_and_compl_into (extra_regs, &phis.regs); + EXECUTE_IF_AND_IN_BITMAP (extra_regs, DF_LR_IN (cfg_bb), 0, regno, in_bi) + if (set_info *value = bi.current_reg_value (regno)) + { + create_degenerate_phi (bi, value); + // Record that live-out uses are needed for this phi. + bitmap_set_bit (bi.ebb_def_regs, regno); + } + } + // Collect the live-in memory definitions and record whether they're // all the same. m_temp_defs.reserve (num_preds); @@ -1021,8 +1025,12 @@ function_info::record_block_live_out (build_info &bi) bitmap_iterator out_bi; EXECUTE_IF_SET_IN_BITMAP (&phis.regs, 0, regno, out_bi) { - phis.inputs[input_i] - = live_out_value (bb, bi.current_reg_value (regno)); + set_info *value = bi.current_reg_value (regno); + // Degenerate phis only exist to provide a definition for uses in the + // same EBB. The live-out value is the same as the live-in value. + if (value) + value = look_through_degenerate_phi (value); + phis.inputs[input_i] = value; input_i += 1; } } @@ -1039,7 +1047,7 @@ function_info::record_block_live_out (build_info &bi) bitmap_iterator out_bi; EXECUTE_IF_AND_IN_BITMAP (bi.ebb_def_regs, live_out, 0, regno, out_bi) { - set_info *value = live_out_value (bb, bi.current_reg_value (regno)); + set_info *value = bi.current_reg_value (regno); if (value && value->ebb () == ebb) add_live_out_use (bb, value); } @@ -1059,8 +1067,15 @@ function_info::record_block_live_out (build_info &bi) } // Record the live-out memory value. - bi.bb_mem_live_out[cfg_bb->index] - = live_out_value (bb, bi.current_mem_value ()); + set_info *mem_value = bi.current_mem_value (); + if (auto *phi = safe_dyn_cast <phi_info *> (mem_value)) + if (phi->is_degenerate ()) + { + mem_value = phi->input_value (0); + if (bb == ebb->last_bb () && !phi->has_nondebug_uses ()) + replace_phi (phi, mem_value); + } + bi.bb_mem_live_out[cfg_bb->index] = mem_value; } // Add BB and its contents to the SSA information. @@ -1079,9 +1094,6 @@ function_info::start_block (build_info &bi, bb_info *bb) // Record the start of this block's definitions in the definitions stack. bi.old_def_stack_limit.safe_push (bi.def_stack.length ()); - // Add the block itself. - append_bb (bb); - // If the block starts an EBB, create the phi insn. This insn should exist // for all EBBs, even if they don't (yet) need phis. if (bb == ebb->first_bb ()) @@ -1299,7 +1311,10 @@ function_info::create_ebbs (build_info &bi) // Create the EBB itself. auto *ebb = allocate<ebb_info> (bbs[0], bbs.last ()); for (bb_info *bb : bbs) - bb->set_ebb (ebb); + { + bb->set_ebb (ebb); + append_bb (bb); + } bbs.truncate (0); } diff --git a/gcc/rtl-ssa/functions.h b/gcc/rtl-ssa/functions.h index 63901a660da..9417046230e 100644 --- a/gcc/rtl-ssa/functions.h +++ b/gcc/rtl-ssa/functions.h @@ -308,7 +308,6 @@ private: void add_reg_unused_notes (insn_info *); void add_live_out_use (bb_info *, set_info *); - set_info *live_out_value (bb_info *, set_info *); void commit_make_use_available (use_info *); void append_phi (ebb_info *, phi_info *); @@ -336,6 +335,7 @@ private: void create_ebbs (build_info &); void add_entry_block_defs (build_info &); void calculate_ebb_live_in_for_debug (build_info &); + phi_info *create_degenerate_phi (build_info &, set_info *); void add_phi_nodes (build_info &); void add_artificial_accesses (build_info &, df_ref_flags); void add_block_contents (build_info &); diff --git a/gcc/rtl-ssa/insns.cc b/gcc/rtl-ssa/insns.cc index 317a5809b18..b42acb9704b 100644 --- a/gcc/rtl-ssa/insns.cc +++ b/gcc/rtl-ssa/insns.cc @@ -482,15 +482,11 @@ function_info::create_reg_use (build_info &bi, insn_info *insn, if (insn->is_debug_insn ()) value = look_through_degenerate_phi (value); else if (bitmap_bit_p (bi.potential_phi_regs, resource.regno)) - { - // VALUE is defined by a previous EBB and RESOURCE has multiple - // definitions. Create a degenerate phi in the current EBB - // so that all definitions and uses follow a linear RPO view; - // see rtl.texi for details. - access_info *inputs[] = { look_through_degenerate_phi (value) }; - value = create_phi (bi.current_ebb, value->resource (), inputs, 1); - bi.record_reg_def (value); - } + // VALUE is defined by a previous EBB and RESOURCE has multiple + // definitions. Create a degenerate phi in the current EBB + // so that all definitions and uses follow a linear RPO view; + // see rtl.texi for details. + value = create_degenerate_phi (bi, value); } auto *use = allocate<use_info> (insn, resource, value); add_use (use); diff --git a/gcc/testsuite/gcc.target/aarch64/pr123786.c b/gcc/testsuite/gcc.target/aarch64/pr123786.c new file mode 100644 index 00000000000..1b59f35949c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr123786.c @@ -0,0 +1,38 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -mtune=generic-armv9-a" } */ + +#include <arm_neon.h> + +int g_36[3]; +short g_s; +signed char g_82, g_179, g_x, g_101; + +__attribute__((noipa)) +void check(void) +{ + if (g_36[0] != 1) + __builtin_abort (); +} + +int main(void) +{ +BS_LABEL_3: + short l_65 = g_s; + if (g_x) goto BS_LABEL_6; + for (; g_101; g_101 = 5) + { + BS_LABEL_6: + } + for (; g_82 < 1; g_82++) + { + switch (vqadds_u32 (0, 0)) + { + case 4: goto BS_LABEL_3; + case 9: goto BS_LABEL_3; + } + short si1 = (l_65 &= 1) || (g_179 &= 0) != 6; + g_36[0] = g_36[2] == 0 ? 1 : si1 / g_36[2]; + } + + check (); +} -- 2.53.0
