On Tue, Feb 24, 2026 at 4:09 AM Richard Sandiford
<[email protected]> wrote:
>
> Artemiy Volkov <[email protected]> writes:
> > On Mon, Feb 23, 2026 at 09:18:01PM +0000, Richard Sandiford wrote:
> >> Artemiy Volkov <[email protected]> writes:
> >> > On Wed, Feb 04, 2026 at 12:32:11AM +0000, Richard Sandiford wrote:
> >> >> Artemiy Volkov <[email protected]> writes:
> >> >> > [...]
> >> >> > To address this, during initial traversal of a new EBB, we create an
> >> >> > artificial live-out use at the bottom of the current BB when (a) a
> >> >> > variable is live-in at the entry to the EBB, (b) it hasn't yet been
> >> >> > redefined in the current EBB, and (c) it is redefined in the next BB.
> >> >> > This live-out use serves as a barrier that restricts movement of
> >> >> > new defs
> >> >> > from later BBs having a new value to earlier ones still having the old
> >> >> > live-in value.  In the diagram, this live-out use will be created
> >> >> > for r110
> >> >> > at the end of bb10.
> >> >> >
> >> >> > Since the def that this newly created live-out use sees has to be
> >> >> > dominating (otherwise whenever the last use occurs after the
> >> >> > current EBB,
> >> >> > the movement will be restricted to an empty range), we create a
> >> >> > new def at
> >> >> > the beginning of the EBB.  For this, we use the create_reg_use ()
> >> >> > function
> >> >> > that will also create a degenerate PHI for this purpose when needed.
> >> >> >
> >> >> > One problem that we run into with this approach is that during RTL-SSA
> >> >> > construction single-input PHI nodes are sometimes eliminated as
> >> >> > unnecessary, which causes the last dominating def to be reverted to an
> >> >> > earlier EBB and the move range restriction failing as before.  To 
> >> >> > remedy
> >> >> > this, we mark PHIs that we create as 'persistent' and do not eliminate
> >> >> > them where previously they would be replaced by their single input.  
> >> >> > (To
> >> >> > remain aligned with the design goal of minimizing memory overhead that
> >> >> > RTL-SSA incurs, the m_has_been_superceded flag has been repurposed to
> >> >> > track this in the phi_info class.)
> >> >> >
> >> >> > Bootstrapped and regtested on aarch64-linux-gnu; no performance
> >> >> > regressions across SPEC2017.  New testcase added as provided by Alex
> >> >> > Coplan in the BZ, many thanks for that.
> >> >>
> >> >> Thanks for looking at this and for the nice explanation.
> >> >> I agree that the underlying problem is the lack of a live-out use.
> >> >> However, rather than add a separate mechanism for this case, I think we
> >> >> should instead force a phi to be created from the outset.  That will
> >> >> make sure that EBBs with a single dominating live-in definition
> >> >> (as here) are handled similarly to EBBs with potentially multiple
> >> >> incoming live-in definitions.
> >> >>
> >> >> Also, rather than add a flag to the phi, I think we can test whether
> >> >> the next definition is in the same EBB.  This should also ensure that we
> >> >> handle EBBs that start out with potentially multiple incoming live-in
> >> >> definitions but eventually simplify to a single value.
> >> >
> >> > Hi Richard, apologies for the slow reply.
> >> >
> >> > I obviously agree that your approach is the right one to take.  I can
> >> > confirm that your patch fixes the original testcase; what is still 
> >> > missing
> >> > for it to pass bootstrap/regtest is handling of chains of degenerate PHIs
> >> > which didn't occur before.  Where before your patch phi->input_value (0)
> >> > would always point to a "real" definition (either an insn or a
> >> > non-degenerate PHI), now a CFG can be conceived where you'd have to
> >> > traverse a chain of (non-simplified) degenerate PHIs to get to that
> >> > definition.
> >>
> >> Argh.  I should have predicted that, sorry.
> >>
> >> > I've made two main fixes on top of your patch to cater to this new
> >> > possibility:  (1) in function_info::live_out_value (), when one 
> >> > degenerate
> >> > PHI is feeding into another, we need to use the former as the input for
> >> > the latter, as otherwise we'd be setting the latter's def to NULL; (2) in
> >> > look_through_degenerate_phi (), we need to walk the PHI chain as 
> >> > explained
> >> > before (this also necessitates a small assertion change in
> >> > function_info::make_use_available ()).  There will be a performance
> >> > penalty associated with this walk in the worst case, but I believe that
> >> > this should not be a problem on real-life inputs.
> >>
> >> I think look_through_degenerate_phi really does need to be O(1),
> >> otherwise we risk quadratic complexity.
> >
> > Right, perhaps this needed to be a separate get_ultimate_def () function
> > just for the needs of that one checking assert in
> > function_info::make_use_available () ...
> >
> >>
> >> So how about this compromise?  It goes back to your original approach
> >> of detecting the problematic case while building the EBB.  But rather
> >> than add the phis on-the-fly in record_block_live_out, the patch adds
> >> them during add_phi_nodes.
> >>
> >> This preserves the property that I was trying to get from the proposal
> >> above, namely that we would add the same live-out uses as we would for
> >> a normal phi.
> >>
> >> For example, it means that if a register is live out from two blocks
> >> in the EBB, we'd add live-out uses for both blocks.  It also means that
> >> if:
> >>
> >> - an EBB contains three blocks
> >> - the register is live out of the first block and dies at that point
> >> - the register is not referenced in the second block
> >> - the register is defined in the third block
> >>
> >> the first block would get the live-out use, rather than the second.
> >> It would still be possible to move the definition up to the second
> >> block, if that proves to be useful.
> >
> > I think the key part here that tripped me up were calls to
> > replace_phi () during the dominator walk.  With those out of the way, it's
> > certainly much easier to reason about the order in which things
> > happen, and both approaches (add_phi_nodes ()- vs. place_phis ()-time)
> > can probably be made to work.
> >
> >>
> >> Bootstrapped & regression-tested on x86_64-linux-gnu and
> >> powerpc64le-linux-gnu.  Could you give it a spin on aarch64 too?
> >
> > No issues on aarch64-linux-gnu either, so AFAIC this should just go in
> > as-is.
>
> OK, great!  Thanks for the testing.
>
> OK to install?  I've reattached the patch just in case.

Ok.

>
> Richard
>
>

Reply via email to