On Tue, Feb 24, 2026 at 4:09 AM Richard Sandiford <[email protected]> wrote: > > Artemiy Volkov <[email protected]> writes: > > On Mon, Feb 23, 2026 at 09:18:01PM +0000, Richard Sandiford wrote: > >> Artemiy Volkov <[email protected]> writes: > >> > On Wed, Feb 04, 2026 at 12:32:11AM +0000, Richard Sandiford wrote: > >> >> Artemiy Volkov <[email protected]> writes: > >> >> > [...] > >> >> > To address this, during initial traversal of a new EBB, we create an > >> >> > artificial live-out use at the bottom of the current BB when (a) a > >> >> > variable is live-in at the entry to the EBB, (b) it hasn't yet been > >> >> > redefined in the current EBB, and (c) it is redefined in the next BB. > >> >> > This live-out use serves as a barrier that restricts movement of > >> >> > new defs > >> >> > from later BBs having a new value to earlier ones still having the old > >> >> > live-in value. In the diagram, this live-out use will be created > >> >> > for r110 > >> >> > at the end of bb10. > >> >> > > >> >> > Since the def that this newly created live-out use sees has to be > >> >> > dominating (otherwise whenever the last use occurs after the > >> >> > current EBB, > >> >> > the movement will be restricted to an empty range), we create a > >> >> > new def at > >> >> > the beginning of the EBB. For this, we use the create_reg_use () > >> >> > function > >> >> > that will also create a degenerate PHI for this purpose when needed. > >> >> > > >> >> > One problem that we run into with this approach is that during RTL-SSA > >> >> > construction single-input PHI nodes are sometimes eliminated as > >> >> > unnecessary, which causes the last dominating def to be reverted to an > >> >> > earlier EBB and the move range restriction failing as before. To > >> >> > remedy > >> >> > this, we mark PHIs that we create as 'persistent' and do not eliminate > >> >> > them where previously they would be replaced by their single input. > >> >> > (To > >> >> > remain aligned with the design goal of minimizing memory overhead that > >> >> > RTL-SSA incurs, the m_has_been_superceded flag has been repurposed to > >> >> > track this in the phi_info class.) > >> >> > > >> >> > Bootstrapped and regtested on aarch64-linux-gnu; no performance > >> >> > regressions across SPEC2017. New testcase added as provided by Alex > >> >> > Coplan in the BZ, many thanks for that. > >> >> > >> >> Thanks for looking at this and for the nice explanation. > >> >> I agree that the underlying problem is the lack of a live-out use. > >> >> However, rather than add a separate mechanism for this case, I think we > >> >> should instead force a phi to be created from the outset. That will > >> >> make sure that EBBs with a single dominating live-in definition > >> >> (as here) are handled similarly to EBBs with potentially multiple > >> >> incoming live-in definitions. > >> >> > >> >> Also, rather than add a flag to the phi, I think we can test whether > >> >> the next definition is in the same EBB. This should also ensure that we > >> >> handle EBBs that start out with potentially multiple incoming live-in > >> >> definitions but eventually simplify to a single value. > >> > > >> > Hi Richard, apologies for the slow reply. > >> > > >> > I obviously agree that your approach is the right one to take. I can > >> > confirm that your patch fixes the original testcase; what is still > >> > missing > >> > for it to pass bootstrap/regtest is handling of chains of degenerate PHIs > >> > which didn't occur before. Where before your patch phi->input_value (0) > >> > would always point to a "real" definition (either an insn or a > >> > non-degenerate PHI), now a CFG can be conceived where you'd have to > >> > traverse a chain of (non-simplified) degenerate PHIs to get to that > >> > definition. > >> > >> Argh. I should have predicted that, sorry. > >> > >> > I've made two main fixes on top of your patch to cater to this new > >> > possibility: (1) in function_info::live_out_value (), when one > >> > degenerate > >> > PHI is feeding into another, we need to use the former as the input for > >> > the latter, as otherwise we'd be setting the latter's def to NULL; (2) in > >> > look_through_degenerate_phi (), we need to walk the PHI chain as > >> > explained > >> > before (this also necessitates a small assertion change in > >> > function_info::make_use_available ()). There will be a performance > >> > penalty associated with this walk in the worst case, but I believe that > >> > this should not be a problem on real-life inputs. > >> > >> I think look_through_degenerate_phi really does need to be O(1), > >> otherwise we risk quadratic complexity. > > > > Right, perhaps this needed to be a separate get_ultimate_def () function > > just for the needs of that one checking assert in > > function_info::make_use_available () ... > > > >> > >> So how about this compromise? It goes back to your original approach > >> of detecting the problematic case while building the EBB. But rather > >> than add the phis on-the-fly in record_block_live_out, the patch adds > >> them during add_phi_nodes. > >> > >> This preserves the property that I was trying to get from the proposal > >> above, namely that we would add the same live-out uses as we would for > >> a normal phi. > >> > >> For example, it means that if a register is live out from two blocks > >> in the EBB, we'd add live-out uses for both blocks. It also means that > >> if: > >> > >> - an EBB contains three blocks > >> - the register is live out of the first block and dies at that point > >> - the register is not referenced in the second block > >> - the register is defined in the third block > >> > >> the first block would get the live-out use, rather than the second. > >> It would still be possible to move the definition up to the second > >> block, if that proves to be useful. > > > > I think the key part here that tripped me up were calls to > > replace_phi () during the dominator walk. With those out of the way, it's > > certainly much easier to reason about the order in which things > > happen, and both approaches (add_phi_nodes ()- vs. place_phis ()-time) > > can probably be made to work. > > > >> > >> Bootstrapped & regression-tested on x86_64-linux-gnu and > >> powerpc64le-linux-gnu. Could you give it a spin on aarch64 too? > > > > No issues on aarch64-linux-gnu either, so AFAIC this should just go in > > as-is. > > OK, great! Thanks for the testing. > > OK to install? I've reattached the patch just in case.
Ok. > > Richard > >
