================
@@ -0,0 +1,265 @@
+//===- AArch64SRLTDefineSuperRegs.cpp -------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// When SubRegister Liveness Tracking (SRLT) is enabled, this pass adds
+// extra implicit-def's to instructions that define the low N bits of
+// a GPR/FPR register to also define the top bits, because all AArch64
+// instructions that write the low bits of a GPR/FPR also implicitly zero
+// the top bits. For example, 'mov w0, w1' writes zeroes to the top 32-bits of
+// x0, so this pass adds a `implicit-def $x0` after register allocation.
+//
+// These semantics are originally represented in the MIR using `SUBREG_TO_REG`
+// which expresses that the top bits have been defined by the preceding
+// instructions, but during register coalescing this information is lost and in
+// contrast to when SRTL is disabled, when rewriting virtual -> physical
+// registers the implicit-defs are not added to the instruction.
+//
+// There have been several attempts to fix this in the coalescer [1], but each
+// iteration has exposed new bugs and the patch had to be reverted.
+// Additionally, the concept of adding 'implicit-def' of a virtual register is
+// particularly fragile and many places don't expect it (for example in
+// `X86::commuteInstructionImpl` the code only looks at specific operands and
+// does not consider implicit-defs. Similar in `SplitEditor::addDeadDef` where
+// it traverses operand 'defs' rather than 'all_defs').
+//
+// We want a temporary solution that doesn't impact other targets and is simpler
+// and less intrusive than the patch proposed for the register coalescer [1], so
+// that we can enable SRLT for AArch64.
+//
+// The approach here is to just add the 'implicit-def' manually after rewriting
+// virtual regs -> phsyical regs. This still means that during the register
+// allocation process the dependences are not accurately represented in the MIR
+// and LiveIntervals, but there are several reasons why we believe this isn't a
+// problem in practice:
+//  (A) The register allocator only spills entire virtual registers.
+//      This is additionally guarded by code in
+//      AArch64InstrInfo::storeRegToStackSlot/loadRegFromStackSlot
+//      where it checks if a register matches the expected register class.
+//  (B) Rematerialization only happens when the instruction writes the full
+//      register.
----------------
gbossu wrote:

Why would re-materialisation be a problem? I think in general the approach of adding the implicit defs **after** regalloc is fine because we have a single RegUnit, meaning the interferences will be correctly computed. That is a consequence of (C).

I think the real issue is if regalloc does some kind of coalescing of its own when we haven't added the implicit defs yet. In particular, we need to check how `isCopyInstrImpl` behaves with virtual registers. (AFAIU, it is safe because an `ORR` won't be recognised as a copy if it defines a subreg.)

https://github.com/llvm/llvm-project/pull/174188
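
To illustrate the single-RegUnit point, here is a minimal C++ sketch. It is a toy model and does not use LLVM's API: `ToyRegInfo`, `overlaps` and `makeToyAArch64` are hypothetical names, and the unit numbers are made up. The only assumption taken from the comment is that `w0` and `x0` are backed by the same single register unit.

```cpp
#include <map>
#include <set>
#include <string>

// Toy model (hypothetical, not LLVM's API): every physical register is
// described by the set of register units it covers.
using RegUnitSet = std::set<int>;

struct ToyRegInfo {
  std::map<std::string, RegUnitSet> Units;

  // Two registers interfere if they share at least one register unit.
  bool overlaps(const std::string &A, const std::string &B) const {
    for (int U : Units.at(A))
      if (Units.at(B).count(U))
        return true;
    return false;
  }
};

// Assumption for illustration: wN and xN map to the same single unit N.
ToyRegInfo makeToyAArch64() {
  ToyRegInfo RI;
  RI.Units["w0"] = {0};
  RI.Units["x0"] = {0};
  RI.Units["w1"] = {1};
  RI.Units["x1"] = {1};
  return RI;
}
```

In this model a definition of `w0` already occupies the only unit that `x0` has, so an interference check between the two succeeds whether or not an `implicit-def $x0` has been attached to the instruction yet.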
