================
@@ -0,0 +1,265 @@
+//===- AArch64SRLTDefineSuperRegs.cpp 
-------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// When SubRegister Liveness Tracking (SRLT) is enabled, this pass adds
+// extra implicit-def's to instructions that define the low N bits of
+// a GPR/FPR register to also define the top bits, because all AArch64
+// instructions that write the low bits of a GPR/FPR also implicitly zero
+// the top bits.  For example, 'mov w0, w1' writes zeroes to the top 32-bits of
+// x0, so this pass adds a `implicit-def $x0` after register allocation.
+//
+// These semantics are originally represented in the MIR using `SUBREG_TO_REG`
+// which expresses that the top bits have been defined by the preceding
+// instructions, but during register coalescing this information is lost and in
+// contrast to when SRTL is disabled, when rewriting virtual -> physical
+// registers the implicit-defs are not added to the instruction.
+//
+// There have been several attempts to fix this in the coalescer [1], but each
+// iteration has exposed new bugs and the patch had to be reverted.
+// Additionally, the concept of adding 'implicit-def' of a virtual register is
+// particularly fragile and many places don't expect it (for example in
+// `X86::commuteInstructionImpl` the  code only looks at specific operands and
+// does not consider implicit-defs. Similar in `SplitEditor::addDeadDef` where
+// it traverses operand 'defs' rather than 'all_defs').
+//
+// We want a temporary solution that doesn't impact other targets and is 
simpler
+// and less intrusive than the patch proposed for the register coalescer [1], 
so
+// that we can enable SRLT for AArch64.
+//
+// The approach here is to just add the 'implicit-def' manually after rewriting
+// virtual regs -> phsyical regs. This still means that during the register
+// allocation process the dependences are not accurately represented in the MIR
+// and LiveIntervals, but there are several reasons why we believe this isn't a
+// problem in practice:
+// (A) The register allocator only spills entire virtual registers.
+//     This is additionally guarded by code in
+//     AArch64InstrInfo::storeRegToStackSlot/loadRegFromStackSlot
+//     where it checks if a register matches the expected register class.
+// (B) Rematerialization only happens when the instruction writes the full
+//     register.
----------------
gbossu wrote:

Why would re-materialisation be a problem? I think it general the approach of 
adding the implcit defs **after** regalloc is fine because we have a single 
RegUnit, meaning the interferences will be correctly computed. That is a 
consequence of (C).

I think the real issue is if regalloc does some kind of coalescing of its own 
when we haven't added the implicit defs yet. In particular, we need to check 
how `isCopyInstrImpl` behaves with virtual registers. (AFAIU, it is safe 
because a `ORR` won't be recognised as a copy if it defines a subreg)

https://github.com/llvm/llvm-project/pull/174188
_______________________________________________
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

Reply via email to