jyknight added a comment.
Thanks for the comments, they help a little. But it's still somewhat confusing,
so let me write down what seems to be happening:
- Before emitting every instruction, a new MCMachineDependentFragment is now
emitted, of one of the multiple types:
- For most instructions, that'll be BranchPrefix.
- For things that need branch-alignment, it'll be BranchPadding, unless
there's a fused conditional before, in which case it's BranchSplit
- For fused conditionals, it'll be FusedJccPadding.
- After emitting an instruction that needs branch-alignment, all of those
previously-emitted MCMachineDependentFragment are updated to point to the
branch's fragment.
- Thus, every MCDataFragment now only contains a single instruction (this
property is depended upon for getInstSize, at least).
All the MCMachineDependentFragments in a region bounded by a branch at the end
and either a branch or a fragment-type which is not type in {FT_Data,
FT_MachineDependent, FT_Relaxable, FT_CompactEncodedInst} at the beginning,
will reference the ending branch instruction's fragment.
Then, when it comes time to do relaxation, every one of those
machine-dependent-fragments has the opportunity to grow its instruction a
little bit. The first instruction in a "block" will grow up to 5 segment
prefixes (via modifying the BranchPrefix fragment), and then if more is needed,
more prefixes will be added to the next instruction, and so on. Until you run
out of instructions in the region. At which point the BranchPadding or
FusedJccPadding types (right before the branch/fused-branch) will be able to
emit nops to achieve the desired alignment.
An alternative would be to simply emit NOPs before branches as needed. That
would be substantially simpler, since it would only require special handling
for a branch or a fused-branch. I assume things were done this
substantially-more-complex way in order to reduce performance cost of inserting
NOP instructions? Are there numbers for how much better it is to use segment
prefixes, vs a separate nop instruction? It seems a little bit surprising to me
that it would be that important, but I don't know...
I'll note that the method here has the semantic issue of making it effectively
impossible to ever evaluate an expression like ".if . - symbol == 24" (assuming
we're emitting instructions), since every instruction can now change size. I
suspect that will make it impossible to turn this on by default without
breaking a lot of assembly code. Previously, only certain instructions, like
branches or arithmetic ops with constant arguments of unknown value, could
change size.
================
Comment at: llvm/lib/MC/MCAssembler.cpp:1058
+/// is the address of the symbol, which would casuse performance regression.
+void MCAssembler::moveSymbol(const MCFragment *Src, MCFragment *Dst) const {
+ if (!(Src && Dst && Dst->getKind() == MCFragment::FT_MachineDependent))
----------------
I don't think this is necessary. AFAICT, the symbols should already be in the
right place -- pointing to the relax fragment, not the instruction itself,
without this. And removing all this moveSymbol/updateSymbolMap code doesn't
make any tests fail.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D70157/new/
https://reviews.llvm.org/D70157
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits