skan marked 3 inline comments as done. skan added inline comments.
================ Comment at: llvm/test/MC/X86/x86-64-align-branch-1b.s:10 +# CHECK: 0000000000000000 foo: +# CHECK-NEXT: 0: 64 89 04 25 01 00 00 00 movl %eax, %fs:1 +# CHECK-NEXT: 8: 2e 55 pushq %rbp ---------------- MaskRay wrote: > I think 1a.s and 1b.s should be merged. FileCheck supports > > --check-prefixes=CHECK,PREFIX5 > --check-prefixes=CHECK,PREFIX1 > > ``` > CHECK: common part > CHECK-NEXT: common part > PREFIX5: > PREFIX5-NEXT: > PREFIX1: > PREFIX1-NEXT: > CHECK: > CHECK-NEXT: > ``` > > ``` > % diff -U1 x86-64-align-branch-1[ab].s > # CHECK: 0000000000000000 foo: > -# CHECK-NEXT: 0: 64 64 64 64 89 04 25 01 00 00 00 movl %eax, %fs:1 > -# CHECK-NEXT: b: 55 pushq %rbp > -# CHECK-NEXT: c: 55 pushq %rbp > -# CHECK-NEXT: d: 55 pushq %rbp > +# CHECK-NEXT: 0: 64 89 04 25 01 00 00 00 movl %eax, %fs:1 > +# CHECK-NEXT: 8: 2e 55 pushq %rbp > +# CHECK-NEXT: a: 2e 55 pushq %rbp > +# CHECK-NEXT: c: 2e 55 pushq %rbp > # CHECK-NEXT: e: 48 89 e5 movq %rsp, %rbp > ``` > > Is there performance benefit to add 4 prefixes to the same instruction? Thanks for the knowledge! There is no performance benefit to add 4 prefixes to the same instruction. The instruction `movl %eax, %fs:1` has already has a prefix %fs (0x64), if option `-x86-align-branch-prefix-size=5` is used, we could add 4 more prefixes at most. The branch to be aligned needs 3-byte padding, so 3 prefixes are added to the move instruction. ================ Comment at: llvm/test/MC/X86/x86-64-align-branch-1c.s:2 +# Check only fused conditional jumps and conditional jumps are aligned with option --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc --x86-align-branch-prefix-size=5 +# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc --x86-align-branch-prefix-size=5 %s | llvm-objdump -d - | FileCheck %s + ---------------- MaskRay wrote: > The difference between 1a and 1c is that 1c does not allow "jmp", but in 1a > no jmp instructions get a prefix in the test, so it is unclear why 1c has > different output. ``` # CHECK-NEXT: 45: 2e 89 45 fc movl %eax, %cs:-4(%rbp) # CHECK-NEXT: 49: 89 75 f4 movl %esi, -12(%rbp) # CHECK-NEXT: 4c: 89 7d f8 movl %edi, -8(%rbp) # CHECK-NEXT: 4f: 89 75 f4 movl %esi, -12(%rbp) # CHECK-NEXT: 52: 89 75 f4 movl %esi, -12(%rbp) # CHECK-NEXT: 55: 89 75 f4 movl %esi, -12(%rbp) # CHECK-NEXT: 58: 89 75 f4 movl %esi, -12(%rbp) # CHECK-NEXT: 5b: 89 75 f4 movl %esi, -12(%rbp) # CHECK-NEXT: 5e: 5d popq %rbp # CHECK-NEXT: 5f: 5d popq %rbp # CHECK-NEXT: 60: eb 26 jmp {{.*}} ``` The prefix 2e is added at ``` 45: 2e 89 45 fc movl %eax, %cs:-4(%rbp) ``` ================ Comment at: llvm/test/MC/X86/x86-64-align-branch-1e.s:46 +# CHECK-NEXT: 5c: eb 27 jmp {{.*}} +# CHECK-NEXT: 5e: 90 nop +# CHECK-NEXT: 5f: 90 nop ---------------- MaskRay wrote: > This is weird. Comparing this with 1d, 1e allows more instruction types, yet > it inserts two NOPs which actually seems to degrade performance. Not all the target cpu support long nop, you can find related code in X86AsmBackend::writeNopData. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/ https://reviews.llvm.org/D70157 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits