On Wed, May 6, 2020 at 6:51 AM Peter Zijlstra <pet...@infradead.org> wrote:
>
> I was hoping for:
>
> bar:                                    # @bar
>         movl    %edi, .L_x$local(%rip)
>         retq
> ponies:                                 # @ponies
>         movq    .Lfoo$local(%rip), %rax
>         testq   %rax, %rax
>         jz      1f
>         jmpq    *%rcx                   # TAILCALL
> 1:
>         retq

If you want to just avoid the 'cmov', the best way to do that is to
insert a barrier() on one side of the if-statement. That breaks the
ability to turn the conditional jump into a cmov.

HOWEVER.

It looks like both clang and gcc will move the indirect jump to the
conditional sites, but then neither of them is smart enough to just
turn the indirect jump into one direct jump.

Strange. So you still get an indirect call for just the "ret" case.
The code looks actively stupid with

gcc:

  .L7:
        movl    $__static_call_nop, %eax
        jmp     *%rax

clang:

  .LBB1_1:
        mov     eax, offset __static_call_nop
        jmp     rax                     # TAILCALL

despite the barrier not being between those two points. The only
difference is the assembler syntax.

Odd. That's such a trivial and obvious optimization. But presumably
it's a pattern that just doesn't happen normally.

             Linus