On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote:
> On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov
> <alexei.starovoi...@gmail.com> wrote:
> >
> > Clearly Paul's approach to retpoline without lfence is faster.
> > I'm guessing it wasn't shared with amazon/intel until now and
> > this set of patches going to adopt it, right?
> >
> > Paul, could you share a link to a set of alternative gcc patches
> > that do retpoline similar to llvm diff ?
> 
> What is the alternative approach? Is it literally just doing a
> 
>       call 1f
> 1:    mov real_target,(%rsp)
>        ret
> 
> on the assumption that the "ret" will always just predict to that "1"
> due to the call stack?

Pretty much.
Paul's writeup: https://support.google.com/faqs/answer/7625886
tldr: jmp *%r11 gets converted to:
call set_up_target;
capture_spec:
  pause;
  jmp capture_spec;
set_up_target:
  mov %r11, (%rsp);
  ret;
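where the capture_spec part is only ever executed speculatively: the
ret is predicted to return to capture_spec (the return address pushed by
the call), spins there on pause, and is squashed once the real target
written from %r11 is read back from the stack.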
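
For anyone who wants to poke at the idea, here is a toy model of the
sequence above. It is only a sketch under simplifying assumptions: the
return predictor is modelled as a plain stack that mirrors calls, and the
names ToyCPU, overwrite_return and retpoline_jmp are made up for
illustration. It says nothing about real RSB depth, underflow behaviour
or timing.

```python
# Toy model only: real CPUs differ in RSB depth, underflow handling, etc.
class ToyCPU:
    def __init__(self):
        self.stack = []  # architectural stack (memory at %rsp)
        self.rsb = []    # return stack buffer used by the predictor

    def call(self, return_addr):
        # A call records the return address in memory and in the predictor.
        self.stack.append(return_addr)
        self.rsb.append(return_addr)

    def overwrite_return(self, new_target):
        # mov %r11, (%rsp): changes memory only, the predictor is unaware.
        self.stack[-1] = new_target

    def ret(self):
        # Speculation follows the predictor; retirement follows memory.
        predicted = self.rsb.pop()
        actual = self.stack.pop()
        return predicted, actual


def retpoline_jmp(cpu, r11):
    cpu.call("capture_spec")   # call set_up_target
    cpu.overwrite_return(r11)  # mov %r11, (%rsp)
    return cpu.ret()           # ret


cpu = ToyCPU()
predicted, actual = retpoline_jmp(cpu, "real_target")
print("speculates to:", predicted)
print("retires to:   ", actual)
```

In this model the ret speculates into capture_spec while control
architecturally ends up at the value that was in %r11, which is the
property the sequence relies on.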
