On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote:
> On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov
> <alexei.starovoi...@gmail.com> wrote:
> >
> > Clearly Paul's approach to retpoline without lfence is faster.

Using pause rather than lfence does not represent a fundamental difference here.

A protected indirect branch always adds ~25-30 cycles of overhead.

That this cost can be kept small in practice comes down to two key factors:
(1) Kernel code uses comparatively few indirect branches.
(2) The overhead can be avoided for hot indirect branches via devirtualization,
  e.g. the semantic equivalent of:
    if (ptr == foo)
      foo();
    else
      (*ptr)();
  This allows foo() to be called directly, even though it was provided as an
  indirect target.
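
A minimal, self-contained sketch of that devirtualization pattern follows.
The names handler_fn, bar() and dispatch() are hypothetical and exist only
for illustration; they are not taken from any kernel or compiler patch. The
point is just that the hot target is compared against first, so only the
fallback path pays for the protected indirect branch.

```c
/* Hypothetical handler type, for illustration only. */
typedef int (*handler_fn)(int);

/* The expected hot target. */
static int foo(int x) { return x + 1; }

/* Some other, colder target (hypothetical). */
static int bar(int x) { return x * 2; }

/*
 * Compare against the likely target first. On a match, foo() is reached
 * through a direct call, which the compiler is also free to inline.
 * Otherwise fall back to the indirect call, which is the branch a
 * retpoline-enabled build would have to protect.
 */
static int dispatch(handler_fn ptr, int arg)
{
    if (ptr == foo)
        return foo(arg);   /* direct call */
    return (*ptr)(arg);    /* indirect call */
}
```

With this shape, dispatch(foo, 1) takes the direct path and dispatch(bar, 3)
takes the indirect one; both return the same result as calling through the
pointer unconditionally.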

> > I'm guessing it wasn't shared with amazon/intel until now and
> > this set of patches going to adopt it, right?
> >
> > Paul, could you share a link to a set of alternative gcc patches
> > that do retpoline similar to llvm diff ?
> 
> What is the alternative approach? Is it literally just doing a
> 
>       call 1f
> 1:    mov real_target,(%rsp)
>        ret
> 
> on the assumption that the "ret" will always just predict to that "1"
> due to the call stack?
> 
>                 Linus
