> On Jun 10, 2019, at 11:55 AM, Josh Poimboeuf <jpoim...@redhat.com> wrote: > > On Mon, Jun 10, 2019 at 06:45:52PM +0000, Nadav Amit wrote: >>> On Jun 10, 2019, at 11:33 AM, Josh Poimboeuf <jpoim...@redhat.com> wrote: >>> >>> On Wed, Jun 05, 2019 at 03:08:06PM +0200, Peter Zijlstra wrote: >>>> --- a/arch/x86/include/asm/static_call.h >>>> +++ b/arch/x86/include/asm/static_call.h >>>> @@ -2,6 +2,20 @@ >>>> #ifndef _ASM_STATIC_CALL_H >>>> #define _ASM_STATIC_CALL_H >>>> >>>> +#include <asm/asm-offsets.h> >>>> + >>>> +#ifdef CONFIG_HAVE_STATIC_CALL_INLINE >>>> + >>>> +/* >>>> + * This trampoline is only used during boot / module init, so it's safe >>>> to use >>>> + * the indirect branch without a retpoline. >>>> + */ >>>> +#define __ARCH_STATIC_CALL_TRAMP_JMP(key, func) >>>> \ >>>> + ANNOTATE_RETPOLINE_SAFE \ >>>> + "jmpq *" __stringify(key) "+" __stringify(SC_KEY_func) "(%rip) \n" >>>> + >>>> +#else /* !CONFIG_HAVE_STATIC_CALL_INLINE */ >>> >>> I wonder if we can simplify this (and drop the indirect branch) by >>> getting rid of the above cruft, and instead just use the out-of-line >>> trampoline as the default for inline as well. >>> >>> Then the inline case could fall back to the out-of-line implementation >>> (by patching the trampoline's jmp dest) before static_call_initialized >>> is set. >> >> I must be missing some context - but what guarantees that this indirect >> branch would be exactly 5 bytes long? Isn’t there an assumption that this >> would be the case? Shouldn’t there be some handling of the padding? > > We don't patch the indirect branch. It's just part of a temporary > trampoline which is called by the call site, and which does "jmp > key->func" during boot until static call initialization is done. > > (Though I'm suggesting removing that.)
Oh... I see. On another note - even if this branch is only executed during module initialization, it does seem safer to use a retpoline instead of an indirect branch (consider a branch that is run many times on one hardware thread on SMT, when STIBP is not set, and attacker code is run on the second thread). I guess you don’t simply call the retpoline code so since you don’t have a clobbered register to hold the target. But it still seems possible to use a retpoline. Anyhow, it might be a moot discussion if this code is removed.