On Thu, Mar 18, 2021 at 10:29:55PM -0500, Josh Poimboeuf wrote:
> On Thu, Mar 18, 2021 at 06:11:17PM +0100, Peter Zijlstra wrote:
> > When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
> > indirect call, have objtool rewrite it to:
> > 
> >     ALTERNATIVE "call __x86_indirect_thunk_\reg",
> >                 "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)
> > 
> > Additionally, in order to not emit endless identical
> > .altinstr_replacement chunks, use a global symbol for them, see
> > __x86_indirect_alt_*.
> > 
> > This also avoids objtool from having to do code generation.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
> 
> This is better than I expected.  Nice workaround for not generating
> code.

Thanks :-)

> > +.macro ALT_THUNK reg
> > +
> > +   .align 1
> > +
> > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_call_\reg)
> > +   ANNOTATE_RETPOLINE_SAFE
> > +1: call    *%\reg
> > +2: .skip   5-(2b-1b), 0x90
> > +SYM_FUNC_END(__x86_indirect_alt_call_\reg)
> > +
> > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_jmp_\reg)
> > +   ANNOTATE_RETPOLINE_SAFE
> > +1: jmp     *%\reg
> > +2: .skip   5-(2b-1b), 0x90
> > +SYM_FUNC_END(__x86_indirect_alt_jmp_\reg)
> 
> This mysterious code needs a comment.  Shouldn't it be in
> .altinstr_replacement or something?

Comment, yes, I suppose so. And no: if we stick it in
.altinstr_replacement it gets thrown away with initmem, and module
alternative patching (which will also refer to these symbols) will go
sideways.

> Also doesn't the alternative code already insert nops?

Problem is that the {call,jmp} *%\reg thing is not fixed length. They're
2 or 3 bytes depending on which register is picked.
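
To make the sizes concrete, here is a rough illustrative sketch, not part
of the patch. It assumes the usual x86-64 encoding: opcode 0xff with
ModRM /2 for call and /4 for jmp, plus a REX.B prefix (0x41) for
r8..r15. The names encode_indirect, pad_slot and SLOT_LEN are made up
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_LEN 5	/* size of the 5-byte direct call/jmp being replaced */

/*
 * Encode "call *%reg" (is_jmp == 0) or "jmp *%reg" (is_jmp != 0) into buf.
 * reg is the hardware register number: 0..7 for the legacy registers,
 * 8..15 for r8..r15. Returns the instruction length in bytes.
 */
size_t encode_indirect(uint8_t buf[SLOT_LEN], unsigned int reg, int is_jmp)
{
	size_t len = 0;

	assert(reg < 16);
	if (reg >= 8)
		buf[len++] = 0x41;	/* REX.B selects r8..r15 */
	buf[len++] = 0xff;		/* opcode shared by call and jmp */
	/* ModRM with mod=11: 0xd0 base is /2 (call), 0xe0 base is /4 (jmp) */
	buf[len++] = (uint8_t)((is_jmp ? 0xe0 : 0xd0) | (reg & 7));
	return len;
}

/*
 * Mimic ".skip 5-(2b-1b), 0x90": fill the rest of the slot with
 * single-byte NOPs. Returns the number of padding bytes written.
 */
size_t pad_slot(uint8_t buf[SLOT_LEN], size_t insn_len)
{
	assert(insn_len <= SLOT_LEN);
	memset(buf + insn_len, 0x90, SLOT_LEN - insn_len);
	return SLOT_LEN - insn_len;
}
```

So call *%rax comes out as ff d0 (2 bytes, 3 bytes of padding) and
jmp *%r11 as 41 ff e3 (3 bytes, 2 bytes of padding).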

We could make them all 3 long and insert 0,1 nop I suppose.

Initially alternatives wouldn't re-optimize nops on patched things; it
would simply add nops on. And I had the above be:

1:      INSN    *%\reg
2:      .nops   5-(2b-1b)

and we'd get a single right-sized nop. But the .nops directive is too
new; we support binutils that don't have it :/

Hence, it now reads:

2:      .skip   5-(2b-1b), 0x90

End result is the alternative NOP optimizer patch at the start of the
series, which now also optimizes a bunch of unrelated cases that were
previously missed -- but crucially, it covers this case too :-)
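
As a toy illustration of what "optimizing" the 0x90 padding means (this
is a simplified sketch, not the kernel's implementation): take the
trailing run of single-byte NOPs and replace it with one NOP of the same
total length. The table holds the commonly used x86-64 NOP encodings for
1..5 bytes; collapse_trailing_nops is a made-up name. It naively scans
bytes from the end, which is only safe here because the instruction in
front never ends in 0x90; a real implementation has to respect
instruction boundaries.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86-64 NOP encodings, indexed by length in bytes (index 0 unused). */
static const uint8_t nops[6][5] = {
	{ 0 },
	{ 0x90 },
	{ 0x66, 0x90 },
	{ 0x0f, 0x1f, 0x00 },
	{ 0x0f, 0x1f, 0x40, 0x00 },
	{ 0x0f, 0x1f, 0x44, 0x00, 0x00 },
};

/*
 * Replace the trailing run of 0x90 bytes in buf[0..len) with a single NOP
 * of the same length. Runs longer than 5 bytes only have their last 5
 * bytes rewritten. Returns the number of bytes covered by the new NOP
 * (0 if buf does not end in 0x90).
 */
size_t collapse_trailing_nops(uint8_t *buf, size_t len)
{
	size_t run = 0;

	while (run < len && buf[len - 1 - run] == 0x90)
		run++;
	if (run > 5)
		run = 5;
	if (run > 1)	/* a 1-byte run is already a single NOP */
		memcpy(buf + len - run, nops[run], run);
	return run;
}
```

Fed the padded "call *%rax" slot (ff d0 90 90 90), this leaves
ff d0 0f 1f 00: the same 5 bytes, but one 3-byte NOP instead of three
1-byte ones.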

Anyway, yes I could make it 3 long.

> > +int arch_rewrite_retpoline(struct objtool_file *file,
> > +                      struct instruction *insn,
> > +                      struct reloc *reloc)
> > +{
> > +   struct symbol *sym;
> > +   char name[32] = "";
> > +
> > +   if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
> > +           return 0;
> > +
> > +   sprintf(name, "__x86_indirect_alt_%s_%s",
> > +           insn->type == INSN_JUMP_DYNAMIC ? "jmp" : "call",
> > +           reloc->sym->name + 21);
> > +
> > +   sym = find_symbol_by_name(file->elf, name);
> > +   if (!sym) {
> > +           sym = elf_create_undef_symbol(file->elf, name);
> > +           if (!sym) {
> > +                   WARN("elf_create_undef_symbol");
> > +                   return -1;
> > +           }
> > +   }
> > +
> > +   elf_add_alternative(file->elf, insn, sym,
> > +                       ALT_NOT(X86_FEATURE_RETPOLINE), 5, 5);
> > +
> > +   return 0;
> > +}
> 
> Need to propagate the error.

Oh, indeed so.
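
For reference, the "+ 21" in the hunk above is
strlen("__x86_indirect_thunk_"), so what gets appended is just the
register name. A standalone sketch of that name derivation with the
failure cases reported to the caller follows; alt_symbol_name and
THUNK_PREFIX are hypothetical names for this example and nothing here is
objtool API.

```c
#include <stdio.h>
#include <string.h>

#define THUNK_PREFIX "__x86_indirect_thunk_"	/* 21 characters */

/*
 * Build "__x86_indirect_alt_{call,jmp}_<reg>" from a thunk symbol name.
 * Returns 0 on success, -1 if thunk lacks the expected prefix or the
 * result does not fit in out, so the caller can propagate the error.
 */
int alt_symbol_name(char *out, size_t size, const char *thunk, int is_jmp)
{
	size_t plen = strlen(THUNK_PREFIX);
	int n;

	if (strncmp(thunk, THUNK_PREFIX, plen))
		return -1;

	n = snprintf(out, size, "__x86_indirect_alt_%s_%s",
		     is_jmp ? "jmp" : "call", thunk + plen);
	if (n < 0 || (size_t)n >= size)
		return -1;	/* output error or truncated name */

	return 0;
}
```

The longest name this produces for a real register is 27 characters
(e.g. "__x86_indirect_alt_call_r11"), which fits the 32-byte buffer used
in the patch.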
