On Thu, Mar 18, 2021 at 10:29:55PM -0500, Josh Poimboeuf wrote:
> On Thu, Mar 18, 2021 at 06:11:17PM +0100, Peter Zijlstra wrote:
> > When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
> > indirect call, have objtool rewrite it to:
> >
> > 	ALTERNATIVE "call __x86_indirect_thunk_\reg",
> > 		    "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)
> >
> > Additionally, in order to not emit endless identical
> > .altinst_replacement chunks, use a global symbol for them, see
> > __x86_indirect_alt_*.
> >
> > This also avoids objtool from having to do code generation.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
>
> This is better than I expected.  Nice workaround for not generating
> code.

Thanks :-)

> > +.macro ALT_THUNK reg
> > +
> > +	.align 1
> > +
> > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_call_\reg)
> > +	ANNOTATE_RETPOLINE_SAFE
> > +1:	call	*%\reg
> > +2:	.skip	5-(2b-1b), 0x90
> > +SYM_FUNC_END(__x86_indirect_alt_call_\reg)
> > +
> > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_jmp_\reg)
> > +	ANNOTATE_RETPOLINE_SAFE
> > +1:	jmp	*%\reg
> > +2:	.skip	5-(2b-1b), 0x90
> > +SYM_FUNC_END(__x86_indirect_alt_jmp_\reg)
>
> This mysterious code needs a comment.  Shouldn't it be in
> .altinstr_replacement or something?

Comment, yes, I suppose so. And no, if we stick it in
.altinstr_replacement we'll throw them away with initmem, and module
alternative patching (which will also refer to these symbols) will go
sideways.

> Also doesn't the alternative code already insert nops?

The problem is that the {call,jmp} *%\reg thing is not fixed length.
They're 2 or 3 bytes depending on which register is picked. We could
make them all 3 long by inserting 0 or 1 nop, I suppose.

Initially alternatives wouldn't re-optimize nops on patched things, it
would simply add nops on. And I had the above be:

1:	INSN	*%\reg
2:	.nops	5-(2b-1b)

and we'd get a single right-sized nop. But the .nops directive is too
new, we support binutils that don't have it :/

Hence, it now reads:

2:	.skip	5-(2b-1b), 0x90

The end result is the alternative NOP optimizer patch at the start of
the series, which now also optimizes a bunch of cases that are
unrelated and were previously missed -- but crucially, it covers this
case too :-)

Anyway, yes I could make it 3 long.

> > +int arch_rewrite_retpoline(struct objtool_file *file,
> > +			   struct instruction *insn,
> > +			   struct reloc *reloc)
> > +{
> > +	struct symbol *sym;
> > +	char name[32] = "";
> > +
> > +	if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
> > +		return 0;
> > +
> > +	sprintf(name, "__x86_indirect_alt_%s_%s",
> > +		insn->type == INSN_JUMP_DYNAMIC ? "jmp" : "call",
> > +		reloc->sym->name + 21);
> > +
> > +	sym = find_symbol_by_name(file->elf, name);
> > +	if (!sym) {
> > +		sym = elf_create_undef_symbol(file->elf, name);
> > +		if (!sym) {
> > +			WARN("elf_create_undef_symbol");
> > +			return -1;
> > +		}
> > +	}
> > +
> > +	elf_add_alternative(file->elf, insn, sym,
> > +			    ALT_NOT(X86_FEATURE_RETPOLINE), 5, 5);
> > +
> > +	return 0;
> > +}
>
> Need to propagate the error.

Oh, indeed so.
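
For anyone following along, here is a minimal standalone sketch (not part of the patch, and not objtool code) of the two bits of arithmetic discussed above. The first is deriving the __x86_indirect_alt_* name by skipping the 21-character "__x86_indirect_thunk_" prefix, which is what the "+ 21" in the quoted sprintf() does. The second is how many 0x90 bytes the .skip expression emits to pad a 2- or 3-byte indirect call/jmp out to 5 bytes. The names alt_symbol_name(), alt_pad_len(), THUNK_PREFIX and ALT_SLOT_LEN are made up for illustration. Unlike the quoted code, the sketch checks the prefix and the buffer size and returns -1 on failure, so a caller has something to propagate.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
#define THUNK_PREFIX	"__x86_indirect_thunk_"	/* 21 characters */
#define ALT_SLOT_LEN	5			/* padded size used by the .skip expression */

/*
 * Build "__x86_indirect_alt_{call,jmp}_<reg>" from a thunk symbol name
 * such as "__x86_indirect_thunk_rax".
 *
 * Returns 0 on success, -1 if the input lacks the expected prefix, has
 * no register suffix, or the result does not fit in buf.
 */
int alt_symbol_name(char *buf, size_t len, const char *thunk, int is_jump)
{
	size_t plen = strlen(THUNK_PREFIX);
	int n;

	assert(buf != NULL && thunk != NULL);

	if (strncmp(thunk, THUNK_PREFIX, plen) != 0 || thunk[plen] == '\0')
		return -1;

	n = snprintf(buf, len, "__x86_indirect_alt_%s_%s",
		     is_jump ? "jmp" : "call", thunk + plen);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncation */

	return 0;
}

/*
 * Number of 0x90 bytes that "5-(2b-1b)" evaluates to for an indirect
 * call/jmp of insn_len bytes. The mail above says these are 2 or 3
 * bytes depending on the register; anything else is rejected.
 */
int alt_pad_len(int insn_len)
{
	if (insn_len < 2 || insn_len > 3)
		return -1;

	return ALT_SLOT_LEN - insn_len;
}
```

With these, "__x86_indirect_thunk_r11" and is_jump set yields "__x86_indirect_alt_jmp_r11", and a 2-byte instruction gets 3 bytes of padding while a 3-byte one gets 2.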