On Fri, Dec 02, 2016 at 11:03:50AM -0800, Linus Torvalds wrote:
> I'd really rather just mark it noinline with a comment. That way the
> return from the function acts as the control flow change.
Something like below? It boots in a guest but that doesn't mean anything.

> 'sync_core()' doesn't help for other CPU's anyway, you need to do the
> cross-call IPI. So worrying about other CPU's is *not* a valid reason
> to keep a "sync_core()" call.

Yeah, no, I'm not crazy about it either - I was just sanity-checking all
call sites of apply_alternatives(). But as you say, we'd have much
bigger problems if other CPUs were to walk in there on us.

> Seriously, the only reason I can see for "sync_core()" really is:
>
>  - some deep non-serialized MSR access or similar (ie things like
>    firmware loading etc really might want it, and a machine check
>    might want it)

Yah, we do it in the #MC handler - apparently we need it there - and in
the microcode loader to tickle the version of the currently applied
microcode out into the MSR.

> The issues with modifying code while another CPU may be just about to
> access it is a separate issue. And as noted, "sync_core()" is not
> sufficient for that, you have to do a whole careful dance with
> single-byte debug instruction writes and then a final cross-call.
>
> See the whole "text_poke_bp()" and "text_poke()" for *that* whole
> dance. That's a much more complex thing than the normal
> apply_alternatives().

Yeah.

---
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 5cb272a7a5a3..b1d0c35e6dcb 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -346,7 +346,6 @@ static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
 
 	local_irq_save(flags);
 	add_nops(instr + (a->instrlen - a->padlen), a->padlen);
-	sync_core();
 	local_irq_restore(flags);
 
 	DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
@@ -359,9 +358,12 @@ static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
  * This implies that asymmetric systems where APs have less capabilities than
  * the boot processor are not handled. Tough. Make sure you disable such
  * features by hand.
+ *
+ * Marked "noinline" to cause control flow change and thus insn cache
+ * to refetch changed I$ lines.
  */
-void __init_or_module apply_alternatives(struct alt_instr *start,
-					 struct alt_instr *end)
+void __init_or_module noinline apply_alternatives(struct alt_instr *start,
+						  struct alt_instr *end)
 {
 	struct alt_instr *a;
 	u8 *instr, *replacement;
@@ -667,7 +669,6 @@ void *__init_or_module text_poke_early(void *addr, const void *opcode,
 	unsigned long flags;
 	local_irq_save(flags);
 	memcpy(addr, opcode, len);
-	sync_core();
 	local_irq_restore(flags);
 	/* Could also do a CLFLUSH here to speed up CPU recovery; but
 	   that causes hangs on some VIA CPUs. */

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.