On Fri, Dec 2, 2016 at 9:38 AM, Andy Lutomirski <l...@kernel.org> wrote:
>
> apply_alternatives, unfortunately.  It's performance-critical because
> it's intensely stupid and does sync_core() for every single patch.
> Fixing that would be nice, too.

So looking at text_poke_early(), that's very much a case that really
shouldn't need any "sync_core()" at all as far as I can tell.

Only the current CPU is running, and for local CPU I$ coherence all
you need is a jump instruction, and even that is only on really old
CPUs. From the PPro onwards (maybe even Pentium?) the I$ is entirely
serialized as long as you change the data using the same linear
address.

So at most, that function could mark itself "noinline" just to
guarantee that it will cause a control flow change before returning.
The sync_core() seems entirely bogus.

Same goes for optimize_nops() too.

               Linus
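
A minimal user-space sketch of what the suggestion above might look like,
not the kernel's actual code. The name text_poke_early_sketch() is
hypothetical, the target here is an ordinary byte buffer rather than live
kernel text, and anything else the real text_poke_early() does is left
out. It only illustrates the shape of the idea: patch through the same
address, skip sync_core(), and rely on the function being out of line so
that the return is a control flow change.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for text_poke_early(); an illustration only.
 * noinline (a GCC/Clang attribute) keeps this a real call, so returning
 * to the caller is a control flow change after the bytes are written.
 */
__attribute__((noinline))
void *text_poke_early_sketch(void *addr, const void *opcode, size_t len)
{
	/* Write the new bytes through the address they will be fetched from. */
	memcpy(addr, opcode, len);

	/* Deliberately no sync_core(): assumes only the current CPU is running. */
	return addr;
}
```

Calling it with a buffer of NOP bytes (0x90) and a two-byte patch
overwrites just the first two bytes and returns the buffer address.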