On Fri, Dec 2, 2016 at 11:24 AM, Linus Torvalds
<torva...@linux-foundation.org> wrote:
> On Fri, Dec 2, 2016 at 11:20 AM, Borislav Petkov <b...@alien8.de> wrote:
>>
>> Something like below?
>
> The optimize-nops thing needs it too, I think.
>
> Again, this will never matter in practice (even if somebody has a i486
> still, the prefetch window size is like 16 bytes or something), but
> from a documentation standpoint it's good.

How's this?

/*
 * This function forces the icache and prefetched instruction stream to
 * catch up with reality in two very specific cases:
 *
 *  a) Text was modified using one virtual address and is about to be
 *     executed from the same physical page at a different virtual address.
 *
 *  b) Text was modified on a different CPU, may subsequently be
 *     executed on this CPU, and you want to make sure the new version
 *     gets executed.  This generally means you're calling this in an IPI.
 *
 * If you're calling this for a different reason, you're probably doing
 * it wrong.
 */
static inline void native_sync_core(void)
{
	...
}

The body will do a MOV-to-CR2 followed by jmp 1f; 1:.  This sequence
should be guaranteed to flush the pipeline on any real CPU.  On Xen
it will do IRET-to-self.

I suppose it could be an unconditional IRET-to-self, but that's a good
deal slower and not a whole lot simpler.  Although if we start doing
it right, performance won't really matter here.

--Andy
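
For illustration only, here is a rough sketch of the MOV-to-CR2 plus
jump sequence described above.  It is not the proposed patch: the names
native_sync_core_sketch and sync_core_demo are hypothetical, writing
zero to CR2 is an assumption (a real implementation would have to
decide what value is safe to put there), and the Xen IRET-to-self path
is left out.  Since MOV to CR2 is privileged, the non-kernel branch is
only a compiler barrier so the sketch can be built and called from
userspace; that branch does not serialize anything.

```c
/* Hypothetical sketch; the real body is elided ("...") in the mail above. */
static inline void native_sync_core_sketch(void)
{
#if defined(__KERNEL__) && (defined(__x86_64__) || defined(__i386__))
	unsigned long zero = 0;	/* assumption: the value written to CR2 */

	/*
	 * Ring 0 only.  A MOV to a control register is a serializing
	 * instruction; the jump to the local label that follows is there
	 * to discard anything that was already prefetched.
	 */
	__asm__ __volatile__("mov %0, %%cr2\n\t"
			     "jmp 1f\n"
			     "1:"
			     : : "r" (zero) : "memory");
#else
	/*
	 * Userspace stand-in: compiler barrier only.  This does NOT
	 * serialize the CPU; it exists so the sketch compiles and runs
	 * outside the kernel.
	 */
	__asm__ __volatile__("" : : : "memory");
#endif
}

/* Hypothetical wrapper so the sketch can be exercised from a test. */
int sync_core_demo(void)
{
	native_sync_core_sketch();
	return 0;
}
```

In case (b) from the comment, the call would presumably sit in the IPI
handler that runs on the CPU about to execute the modified text.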