> On Jan 10, 2019, at 9:20 AM, Josh Poimboeuf <jpoim...@redhat.com> wrote:
>
> On Thu, Jan 10, 2019 at 09:32:23AM +0000, Nadav Amit wrote:
>>> @@ -714,14 +714,39 @@ void *text_poke(void *addr, const void *opcode, size_t len)
>>>  	}
>>>  	BUG_ON(!pages[0]);
>>>  	local_irq_save(flags);
>>> +
>>>  	set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
>>>  	if (pages[1])
>>>  		set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
>>> -	vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
>>> -	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
>>> +
>>> +	vaddr = fix_to_virt(FIX_TEXT_POKE0) + ((unsigned long)addr & ~PAGE_MASK);
>>> +
>>> +	/*
>>> +	 * Use a single access where possible.  Note that a single unaligned
>>> +	 * multi-byte write will not necessarily be atomic on x86-32, or if the
>>> +	 * address crosses a cache line boundary.
>>> +	 */
>>> +	switch (len) {
>>> +	case 1:
>>> +		WRITE_ONCE(*(u8 *)vaddr, *(u8 *)opcode);
>>> +		break;
>>> +	case 2:
>>> +		WRITE_ONCE(*(u16 *)vaddr, *(u16 *)opcode);
>>> +		break;
>>> +	case 4:
>>> +		WRITE_ONCE(*(u32 *)vaddr, *(u32 *)opcode);
>>> +		break;
>>> +	case 8:
>>> +		WRITE_ONCE(*(u64 *)vaddr, *(u64 *)opcode);
>>> +		break;
>>> +	default:
>>> +		memcpy((void *)vaddr, opcode, len);
>>> +	}
>>> +
>>
>> Even if Intel and AMD CPUs are guaranteed to run instructions from L1
>> atomically, this may break instruction emulators, such as those that
>> hypervisors use. They might not read instructions atomically if on SMP VMs
>> when the VM's text_poke() races with the emulated instruction fetch.
>>
>> While I can't find a reason for hypervisors to emulate this instruction,
>> smarter people might find ways to turn it into a security exploit.
>
> Interesting point... but I wonder if it's a realistic concern.  BTW,
> text_poke_bp() also relies on undocumented behavior.
>
> The entire instruction doesn't need to be read atomically; just the
> 32-bit call destination.  Assuming the hypervisor is x86-64, and it uses
> a 32-bit access to read the call destination (which seems logical), the
> intra-cacheline reads will be atomic, as stated in the SDM.

At least in KVM, it doesn’t do so intentionally - eventually the emulated
fetch is done using __copy_from_user(). So now you rely on
__copy_from_user() doing it correctly.

> If the above assumptions are not true, and the hypervisor reads the call
> destination non-atomically (which seems unlikely IMO), even then I don't
> see how it could be realistically exploitable.  It would just oops from
> calling a corrupt address.

It might still be exploitable as a DoS, though (again, not that I can think
of exactly how). Having said that, I might be negative just because I’ve put
a lot of effort into avoiding this problem according to the SDM…
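
To make the "corrupt address" case concrete, here is a small userspace simulation. It is only a sketch, not kernel or KVM code: it assumes a reader that copies the 32-bit call displacement one byte at a time, lowest byte first, and gets interrupted by a single 32-bit update partway through. The function names (torn_rel32, call_target) are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/*
 * Simulate a torn read of a 32-bit call displacement.
 *
 * Assumption for this sketch: the reader copies the field byte by byte,
 * lowest byte first, and the writer replaces all 4 bytes with one store
 * after the reader has already copied bytes_before_update bytes.
 * The reader therefore sees the low bytes of the old value and the
 * high bytes of the new value.
 */
uint32_t torn_rel32(uint32_t old_rel, uint32_t new_rel,
		    unsigned int bytes_before_update)
{
	uint32_t old_mask;

	if (bytes_before_update >= 4)
		old_mask = 0xFFFFFFFFu;	/* reader finished before the write */
	else
		old_mask = (1u << (8 * bytes_before_update)) - 1;

	return (old_rel & old_mask) | (new_rel & ~old_mask);
}

/*
 * Target of a 5-byte x86 "call rel32" (opcode 0xE8): the displacement is
 * signed and relative to the address of the next instruction.
 */
uint64_t call_target(uint64_t insn_addr, uint32_t rel32)
{
	return insn_addr + 5 + (int64_t)(int32_t)rel32;
}
```

With an old displacement of 0x11223344 and a new one of 0xAABBCCDD, a reader interrupted after two bytes ends up with 0xAABB3344, which matches neither the old nor the new destination. A reader that uses one 32-bit access corresponds to bytes_before_update being 0 or 4, so it only ever sees one of the two valid values.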