Yeah, it's a long-standing API deficiency inside QEMU that we don't have a way to do atomic modifications in things like page-table-walk code. Mostly you don't notice unless you go looking for it, but we really ought to fix this. Thanks for the unit test.
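For anyone following along, here is a minimal sketch of the race from the bug description and of a compare-and-swap style update that would avoid it. It is plain C11, not QEMU's actual page-table-walk code or atomics helpers; the function names and the `seen` parameter are made up for illustration. The only facts assumed are the x86 accessed (bit 5) and dirty (bit 6) flag positions.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_ACCESSED (UINT64_C(1) << 5) /* x86 "A" flag */
#define PTE_DIRTY    (UINT64_C(1) << 6) /* x86 "D" flag */

/*
 * Racy pattern: `seen` is the PTE value read earlier in the walk.
 * If the flags are missing, the whole entry is overwritten with
 * seen | flags, which discards any store another vCPU made to the
 * entry in the meantime.
 */
static void pte_set_flags_racy(_Atomic uint64_t *pte, uint64_t seen,
                               uint64_t flags)
{
    if ((seen & flags) != flags) {
        atomic_store(pte, seen | flags);
    }
}

/*
 * Compare-and-swap pattern: the write only happens if the entry still
 * holds the value the walk was based on. Returns false if the entry
 * changed underneath us, in which case a caller would have to restart
 * the walk with the new value.
 */
static bool pte_set_flags_cas(_Atomic uint64_t *pte, uint64_t seen,
                              uint64_t flags)
{
    if ((seen & flags) == flags) {
        return true; /* nothing to write */
    }
    return atomic_compare_exchange_strong(pte, &seen, seen | flags);
}
```

With the values from the test output, a walk that read 0x62d007 and then loses the race against a guest store of 0x62e007 leaves 0x62d027 in the entry with the first function, while the second leaves 0x62e007 in place and reports failure.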

** Changed in: qemu
       Status: New => Confirmed

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1911351

Title:
  x86-64 MTTCG Does not update page table entries atomically

Status in QEMU:
  Confirmed

Bug description:
  It seems like the QEMU TCG code for x86-64 doesn't write the accessed and dirty flags of page table entries atomically. Instead, it first reads the entry, checks whether the flags need to be set, and then overwrites the entry with the value it read plus the flags.

  So if you have two threads running at the same time, one accessing the virtual address over and over again and the other modifying the page table entry, it is possible that after the second thread modifies the page table entry, QEMU overwrites it with the old page table entry value, with the accessed/dirty flags set.

  Here's a unit test that reproduces this behavior:

  https://github.com/mvanotti/kvm-unit-tests/commit/09f9722807271226a714b04f25174776454b19cd

  You can run it with:

  ```
  /usr/bin/qemu-system-x86_64 --no-reboot -nodefaults \
    -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 \
    -vnc none -serial stdio -device pci-testdev \
    -smp 4 -machine q35 --accel tcg,thread=multi \
    -kernel x86/mmu-race.flat # -initrd /tmp/tmp.avvPpezMFf
  ```

  Expected output (failure):

  ```
  kvm-unit-tests$ make && /usr/bin/qemu-system-x86_64 --no-reboot -nodefaults -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -smp 4 -machine q35 --accel tcg,thread=multi -kernel x86/mmu-race.flat # -initrd /tmp/tmp.avvPpezMFf
  enabling apic
  enabling apic
  enabling apic
  enabling apic
  paging enabled
  cr0 = 80010011
  cr3 = 627000
  cr4 = 20
  found 4 cpus
  PASS: Need more than 1 CPU
  Detected overwritten PTE:
    want: 0x000000000062e007
    got: 0x000000000062d027
  FAIL: PTE not overwritten
  PASS: All Reads were zero
  SUMMARY: 3 tests, 1 unexpected failures
  ```

  This bug allows user-to-root privilege escalation inside the guest VM: if the user is able to overwrite an entry that belongs to a second-to-last level page table, and is able to allocate the referenced page, then the user would be in control of a last-level page table and could map any memory they want. This is not uncommon in situations where memory is being decommitted.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1911351/+subscriptions