On 3/3/20 5:16 PM, Alistair Francis wrote: > The RISC-V spec specifies that when a write happens and the D bit is > clear the implementation will set the bit in the PTE. It does not > describe that the PTE being dirty means that we should provide write > access. This patch removes the write access granted to pages when the > dirty bit is set.
The W bit by itself says we should provide write access. It is an implementation detail that we *withhold* write access when D is clear (etc) so that we can trap, so that we can properly set D in the future. The page table walk associated with a read is allowed to cache all of the information it finds in the PTE during that walk. Which includes the D bit. If D is set, then we do not need to set it in future, so we do not need to trap, so we can immediately honor the W bit. If the guest changes R/W/X within a PTE (e.g. for mprotect), it is obvious that a TLB flush for that page must be done. It is no different if the guest changes A/D (e.g. for swapping). > Setting write permission on dirty PTEs results in userspace inside a > Hypervisor guest (VU) becoming corrupted. This appears to be because it > ends up with write permission in the second stage translation in cases > where we aren't doing a store. You've not really given any more information than last time this patch came around. I still think this must be a guest (or nested guest) bug related to clearing PTE bits and failing to flush the TLB properly. I don't see how it could be a qemu tlb flushing bug. The only primitive, sfence.vma, is quite heavy-handed and explicitly local to the thread. It may be a bug in qemu's implementation of second stage paging. Which is not yet upstream, and I haven't gone digging in the archives to find the patch. r~