sashiko.dev --
https://sashiko.dev/#/patchset/[email protected] -- wrote:
> commit 069c98442d3532bbf015817229b8db505210e97d
> Author: Kiryl Shutsemau (Meta) <[email protected]>
> Subject: userfaultfd: add UFFD_FEATURE_RWP_ASYNC for async fault resolution
[ ... ]
> Does this sequence create a clean but writable PMD?
> [ ... ] Could this result in modified data being silently discarded
> instead of written back during page reclaim?
> Usually, code paths establishing writable entries set the dirty bit
> together with the write bit, for instance by using
> pmd_mkwrite(pmd_mkdirty(pmd)).
The pattern intentionally mirrors do_numa_page() / numa_rebuild_single_mapping(),
which have used the same sequence in the kernel for years:
	pte = pte_modify(old_pte, vma->vm_page_prot);
	pte = pte_mkyoung(pte);
	if (writable)
		pte = pte_mkwrite(pte, vma);
with no pte_mkdirty(). The "writable" decision is fenced by
can_change_pte_writable(), which keeps the result safe in both the
shared and private cases:
  - Private (can_change_private_pte_writable): only allows the upgrade
    for PageAnonExclusive pages.
  - Shared (can_change_shared_pte_writable): returns true only when
    pte_dirty(pte). The dirty bit lives in _PAGE_CHG_MASK, so the
    earlier pte_modify(pte, vma->vm_page_prot) preserves it; the final
    PTE is writable + dirty.
The same applies to the PMD path through can_change_pmd_writable().
There is no "clean + writable" PTE/PMD escaping either branch.
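
To make the shared-mapping half of that argument concrete, here is a
small userspace toy model. It is only a sketch: the TOY_* bit layout,
struct toy_vma and every toy_*() helper are hypothetical stand-ins made
up for illustration, not kernel code and not any architecture's real
PTE format. It assumes the dirty bit is among the bits the modify step
preserves and that vm_page_prot is read-only, then enumerates every old
entry and counts results that are writable but clean.

```c
#include <stdbool.h>

/* Hypothetical bit layout for the toy model only. */
#define TOY_PRESENT  (1u << 0)
#define TOY_WRITE    (1u << 1)
#define TOY_YOUNG    (1u << 2)
#define TOY_DIRTY    (1u << 3)

/* Bits the toy modify step carries over from the old entry (assumption). */
#define TOY_CHG_MASK (TOY_YOUNG | TOY_DIRTY)

struct toy_vma {
	bool shared;       /* stands in for VM_SHARED */
	unsigned int prot; /* stands in for vma->vm_page_prot */
};

/* Stand-in for pte_modify(): keep the preserved bits, take the rest from prot. */
static unsigned int toy_modify(unsigned int pte, unsigned int prot)
{
	return (pte & TOY_CHG_MASK) | prot;
}

/* Stand-in for the writable gate: private needs an exclusive anon page,
 * shared needs an entry that is already dirty. */
static bool toy_can_change_writable(const struct toy_vma *vma,
				    unsigned int pte, bool anon_exclusive)
{
	if (!vma->shared)
		return anon_exclusive;
	return (pte & TOY_DIRTY) != 0;
}

/* Same shape as the quoted sequence: modify, mkyoung, conditional mkwrite. */
static unsigned int toy_rebuild(const struct toy_vma *vma,
				unsigned int old_pte, bool anon_exclusive)
{
	unsigned int pte = toy_modify(old_pte, vma->prot);
	bool writable = toy_can_change_writable(vma, pte, anon_exclusive);

	pte |= TOY_YOUNG;
	if (writable)
		pte |= TOY_WRITE;
	return pte;
}

/* Count shared-mapping results that end up writable but not dirty. */
static int toy_count_clean_writable_shared(void)
{
	const struct toy_vma vma = { .shared = true, .prot = TOY_PRESENT };
	int bad = 0;

	for (unsigned int old = 0; old < 16; old++) {
		for (int excl = 0; excl < 2; excl++) {
			unsigned int pte = toy_rebuild(&vma, old, excl != 0);

			if ((pte & TOY_WRITE) && !(pte & TOY_DIRTY))
				bad++;
		}
	}
	return bad;
}
```

Under those assumptions toy_count_clean_writable_shared() finds no
offending entry, because the only way to reach the write bit in the
shared case is through an entry whose dirty bit survived the modify
step.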
> Similarly, does this create a clean but writable PTE?
> If the PTE is made writable without calling pte_mkdirty(), it might
> violate the invariant that writable PTEs must be dirty, [ ... ]
The "writable PTEs must be dirty" invariant is not a kernel-wide rule;
it depends on the architecture and the code path. Where the kernel
relies on pte_mkdirty() being called explicitly, can_change_pte_writable()
returns false and this path is not taken. do_uffd_rwp() has the same
shape as do_numa_page() and inherits its correctness arguments.
--
Kiryl Shutsemau / Kirill A. Shutemov