On 04.06.25 23:58, Michael Kelley wrote:
From: Michael Kelley <mhkli...@outlook.com> Sent: Tuesday, June 3, 2025 10:25 AM
From: David Hildenbrand <da...@redhat.com> Sent: Tuesday, June 3, 2025 12:55 AM
On 03.06.25 03:49, Michael Kelley wrote:
From: David Hildenbrand <da...@redhat.com> Sent: Monday, June 2, 2025 2:48 AM
[snip]
@@ -182,20 +221,34 @@ static vm_fault_t fb_deferred_io_track_page(struct
fb_info *info, unsigned long
}
/*
- * We want the page to remain locked from ->page_mkwrite until
- * the PTE is marked dirty to avoid mapping_wrprotect_range()
- * being called before the PTE is updated, which would leave
- * the page ignored by defio.
- * Do this by locking the page here and informing the caller
- * about it with VM_FAULT_LOCKED.
+ * The PTE must be marked writable before the defio deferred work runs
+ * again and potentially marks the PTE write-protected. If the order
+ * should be switched, the PTE would become writable without defio
+ * tracking the page, leaving the page forever ignored by defio.
+ *
+ * For vmalloc() framebuffers, the associated struct page is locked
+ * before releasing the defio lock. mm will later mark the PTE writaable
+ * and release the struct page lock. The struct page lock prevents
+ * the page from being prematurely being marked write-protected.
+ *
+ * For FBINFO_KMEMFB framebuffers, mm assumes there is no struct page,
+ * so the PTE must be marked writable while the defio lock is held.
*/
- lock_page(pageref->page);
+ if (info->flags & FBINFO_KMEMFB) {
+ unsigned long pfn = page_to_pfn(pageref->page);
+
+ ret = vmf_insert_mixed_mkwrite(vmf->vma, vmf->address,
+ __pfn_to_pfn_t(pfn,
PFN_SPECIAL));
Will the VMA have VM_PFNMAP or VM_MIXEDMAP set? PFN_SPECIAL is a
horrible hack.
In another thread, you mention that you use PFN_SPECIAL to bypass the
check in vm_mixed_ok(), so VM_MIXEDMAP is likely not set?
The VMA has VM_PFNMAP set, not VM_MIXEDMAP. It seemed like
VM_MIXEDMAP is somewhat of a superset of VM_PFNMAP, but maybe that's
a wrong impression.
VM_PFNMAP: nothing is refcounted except anon pages
VM_MIXEDMAP: anything with a "struct page" (pfn_valid()) is refcounted
pte_special() is a way for GUP-fast to distinguish these refcounted (can
GUP) from non-refcounted (camnnot GUP) pages mapped by PTEs without any
locks or the VMA being available.
Setting pte_special() in VM_MIXEDMAP on ptes that have a "struct page"
(pfn_valid()) is likely very bogus.
OK, good to know.
vm_mixed_ok() does a thorough job of validating the
use of __vm_insert_mixed(), and since what I did was allowed, I thought
perhaps it was OK. Your feedback has set me straight, and that's what I
needed. :-)
What exactly are you trying to achieve? :)
If it's mapping a page with a "struct page" and *not* refcounting it,
then vmf_insert_pfn() is the current way to achieve that in a VM_PFNMAP
mapping. It will set pte_special() automatically for you.
Yes, that's what I'm using to initially create the special PTE in the
.fault callback.
But the whole approach is moot with Alistair Popple's patch set that
eliminates pfn_t. Is there an existing mm API that will do mkwrite on a
special PTE in a VM_PFNMAP VMA? I didn't see one, but maybe I missed
it. If there's not one, I'll take a crack at adding it in the next version of my
patch set.
I assume you'd want vmf_insert_pfn_mkwrite(), correct? Probably
vmf_insert_pfn_prot() can be used by adding PAGE_WRITE to pgprot. (maybe
:) )
Ok, I'll look at that more closely. The sequence is that the special
PTE gets created with vmf_insert_pfn(). Then when the page is first
written to, the .pfn_mkwrite callback is invoked by mm. The question
is the best way for that callback to mark the existing PTE as writable.
FWIW, vmf_insert_pfn_prot() won't work. It calls insert_pfn() with
the "mkwrite" parameter set to 'false', in which case insert_pfn()
does nothing if the PTE already exists.
Ah, you are worried about the "already exists but is R/O case".
So I would need to create a new API that does appropriate validation
for a VM_PFNMAP VMA, and then calls insert_pfn() with the "mkwrite"
parameter set to 'true'.
IMHO, nothing would speak against vmf_insert_pfn_mkwrite().
Much better than using that "mixed" ... beauty of a function.
--
Cheers,
David / dhildenb