Re: [PATCH v1 02/11] mm: convert track_pfn_insert() to pfnmap_sanitize_pgprot()

2025-04-29 Thread Peter Xu
pper layers, e.g. the default pci_iomap() for MMIOs doesn't need to specify
anything on cache mode.  There are some pci_iomap_wc() variants, but it
still looks like only the internals have full control..

-- Peter Xu

Re: [PATCH v1 02/11] mm: convert track_pfn_insert() to pfnmap_sanitize_pgprot()

2025-04-29 Thread Peter Xu
On Mon, Apr 28, 2025 at 10:37:49PM +0200, David Hildenbrand wrote: > On 28.04.25 18:21, Peter Xu wrote: > > On Mon, Apr 28, 2025 at 04:58:46PM +0200, David Hildenbrand wrote: > > > > > > > > What it does on PAT (only implementation so far ...) is looking up the

Re: [PATCH v1 05/11] mm: convert VM_PFNMAP tracking to pfnmap_track() + pfnmap_untrack()

2025-04-28 Thread Peter Xu
fd_ctx;               /*   176     0 */

Not something that matters.. but just in case you didn't use the config
file you expected to use..

-- Peter Xu

Re: [PATCH v1 02/11] mm: convert track_pfn_insert() to pfnmap_sanitize_pgprot()

2025-04-28 Thread Peter Xu
ing up the memtype, so I would expect it is
> > guaranteed all pfns under the same vma is following the verified (and same)
> > memtype?
>
> The whole point of track_pfn_insert() is that it is used when we *don't* use
> reserve_pfn_range()->track_pfn_remap(), no?
>
> track_pfn_remap() would check the whole range that gets mapped, so
> track_pfn_insert() user must similarly check the whole range that gets
> mapped.
>
> Note that even track_pfn_insert() is already pretty clear on the intended
> usage: "called when a _new_ single pfn is established"

We need to define "new" then..  But I agree it's not crystal clear at least.

I think I just wasn't the first to assume it was reserved, see this
(especially, the "Expectation" part..):

commit 5180da410db6369d1f95c9014da1c9bc33fb043e
Author: Suresh Siddha
Date:   Mon Oct 8 16:28:29 2012 -0700

    x86, pat: separate the pfn attribute tracking for remap_pfn_range
    and vm_insert_pfn

    With PAT enabled, vm_insert_pfn() looks up the existing pfn memory
    attribute and uses it.  Expectation is that the driver reserves the
    memory attributes for the pfn before calling vm_insert_pfn().

-- Peter Xu
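[A minimal sketch of the expectation spelled out in that commit message,
assuming a driver that reserves the memory attributes at mmap time and
inserts pfns at fault time; my_driver_pfn_for() is hypothetical,
illustrative only:]

    #include <linux/mm.h>

    /* Hypothetical fault handler; a sketch, not any driver's real code. */
    static vm_fault_t my_driver_fault(struct vm_fault *vmf)
    {
            /* my_driver_pfn_for() is a hypothetical per-driver lookup. */
            unsigned long pfn = my_driver_pfn_for(vmf->pgoff);

            /*
             * The "Expectation" above: the memory attributes for this pfn
             * range were reserved earlier (e.g. when the region was set up
             * at mmap time), so the memtype lookup done at insert time
             * finds a verified cache mode instead of guessing one.
             */
            return vmf_insert_pfn(vmf->vma, vmf->address, pfn);
    }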

Re: [PATCH v1 05/11] mm: convert VM_PFNMAP tracking to pfnmap_track() + pfnmap_untrack()

2025-04-28 Thread Peter Xu
On Fri, Apr 25, 2025 at 10:36:55PM +0200, David Hildenbrand wrote: > On 25.04.25 22:23, Peter Xu wrote: > > On Fri, Apr 25, 2025 at 10:17:09AM +0200, David Hildenbrand wrote: > > > Let's use our new interface. In remap_pfn_range(), we'll now decide > > > wheth

Re: [PATCH v1 02/11] mm: convert track_pfn_insert() to pfnmap_sanitize_pgprot()

2025-04-25 Thread Peter Xu
On Fri, Apr 25, 2025 at 09:48:50PM +0200, David Hildenbrand wrote: > On 25.04.25 21:31, Peter Xu wrote: > > On Fri, Apr 25, 2025 at 10:17:06AM +0200, David Hildenbrand wrote: > > > ... by factoring it out from track_pfn_remap(). > > > > > > For PMDs/PUDs

Re: [PATCH v1 05/11] mm: convert VM_PFNMAP tracking to pfnmap_track() + pfnmap_untrack()

2025-04-25 Thread Peter Xu
it slightly slower than a vma->pfnmap_track_ctx ref when looking up the pfn
while holding a vma ref, but I assume that's ok considering that
track/untrack should be slow paths for pfnmaps, and there shouldn't be a
huge lot of pfnmaps.

I didn't think further, but if that works it'll definitely avoid the
additional fields on x86 vmas.  I'm curious whether you explored that
direction, or maybe it's a known decision that the 8 bytes isn't a concern.

Thanks,

-- Peter Xu
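[For illustration, a rough sketch of the kind of refcounted tracking
context being discussed as vma->pfnmap_track_ctx; the names and fields are
assumptions, not the actual patch's definitions:]

    /* Illustrative sketch only; not the series' real definition. */
    struct pfnmap_track_ctx {
            struct kref kref;       /* shared across vma copies, e.g. on fork */
            unsigned long pfn;      /* first pfn of the tracked range */
            unsigned long size;     /* size of the tracked range in bytes */
    };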

Re: [PATCH v1 04/11] mm/memremap: convert to pfnmap_track() + pfnmap_untrack()

2025-04-25 Thread Peter Xu
(range));
>  err_kasan:
> -	untrack_pfn(NULL, PHYS_PFN(range->start), range_len(range), true);
> +	pfnmap_untrack(PHYS_PFN(range->start), range_len(range));

Not a huge deal, but maybe we could merge this and previous patch?  It
might be easier to reference the impl when reading the call site changes.

>  err_pfn_remap:
> 	pgmap_array_delete(range);
> 	return error;
> --
> 2.49.0
>

-- Peter Xu

Re: [PATCH v1 02/11] mm: convert track_pfn_insert() to pfnmap_sanitize_pgprot()

2025-04-25 Thread Peter Xu
563,7 @@ vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
> 	if (!pfn_modify_allowed(pfn, pgprot))
> 		return VM_FAULT_SIGBUS;
>
> -	track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));
> +	pfnmap_sanitize_pgprot(pfn, PAGE_SIZE, &pgprot);
>
> 	return insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot,
> 			  false);
> @@ -2626,7 +2626,7 @@ static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma,
> 	if (addr < vma->vm_start || addr >= vma->vm_end)
> 		return VM_FAULT_SIGBUS;
>
> -	track_pfn_insert(vma, &pgprot, pfn);
> +	pfnmap_sanitize_pgprot(pfn_t_to_pfn(pfn), PAGE_SIZE, &pgprot);
>
> 	if (!pfn_modify_allowed(pfn_t_to_pfn(pfn), pgprot))
> 		return VM_FAULT_SIGBUS;
> --
> 2.49.0
>

-- Peter Xu
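[Conceptually the new helper keeps what track_pfn_insert() did on x86/PAT:
look up the verified cache mode for the pfn and fold it into the caller's
pgprot.  A hedged sketch of that shape, using the x86 PAT helpers; treat it
as an illustration, not the series' exact code:]

    /* Illustrative sketch of the x86/PAT behavior, not the exact patch. */
    static void sanitize_pgprot_sketch(unsigned long pfn, pgprot_t *prot)
    {
            enum page_cache_mode pcm;

            /* Ask PAT for the cache mode previously verified for this pfn. */
            pcm = lookup_memtype(PFN_PHYS(pfn));

            /* Replace the cache bits in the caller's pgprot with that mode. */
            *prot = __pgprot((pgprot_val(*prot) & ~_PAGE_CACHE_MASK) |
                             cachemode2protval(pcm));
    }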

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-08-03 Thread Peter Xu
guess people are just waiting for hardware vendors to support device page
faults, like processors do - then we can synchronize with the device using
the device IOMMU page tables (by clearing them at the proper time, which
blocks DMA writes).

Thanks,

-- Peter Xu

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-08-01 Thread Peter Xu
ing ram discard just like what vfio does already. Thanks, -- Peter Xu

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-07-31 Thread Peter Xu
ot after VM started there, so either not affected or maybe there's a chance
it'll work.  IIUC it's then the same as when VFIO is attached and we try to
blow some pages away with anything like virtio-balloon - AFAIR QEMU just
explicitly doesn't allow that to happen.  See
vfio_ram_block_discard_disable().

> > > FOLL_GET). The good thing is then the fd can be the guest memory file
> > itself. With that, we can mmap() over the shmem/hugetlb in whatever vma
> > and whatever process. Truncation (and actually everything... e.g. page
> > migration, swapping, ... which will be disabled if we use PFNMAP pins) will
> > just all start to work, afaiu.

> IIUC, we'd not be able to use the fd of the guest memory file because the
> dmabuf fds are expected to have constant size that reflects the size of the
> buffer that is being shared. I just don't think it'd be feasible given all the
> other restrictions:
> https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html?highlight=dma_buf#userspace-interface-notes

Yeah I also don't know the dmabuf APIs well, but I think if the page must
be pinned for real world DMA then it's already another story to me..  what
I said on the [guest_mem_fd, offset_array] tuple idea could only (if still
possible..) work if the udmabuf access is only from the processor side,
never from the device.

Thanks,

-- Peter Xu

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-07-27 Thread Peter Xu
n be e.g. a list of file offsets that point to the pages (rather than
pinning the pages using FOLL_GET).  The good thing is then the fd can be
the guest memory file itself.  With that, we can mmap() over the
shmem/hugetlb in whatever vma and whatever process.  Truncation (and
actually everything... e.g. page migration, swapping, ... which will be
disabled if we use PFNMAP pins) will just all start to work, afaiu.

Thanks,

-- Peter Xu
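[A rough sketch of the shape of that [guest_mem_fd, offset_array] tuple
idea, purely as an assumption - no such API exists:]

    /*
     * Hypothetical: describe a buffer by the guest memory file plus
     * per-page file offsets, instead of holding FOLL_GET references
     * on the pages themselves.
     */
    struct guest_mem_buf {
            struct file *guest_mem_file;    /* the guest memory file itself */
            unsigned long nr_pages;
            loff_t *offsets;                /* one file offset per buffer page */
    };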

Re: [PATCH v2 2/2] udmabuf: Add back support for mapping hugetlb pages (v2)

2023-07-27 Thread Peter Xu
tlb and shmem
> > pages is different. hugetlb requires the pages exist in the page cache
> > while shmem will create/add pages to the cache if necessary.
>
> Ok; got it.

Would it be nice to do the same for both (perhaps always try to allocate)?

Thanks,

-- Peter Xu

Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-06-27 Thread Peter Xu
On Tue, Jun 27, 2023 at 12:52:34PM -0300, Jason Gunthorpe wrote: > On Mon, Jun 26, 2023 at 03:04:21PM -0400, Peter Xu wrote: > > On Mon, Jun 26, 2023 at 03:18:48PM -0300, Jason Gunthorpe wrote: > > > On Mon, Jun 26, 2023 at 08:14:27PM +0200, David Hildenbrand wrote: > > >

Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-06-26 Thread Peter Xu
st udmabuf should only work with shared memory; not sure about iommufd,
though.

-- Peter Xu

Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-06-26 Thread Peter Xu
> Having said that, one option I could think of is to probably install a
> mmu_notifier associated with the relevant pages in udmabuf and once we get
> notified about any invalidation event concerning any of the pages, we'd
> fail any subsequent attempt to access these pages and propagate the error
> across the stack.

Sounds right, maybe it needs to go back to the old GUP solution, though, as
mmu notifiers are also mm-based, not fd-based.  Or to be explicit, I think
it'll be pin_user_pages(FOLL_LONGTERM) with the new API.  It'll also solve
the movable pages issue on pinning.

> However, it feels like udmabuf is not the right place to handle this issue
> because there are very limited options for taking proper corrective action
> if Qemu decides to punch a hole in Guest memory that takes out pages that
> are pinned.

I'm not familiar with the use case that much, but IMHO it's fine if the
driver relies on proper behavior of the user app to work.  IIUC the worst
case here is the udmabuf holds some pages that are not the pages of the
guest mem anymore, but that only happens with misbehaved userspace, so it
looks all fine as long as they can at least be properly released when
releasing the udmabuf.  It'll be better if the udmabuf can fail hard when
detecting this, but IMHO even that can be optional depending on the need,
while any corrective action would be even one step further.

Thanks,

-- Peter Xu
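[A minimal sketch of the long-term pin being suggested, assuming a user
address range backing the buffer; illustrative only, not the udmabuf code:]

    /*
     * Illustrative only: a FOLL_LONGTERM pin keeps the pages from being
     * migrated or reclaimed (and migrates them out of movable zones
     * first), which also addresses the movable-pages concern above.
     */
    static long pin_buffer(unsigned long uaddr, int npages,
                           struct page **pages)
    {
            long pinned = pin_user_pages_fast(uaddr, npages,
                                              FOLL_LONGTERM | FOLL_WRITE,
                                              pages);

            if (pinned < 0)
                    return pinned;                  /* hard error */
            if (pinned != npages) {
                    unpin_user_pages(pages, pinned);  /* partial: undo */
                    return -EFAULT;
            }
            return pinned;
    }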

Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-06-23 Thread Peter Xu
On Fri, Jun 23, 2023 at 01:37:58PM -0300, Jason Gunthorpe wrote: > On Fri, Jun 23, 2023 at 12:35:45PM -0400, Peter Xu wrote: > > > It seems the previous concern on using gup was majorly fork(), if this is > > it: > > > > https://patchwork.freedesktop.org/pa

Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-06-23 Thread Peter Xu
> > There are *probably* more issues on the QEMU side when udmabuf is paired
> > with things like MADV_DONTNEED/FALLOC_FL_PUNCH_HOLE used for
> > virtio-balloon, virtio-mem, postcopy live migration, ... for example, in
> > the vfio/vdpa case we make sure that we disallow most of these, because
> > otherwise there can be an accidental "disconnect" between the pages
> > mapped into the VM (guest view) and the pages mapped into the IOMMU
> > (device view), for example, after a reboot.

> Ok; I am not sure if I can figure out if there is any acceptable way to
> address these issues but given the current constraints associated with
> udmabuf, what do you suggest is the most reasonable way to deal with these
> problems you have identified?
>
> Thanks,
> Vivek
>
> > --
> > Cheers,
> >
> > David / dhildenb

-- Peter Xu

Re: [PATCH v9 02/14] mm: move page zone helpers from mm.h to mmzone.h

2023-06-15 Thread Peter Xu
On Thu, Jun 15, 2023 at 09:15:26PM +0100, Matthew Wilcox wrote: > On Thu, Jun 15, 2023 at 03:33:12PM -0400, Peter Xu wrote: > > My question is whether page_zonenum() is ready for taking all kinds of tail > > pages? > > > > Zone device tail pages all look fine, per memm

Re: [PATCH v9 02/14] mm: move page zone helpers from mm.h to mmzone.h

2023-06-15 Thread Peter Xu
; is_pci_p2pdma_page -> is_zone_device_page -> page_zonenum

I'm worried it'll just read fake things, but maybe I just missed something?

Thanks,

-- Peter Xu

Re: [PATCH v10 07/10] mm: Device exclusive memory access

2021-06-15 Thread Peter Xu
On Tue, Jun 15, 2021 at 01:08:11PM +1000, Alistair Popple wrote: > On Saturday, 12 June 2021 1:01:42 AM AEST Peter Xu wrote: > > On Fri, Jun 11, 2021 at 01:43:20PM +1000, Alistair Popple wrote: > > > On Friday, 11 June 2021 11:00:34 AM AEST Peter Xu wrote: > > > >

Re: [PATCH v10 07/10] mm: Device exclusive memory access

2021-06-11 Thread Peter Xu
On Fri, Jun 11, 2021 at 01:43:20PM +1000, Alistair Popple wrote: > On Friday, 11 June 2021 11:00:34 AM AEST Peter Xu wrote: > > On Fri, Jun 11, 2021 at 09:17:14AM +1000, Alistair Popple wrote: > > > On Friday, 11 June 2021 9:04:19 AM AEST Peter Xu wrote: > > > > On F

Re: [PATCH v10 07/10] mm: Device exclusive memory access

2021-06-10 Thread Peter Xu
On Fri, Jun 11, 2021 at 09:17:14AM +1000, Alistair Popple wrote: > On Friday, 11 June 2021 9:04:19 AM AEST Peter Xu wrote: > > External email: Use caution opening links or attachments > > > > > > On Fri, Jun 11, 2021 at 12:21:26AM +1000, Alistair Popple wrote: > &

Re: [PATCH v10 07/10] mm: Device exclusive memory access

2021-06-10 Thread Peter Xu
ns_huge() should return true above, so we'll skip follow_page_pte(); then we'll check FOLL_SPLIT_PMD and do the split, then the CLEAR notify. Hmm.. Did I miss something? -- Peter Xu

Re: [PATCH v10 07/10] mm: Device exclusive memory access

2021-06-10 Thread Peter Xu
s it also mean that if there's a real THP it won't really work? As then FOLL_SPLIT_PMD will start to trigger that CLEAR notify too, I think.. -- Peter Xu

Re: [PATCH v10 07/10] mm: Device exclusive memory access

2021-06-09 Thread Peter Xu
On Wed, Jun 09, 2021 at 07:38:04PM +1000, Alistair Popple wrote: > On Wednesday, 9 June 2021 4:33:52 AM AEST Peter Xu wrote: > > On Mon, Jun 07, 2021 at 05:58:52PM +1000, Alistair Popple wrote: > > > > [...] > > > > > +static bool page_make_de

Re: [PATCH v10 07/10] mm: Device exclusive memory access

2021-06-08 Thread Peter Xu
L_SPLIT_PMD,
> +				       pages, NULL, NULL);
> +	for (i = 0; i < npages; i++, start += PAGE_SIZE) {
> +		if (!trylock_page(pages[i])) {
> +			put_page(pages[i]);
> +			pages[i] = NULL;
> +			continue;
> +		}
> +
> +		if (!page_make_device_exclusive(pages[i], mm, start, owner)) {
> +			unlock_page(pages[i]);
> +			put_page(pages[i]);
> +			pages[i] = NULL;
> +		}
> +	}
> +
> +	return npages;
> +}
> +EXPORT_SYMBOL_GPL(make_device_exclusive_range);
> +#endif
> +
>  void __put_anon_vma(struct anon_vma *anon_vma)
>  {
>  	struct anon_vma *root = anon_vma->root;
> --
> 2.20.1
>

-- Peter Xu

Re: [PATCH v10 06/10] mm/memory.c: Allow different return codes for copy_nonpresent_pte()

2021-06-08 Thread Peter Xu
_WARN_ON_ONCE(1);
>  	}
> +
> +	/* We've captured and resolved the error. Reset, try again. */

Maybe better as:

	/*
	 * We've resolved all error even if there is, reset error code and try
	 * again if necessary.
	 */

as it also covers the no-error path.  But I guess not a big deal..

Reviewed-by: Peter Xu

Thanks,

> +	ret = 0;
> +
>  	if (addr != end)
>  		goto again;
>  out:
> --
> 2.20.1
>

-- Peter Xu

Re: [PATCH v10 05/10] mm: Rename migrate_pgmap_owner

2021-06-08 Thread Peter Xu
> can also benefit from this filtering, so rename the
> 'migrate_pgmap_owner' field to 'owner' and create a new notifier
> initialisation function to initialise this field.
>
> Signed-off-by: Alistair Popple
> Suggested-by: Peter Xu

Reviewed-by: Peter Xu

-- Peter Xu

Re: [PATCH v9 07/10] mm: Device exclusive memory access

2021-06-04 Thread Peter Xu
On Fri, Jun 04, 2021 at 11:07:42AM +1000, Alistair Popple wrote: > On Friday, 4 June 2021 12:47:40 AM AEST Peter Xu wrote: > > External email: Use caution opening links or attachments > > > > On Thu, Jun 03, 2021 at 09:39:32PM +1000, Alistair Popple wrote: > > > Rec

Re: [PATCH v9 07/10] mm: Device exclusive memory access

2021-06-03 Thread Peter Xu
code. Could you elaborate? -- Peter Xu

Re: [PATCH v9 07/10] mm: Device exclusive memory access

2021-06-02 Thread Peter Xu
age should not account for use by the device
> (are GPU workloads - access once and discard?)

Hmm, besides the aging info, this reminded me: do we need to isolate the
page from the lru too when marking device exclusive access?

Afaict the current patch didn't do that so I think it's reclaimable.  If we
still have the rmap then we'll get a mmu notify CLEAR when unmapping that
special pte, so device driver should be able to drop the ownership.
However we dropped the rmap when marking exclusive.  Now I don't know
whether and how it'll work if page reclaim runs with the page being
exclusively owned, if without isolating the page..

-- Peter Xu
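[For reference, a sketch of the isolation step being asked about, assuming
the isolate_lru_page()/putback_lru_page() helpers of that era; illustrative
only - the patch under review did not do this:]

    /*
     * Illustrative only: keep reclaim away while the device owns the page.
     * Assumes the caller already holds a reference (e.g. from GUP).
     */
    static bool isolate_for_device(struct page *page)
    {
            if (isolate_lru_page(page))
                    return false;   /* not on the LRU, or already isolated */
            return true;
    }

    static void release_from_device(struct page *page)
    {
            /* Put the page back on the LRU once exclusive access ends. */
            putback_lru_page(page);
    }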

Re: [PATCH v9 07/10] mm: Device exclusive memory access

2021-05-28 Thread Peter Xu
ling split_huge_page() rather than split_huge_pmd().

But shouldn't FOLL_SPLIT_PMD have filled in small pfns for each pte?  See
__split_huge_pmd_locked():

	for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
		...
		} else {
			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
			...
		}
		...
		set_pte_at(mm, addr, pte, entry);
	}

Then iiuc the coming follow_page_pte() will directly fetch the small pages?

-- Peter Xu

Re: [PATCH v9 07/10] mm: Device exclusive memory access

2021-05-27 Thread Peter Xu
		ret = -EBUSY;
	} else {
		spin_unlock(ptl);
		split_huge_pmd(vma, pmd, address);
		ret = pte_alloc(mm, pmd) ? -ENOMEM : 0;
	}

	return ret ? ERR_PTR(ret) :
		follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);

So I thought all pages are small pages?

-- Peter Xu

Re: [PATCH v9 06/10] mm/memory.c: Allow different return codes for copy_nonpresent_pte()

2021-05-26 Thread Peter Xu
On Thu, May 27, 2021 at 11:20:36AM +1000, Alistair Popple wrote: > On Thursday, 27 May 2021 5:50:05 AM AEST Peter Xu wrote: > > On Mon, May 24, 2021 at 11:27:21PM +1000, Alistair Popple wrote: > > > Currently if copy_nonpresent_pte() returns a non-zero value it is > > >

Re: [PATCH v9 06/10] mm/memory.c: Allow different return codes for copy_nonpresent_pte()

2021-05-26 Thread Peter Xu
EM;

		/* We've captured and resolved the error. Reset, try again. */
		ret = 0;
	}

We didn't reset "ret" in the entry.val case (maybe we should?).  Then in
the next round of "goto again", if "ret" is unluckily untouched, it could
reach the 2nd if check, and I think it could cause an unexpected
page_copy_prealloc().

-- Peter Xu
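[A sketch of the reset being suggested for the entry.val case; the
surrounding copy_pte_range() tail here is paraphrased, not the exact patch:]

    	/* Paraphrased tail of copy_pte_range() before the "again" loop. */
    	if (entry.val) {
    		if (add_swap_count_continuation(entry, GFP_KERNEL) < 0)
    			return -ENOMEM;
    		entry.val = 0;
    		/*
    		 * Suggested: also reset ret here so a stale value can't
    		 * survive into the next "again" round and trigger an
    		 * unexpected page_copy_prealloc().
    		 */
    		ret = 0;
    	}
    	if (addr != end)
    		goto again;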

Re: [PATCH v9 05/10] mm: Rename migrate_pgmap_owner

2021-05-26 Thread Peter Xu
callers are unaffected.  Not a big deal, though:

Reviewed-by: Peter Xu

Thanks,

-- Peter Xu

Re: [PATCH v9 07/10] mm: Device exclusive memory access

2021-05-26 Thread Peter Xu
> ---
>
> v9:
> * Split rename of migrate_pgmap_owner into a separate patch.
> * Added comments explaining SWP_DEVICE_EXCLUSIVE_* entries.
> * Renamed try_to_protect{_one} to page_make_device_exclusive{_one} based
>   somewhat on a suggestion from Peter Xu. I was never particular

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Peter Xu
On Wed, May 19, 2021 at 10:28:42AM -0300, Jason Gunthorpe wrote: > On Tue, May 18, 2021 at 07:45:05PM -0400, Peter Xu wrote: > > On Tue, May 18, 2021 at 08:03:27PM -0300, Jason Gunthorpe wrote: > > > Logically during fork all these device exclusive pages should be > > >

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Peter Xu
On Wed, May 19, 2021 at 11:11:55PM +1000, Alistair Popple wrote: > On Wednesday, 19 May 2021 10:15:41 PM AEST Peter Xu wrote: > > External email: Use caution opening links or attachments > > > > On Wed, May 19, 2021 at 09:04:53PM +1000, Alistair Popple wrote: > >

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Peter Xu
On Wed, May 19, 2021 at 08:49:01PM +1000, Alistair Popple wrote: > On Wednesday, 19 May 2021 7:16:38 AM AEST Peter Xu wrote: > > External email: Use caution opening links or attachments > > > > > > On Wed, Apr 07, 2021 at 06:42:35PM +1000, Alistair Popple wrote: > &

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Peter Xu
ng about these two solutions and why we choose the GUP way? -- Peter Xu

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Peter Xu
erspace device driver should fork() in the middle of busy interaction with
the device underneath..

In all cases, please still consider keeping them in copy_nonpresent_pte()
(and if reworking, separate patches would be great).

Thanks,

-- Peter Xu

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-18 Thread Peter Xu
respond to the changes in copy_nonpresent_pte()) in this patch to guarantee it. I also hope we don't make copy_pte_range() even more complicated just to do the lock_page() right, so we could fail the fork() if the lock is hard to take. -- Peter Xu

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-18 Thread Peter Xu
_DONTFORK in the user app), meanwhile make sure mapcount(page)==1 before
granting the page to the device, so that this will guarantee this mm owns
this page forever, I think?  It'll simplify fork() too as a side effect,
since VM_DONTCOPY vmas go away on fork.

-- Peter Xu
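[A minimal userspace sketch of the MADV_DONTFORK idea mentioned above;
the buffer setup details are assumptions, illustrative only:]

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
            size_t len = 2UL << 20;
            /* Hypothetical device buffer mapping. */
            void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (buf == MAP_FAILED)
                    return 1;

            /*
             * Children forked after this call won't inherit the mapping,
             * so the device-exclusive pages stay owned by this mm alone.
             */
            if (madvise(buf, len, MADV_DONTFORK))
                    perror("madvise");
            return 0;
    }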

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-18 Thread Peter Xu
On Tue, May 18, 2021 at 04:45:09PM -0300, Jason Gunthorpe wrote: > On Tue, May 18, 2021 at 02:01:36PM -0400, Peter Xu wrote: > > > > Indeed it'll be odd for a COW page since for COW page then it means > > > > after > > > > parent/child writting t

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-18 Thread Peter Xu
On Tue, May 18, 2021 at 02:33:34PM -0300, Jason Gunthorpe wrote: > On Tue, May 18, 2021 at 01:27:42PM -0400, Peter Xu wrote: > > > I also have a pure and high level question regarding a process fork() when > > there're device exclusive ptes: would the two proces

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-18 Thread Peter Xu
).

> > IIUC this is the _only_ reason that we passed in "address" into
> > try_to_protect() too, and further into the ttp_args.
>
> Yes, this is why "address" is passed up to ttp_args.

> > The odd part is the remote GUP should have walked the page ta

Re: [PATCH v8 1/8] mm: Remove special swap entry functions

2021-05-18 Thread Peter Xu
On Tue, May 18, 2021 at 09:58:09PM +1000, Alistair Popple wrote: > On Tuesday, 18 May 2021 12:17:32 PM AEST Peter Xu wrote: > > On Wed, Apr 07, 2021 at 06:42:31PM +1000, Alistair Popple wrote: > > > +static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry) > >

Re: [PATCH v8 1/8] mm: Remove special swap entry functions

2021-05-17 Thread Peter Xu
undergoing migration.
> + */
> +static inline bool is_pfn_swap_entry(swp_entry_t entry)
> +{
> +	return is_migration_entry(entry) || is_device_private_entry(entry);
> +}

-- Peter Xu

Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-17 Thread Peter Xu
 * locking requirements of exec(), migration skips
> +	 * temporary VMAs until after exec() completes.
> +	 */
> +	if (!PageKsm(page) && PageAnon(page))

I think we can drop the PageAnon() check as it's just done above.

I feel like this chunk was copied over from try_to_unmap(), however is that
necessary?  Is it possible that the caller of make_device_exclusive_range()
passes in a temp stack vma during exec()?

> +		rwc.invalid_vma = invalid_migration_vma;
> +
> +	rmap_walk(page, &rwc);
> +
> +	return ttp.valid && !page_mapcount(page);
> +}
> +
> +/**
> + * make_device_exclusive_range() - Mark a range for exclusive use by a device
> + * @mm: mm_struct of assoicated target process
> + * @start: start of the region to mark for exclusive device access
> + * @end: end address of region
> + * @pages: returns the pages which were successfully marked for exclusive access
> + * @arg: passed to MMU_NOTIFY_EXCLUSIVE range notifier too allow filtering

s/too/to/?

> + *
> + * Returns: number of pages successfully marked for exclusive access

Hmm, I see that try_to_protect() can fail even if npages returned from GUP,
so perhaps "number of pages successfully GUPed, however the page is marked
for exclusive access only if the page pointer is non-NULL", or something
like that?

> + *
> + * This function finds ptes mapping page(s) to the given address range, locks
> + * them and replaces mappings with special swap entries preventing userspace CPU

s/userspace//?  As same for kernel access?

(I don't think I fully read all the codes in this patch, but I'll stop here
for today..)

Thanks,

> + * access. On fault these entries are replaced with the original mapping after
> + * calling MMU notifiers.
> + *
> + * A driver using this to program access from a device must use a mmu notifier
> + * critical section to hold a device specific lock during programming. Once
> + * programming is complete it should drop the page lock and reference after
> + * which point CPU access to the page will revoke the exclusive access.
> + */
> +int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> +				unsigned long end, struct page **pages,
> +				void *arg)
> +{
> +	unsigned long npages = (end - start) >> PAGE_SHIFT;
> +	unsigned long i;
> +
> +	npages = get_user_pages_remote(mm, start, npages,
> +				       FOLL_GET | FOLL_WRITE | FOLL_SPLIT_PMD,
> +				       pages, NULL, NULL);
> +	for (i = 0; i < npages; i++, start += PAGE_SIZE) {
> +		if (!trylock_page(pages[i])) {
> +			put_page(pages[i]);
> +			pages[i] = NULL;
> +			continue;
> +		}
> +
> +		if (!try_to_protect(pages[i], mm, start, arg)) {
> +			unlock_page(pages[i]);
> +			put_page(pages[i]);
> +			pages[i] = NULL;
> +		}
> +	}
> +
> +	return npages;
> +}
> +EXPORT_SYMBOL_GPL(make_device_exclusive_range);
> +
>  void __put_anon_vma(struct anon_vma *anon_vma)
>  {
>  	struct anon_vma *root = anon_vma->root;
> --
> 2.20.1
>

-- Peter Xu

Re: [PATCH] mm: do not rely on mm == current->mm in __get_user_pages_locked

2020-09-28 Thread Peter Xu
d0
>  kthread+0x114/0x130
>  ? kthread_create_worker_on_cpu+0x40/0x40
>  ret_from_fork+0x1f/0x30
> CR2: 0064
>
> This commit fixes the problem by using the mm pointer passed to the
> function rather than the bogus one in current.
>
> Fixes: 008cfe44