Re: [PATCH 04/12] mm: Allow compound zone device pages

2024-09-09 Thread Matthew Wilcox
On Tue, Sep 10, 2024 at 02:14:29PM +1000, Alistair Popple wrote:
> @@ -337,6 +341,7 @@ struct folio {
> /* private: */
> };
> /* public: */
> + struct dev_pagemap *pgmap;
Shouldn't that be indented by one more tab stop? And for ease of

[PATCH 11/12] mm: Remove pXX_devmap callers

2024-09-09 Thread Alistair Popple
The devmap PTE special bit was used to detect mappings of FS DAX pages. This tracking was required to ensure the generic mm did not manipulate the page reference counts as FS DAX implemented its own reference counting scheme. Now that FS DAX pages have their references counted the same way as nor

[PATCH 12/12] mm: Remove devmap related functions and page table bits

2024-09-09 Thread Alistair Popple
Now that DAX and all other reference counts to ZONE_DEVICE pages are managed normally there is no need for the special devmap PTE/PMD/PUD page table bits. So drop all references to these, freeing up a software defined page table bit on architectures supporting it. Signed-off-by: Alistair Popple A

[PATCH 10/12] fs/dax: Properly refcount fs dax pages

2024-09-09 Thread Alistair Popple
Currently fs dax pages are considered free when the refcount drops to one and their refcounts are not increased when mapped via PTEs or decreased when unmapped. This requires special logic in mm paths to detect that these pages should not be properly refcounted, and to detect when the refcount drop

[PATCH 09/12] mm: Update vm_normal_page() callers to accept FS DAX pages

2024-09-09 Thread Alistair Popple
Currently if a PTE points to a FS DAX page vm_normal_page() will return NULL as these have their own special refcounting scheme. A future change will allow FS DAX pages to be refcounted the same as any other normal page. Therefore vm_normal_page() will start returning FS DAX pages. To avoid any ch

[PATCH 08/12] gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages

2024-09-09 Thread Alistair Popple
Longterm pinning of FS DAX pages should already be disallowed by various pXX_devmap checks. However a future change will cause these checks to be invalid for FS DAX pages so make folio_is_longterm_pinnable() return false for FS DAX pages. Signed-off-by: Alistair Popple --- include/linux/memremap

[PATCH 04/12] mm: Allow compound zone device pages

2024-09-09 Thread Alistair Popple
Zone device pages are used to represent various types of device memory managed by device drivers. Currently compound zone device pages are not supported. This is because MEMORY_DEVICE_FS_DAX pages are the only user of higher order zone device pages and have their own page reference counting. A futu

[PATCH 07/12] huge_memory: Allow mappings of PMD sized pages

2024-09-09 Thread Alistair Popple
Currently DAX folio/page reference counts are managed differently to normal pages. To allow these to be managed the same as normal pages introduce dax_insert_pfn_pmd. This will map the entire PMD-sized folio and take references as it would for a normally mapped page. This is distinct from the curr

[PATCH 06/12] huge_memory: Allow mappings of PUD sized pages

2024-09-09 Thread Alistair Popple
Currently DAX folio/page reference counts are managed differently to normal pages. To allow these to be managed the same as normal pages introduce dax_insert_pfn_pud. This will map the entire PUD-sized folio and take references as it would for a normally mapped page. This is distinct from the curr

[PATCH 05/12] mm/memory: Add dax_insert_pfn

2024-09-09 Thread Alistair Popple
Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This creates a special devmap PTE entry for the pfn but does not take a reference on the underlying struct page for the mapping. This is because DAX page refcounts are treated specially, as indicated by the presence of a devmap entry.
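
To make the distinction in this description concrete, here is a small user-space C sketch contrasting a mapping that takes no page reference with one that does. It is only a toy model: `struct toy_mapped_page`, its `mapcount` field, and all `toy_*` functions are hypothetical names invented for illustration, and nothing here is taken from the kernel or from the patch itself, whose details are cut off above.

```c
#include <assert.h>

/* Hypothetical stand-in for a page; NOT the kernel's struct page. */
struct toy_mapped_page {
    int refcount;   /* references held on the page */
    int mapcount;   /* illustrative count of mappings, an assumption of this sketch */
};

/* Models the behaviour described for vmf_insert_pfn():
 * the mapping is recorded but no reference is taken. */
static void toy_map_without_ref(struct toy_mapped_page *page)
{
    page->mapcount++;
}

/* Models a normally refcounted mapping: each mapping holds a reference. */
static void toy_map_with_ref(struct toy_mapped_page *page)
{
    page->refcount++;
    page->mapcount++;
}

/* Undo toy_map_with_ref(): drop the mapping and its reference. */
static void toy_unmap_with_ref(struct toy_mapped_page *page)
{
    page->mapcount--;
    page->refcount--;
}

/* Small self-check showing how the two styles differ. */
static void toy_map_demo(void)
{
    struct toy_mapped_page special = { .refcount = 1, .mapcount = 0 };
    struct toy_mapped_page normal = { .refcount = 1, .mapcount = 0 };

    toy_map_without_ref(&special);
    assert(special.refcount == 1);  /* the mapping is invisible to the refcount */

    toy_map_with_ref(&normal);
    assert(normal.refcount == 2);   /* the mapping pins the page */

    toy_unmap_with_ref(&normal);
    assert(normal.refcount == 1 && normal.mapcount == 0);
}
```

In the first style, code that inspects only the refcount cannot tell that the page is mapped, which is why the description says such pages have to be treated specially.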

[PATCH 03/12] fs/dax: Refactor wait for dax idle page

2024-09-09 Thread Alistair Popple
A FS DAX page is considered idle when its refcount drops to one. This is currently open-coded in all file systems supporting FS DAX. Move the idle detection to a common function to make future changes easier. Signed-off-by: Alistair Popple Reviewed-by: Jan Kara Reviewed-by: Christoph Hellwig --
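
As a rough illustration of the idle rule stated in this description, the following user-space C sketch puts the "refcount of one means idle" test behind a single helper rather than repeating it at each call site. `struct toy_page`, `toy_dax_page_is_idle()` and `toy_idle_demo()` are hypothetical names made up for this sketch; it is not the common function the patch introduces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page; NOT the kernel's struct page. */
struct toy_page {
    int refcount;
};

/* One shared check instead of an open-coded test in every file system.
 * Assumption from the description: a page is idle at refcount one. */
static bool toy_dax_page_is_idle(const struct toy_page *page)
{
    return page->refcount == 1;
}

/* Small self-check exercising the helper. */
static void toy_idle_demo(void)
{
    struct toy_page page = { .refcount = 1 };

    assert(toy_dax_page_is_idle(&page));

    page.refcount++;    /* some user takes a reference: page is busy */
    assert(!toy_dax_page_is_idle(&page));

    page.refcount--;    /* reference dropped: page is idle again */
    assert(toy_dax_page_is_idle(&page));
}
```

Keeping the test in one place means that a later change to the idle condition only has to touch the helper.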

[PATCH 01/12] mm/gup.c: Remove redundant check for PCI P2PDMA page

2024-09-09 Thread Alistair Popple
PCI P2PDMA pages are not mapped with pXX_devmap PTEs, therefore the check in __gup_device_huge() is redundant. Remove it. Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Acked-by: David Hildenbrand --- mm/gup.c | 5 - 1 file changed, 5 deletions(-) diff --git a/mm/gup.c b/mm/gup

[PATCH 02/12] pci/p2pdma: Don't initialise page refcount to one

2024-09-09 Thread Alistair Popple
The reference counts for ZONE_DEVICE private pages should be initialised by the driver when the page is actually allocated by the driver allocator, not when they are first created. This is currently the case for MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT pages but not MEMORY_DEVICE_PCI_P2PDMA
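
The idea of initialising the refcount at allocation time rather than at creation time can be sketched in user-space C as follows. This is a toy model under stated assumptions: `struct toy_dev_page`, `struct toy_pool`, `TOY_NPAGES` and the `toy_pool_*` functions are hypothetical and do not correspond to any kernel or driver API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NPAGES 4    /* arbitrary pool size for the sketch */

/* Hypothetical stand-in for a device page; NOT the kernel's struct page. */
struct toy_dev_page {
    int refcount;
    int in_use;
};

struct toy_pool {
    struct toy_dev_page pages[TOY_NPAGES];
};

/* Creation: pages exist but are unallocated, so their refcount stays zero. */
static void toy_pool_create(struct toy_pool *pool)
{
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        pool->pages[i].refcount = 0;
        pool->pages[i].in_use = 0;
    }
}

/* Driver allocator: the refcount is set to one only when a page is handed out.
 * Returns NULL when every page is already in use. */
static struct toy_dev_page *toy_pool_alloc(struct toy_pool *pool)
{
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        struct toy_dev_page *page = &pool->pages[i];

        if (!page->in_use) {
            page->in_use = 1;
            page->refcount = 1;
            return page;
        }
    }
    return NULL;
}

/* Small self-check of the two phases. */
static void toy_pool_demo(void)
{
    struct toy_pool pool;

    toy_pool_create(&pool);
    assert(pool.pages[0].refcount == 0);    /* created, not yet allocated */

    struct toy_dev_page *page = toy_pool_alloc(&pool);
    assert(page != NULL && page->refcount == 1);
}
```

With this split, a refcount of zero consistently means "not allocated by the driver", whichever kind of device page is involved.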

[PATCH 00/12] fs/dax: Fix FS DAX page reference counts

2024-09-09 Thread Alistair Popple
Main updates since v1: - Now passes the same number of xfs_test with dax=always as without this series (some seem to fail on my setup normally). Thanks Dave for the suggestion as there were some deadlocks/crashes in v1 due to mishandling of write-protect faults and truncation which shou