Re: [PATCH v6 20/26] mm/mlock: Skip ZONE_DEVICE PMDs during mlock

2025-01-16 Thread Alistair Popple
On Mon, Jan 13, 2025 at 06:42:46PM -0800, Dan Williams wrote: > Alistair Popple wrote: > > At present mlock skips ptes mapping ZONE_DEVICE pages. A future change > > to remove pmd_devmap will allow pmd_trans_huge_lock() to return > > ZONE_DEVICE folios so make sure we c

Re: [PATCH v6 19/26] proc/task_mmu: Mark devdax and fsdax pages as always unpinned

2025-01-16 Thread Alistair Popple
On Tue, Jan 14, 2025 at 05:45:46PM +0100, David Hildenbrand wrote: > On 14.01.25 03:28, Dan Williams wrote: > > Alistair Popple wrote: > > > The procfs mmu files such as smaps and pagemap currently ignore devdax and > > > fsdax pages because these pages are consider

Re: [PATCH v6 11/26] mm: Allow compound zone device pages

2025-01-16 Thread Alistair Popple
On Tue, Jan 14, 2025 at 03:59:31PM +0100, David Hildenbrand wrote: > On 10.01.25 07:00, Alistair Popple wrote: > > Zone device pages are used to represent various types of device memory > > managed by device drivers. Currently compound zone device pages are > > not suppo

Re: [PATCH v6 08/26] fs/dax: Remove PAGE_MAPPING_DAX_SHARED mapping flag

2025-01-16 Thread Alistair Popple
On Tue, Jan 14, 2025 at 09:44:38PM -0800, Dan Williams wrote: > Alistair Popple wrote: > [..] > > > How does this case happen? I don't think any page would ever enter with > > > both ->mapping and ->share set, right? > > > > Sigh. You're right

Re: [PATCH v6 23/26] mm: Remove pXX_devmap callers

2025-01-14 Thread Alistair Popple
On Tue, Jan 14, 2025 at 10:50:49AM -0800, Dan Williams wrote: > Alistair Popple wrote: > > The devmap PTE special bit was used to detect mappings of FS DAX > > pages. This tracking was required to ensure the generic mm did not > > manipulate the page reference counts as FS DAX

Re: [PATCH v6 16/26] huge_memory: Add vmf_insert_folio_pmd()

2025-01-14 Thread Alistair Popple
On Tue, Jan 14, 2025 at 09:22:00AM -0800, Dan Williams wrote: > David Hildenbrand wrote: > > > +vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio > > > *folio, bool write) > > > +{ > > > + struct vm_area_struct *vma = vmf->vma; > > > + unsigned long addr = vmf->address & PMD_MASK;

Re: [PATCH v6 15/26] huge_memory: Add vmf_insert_folio_pud()

2025-01-14 Thread Alistair Popple
On Tue, Jan 14, 2025 at 05:22:15PM +0100, David Hildenbrand wrote: > On 10.01.25 07:00, Alistair Popple wrote: > > Currently DAX folio/page reference counts are managed differently to > > normal pages. To allow these to be managed the same as normal pages > > introduce vmf_i

Re: [PATCH v6 13/26] mm/memory: Add vmf_insert_page_mkwrite()

2025-01-14 Thread Alistair Popple
On Tue, Jan 14, 2025 at 05:15:54PM +0100, David Hildenbrand wrote: > On 10.01.25 07:00, Alistair Popple wrote: > > Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This > > creates a special devmap PTE entry for the pfn but does not take a > > reference on

Re: [PATCH v6 12/26] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2025-01-14 Thread Alistair Popple
On Mon, Jan 13, 2025 at 05:08:31PM -0800, Dan Williams wrote: > Alistair Popple wrote: > > In preparation for using insert_page() for DAX, enhance > > insert_page_into_pte_locked() to handle establishing writable > > mappings. Recall that DAX returns VM_FAULT_NOPAGE after

Re: [PATCH v6 08/26] fs/dax: Remove PAGE_MAPPING_DAX_SHARED mapping flag

2025-01-14 Thread Alistair Popple
On Mon, Jan 13, 2025 at 04:52:34PM -0800, Dan Williams wrote: > Alistair Popple wrote: > > PAGE_MAPPING_DAX_SHARED is the same as PAGE_MAPPING_ANON. > > I think a bit more detail is warranted, how about? > > The page ->mapping pointer can have magic values like &

Re: [PATCH v6 07/26] fs/dax: Ensure all pages are idle prior to filesystem unmount

2025-01-12 Thread Alistair Popple
On Sun, Jan 12, 2025 at 06:49:40PM -0800, Darrick J. Wong wrote: > On Mon, Jan 13, 2025 at 11:57:18AM +1100, Alistair Popple wrote: > > On Fri, Jan 10, 2025 at 08:50:19AM -0800, Darrick J. Wong wrote: > > > On Fri, Jan 10, 2025 at 05:00:35PM +1100, Alistair Popple wrote: > &g

Re: [PATCH v6 21/26] fs/dax: Properly refcount fs dax pages

2025-01-12 Thread Alistair Popple
On Fri, Jan 10, 2025 at 08:54:55AM -0800, Darrick J. Wong wrote: > On Fri, Jan 10, 2025 at 05:00:49PM +1100, Alistair Popple wrote: > > Currently fs dax pages are considered free when the refcount drops to > > one and their refcounts are not increased when mapped via PTEs or >

Re: [PATCH v6 00/26] fs/dax: Fix ZONE_DEVICE page reference counts

2025-01-12 Thread Alistair Popple
On Fri, Jan 10, 2025 at 07:35:57PM -0800, Dan Williams wrote: > Andrew Morton wrote: > > On Thu, 9 Jan 2025 23:05:56 -0800 Dan Williams > > wrote: > > > > > > - Remove PTE_DEVMAP definitions from Loongarch which were added since > > > >this series was initially written. > > > [..] > > > >

Re: [PATCH v6 07/26] fs/dax: Ensure all pages are idle prior to filesystem unmount

2025-01-12 Thread Alistair Popple
On Fri, Jan 10, 2025 at 08:50:19AM -0800, Darrick J. Wong wrote: > On Fri, Jan 10, 2025 at 05:00:35PM +1100, Alistair Popple wrote: > > File systems call dax_break_mapping() prior to reallocating file > > system blocks to ensure the page is not undergoing any DMA or other > >

Re: [PATCH v6 05/26] fs/dax: Create a common implementation to break DAX layouts

2025-01-12 Thread Alistair Popple
On Fri, Jan 10, 2025 at 08:44:38AM -0800, Darrick J. Wong wrote: > On Fri, Jan 10, 2025 at 05:00:33PM +1100, Alistair Popple wrote: > > Prior to freeing a block, file systems supporting FS DAX must check > > that the associated pages are both unmapped from user-space and not > &

Re: [PATCH v5 00/25] fs/dax: Fix ZONE_DEVICE page reference counts

2025-01-09 Thread Alistair Popple
On Wed, Jan 08, 2025 at 05:34:30PM -0800, Alison Schofield wrote: > On Tue, Jan 07, 2025 at 02:42:16PM +1100, Alistair Popple wrote: > > Main updates since v4: > > > > - Removed most of the devdax/fsdax checks in fs/proc/task_mmu.c. This > >means smaps/pa

[PATCH v6 25/26] Revert "riscv: mm: Add support for ZONE_DEVICE"

2025-01-09 Thread Alistair Popple
DEVMAP PTEs are no longer required to support ZONE_DEVICE so remove them. Signed-off-by: Alistair Popple Suggested-by: Chunyan Zhang Reviewed-by: Björn Töpel --- arch/riscv/Kconfig| 1 - arch/riscv/include/asm/pgtable-64.h | 20 arch/riscv/include

[PATCH v6 26/26] Revert "LoongArch: Add ARCH_HAS_PTE_DEVMAP support"

2025-01-09 Thread Alistair Popple
DEVMAP PTEs are no longer required to support ZONE_DEVICE so remove them. Signed-off-by: Alistair Popple --- arch/loongarch/Kconfig| 1 - arch/loongarch/include/asm/pgtable-bits.h | 6 ++ arch/loongarch/include/asm/pgtable.h | 19 --- 3 files

[PATCH v6 24/26] mm: Remove devmap related functions and page table bits

2025-01-09 Thread Alistair Popple
Now that DAX and all other reference counts to ZONE_DEVICE pages are managed normally there is no need for the special devmap PTE/PMD/PUD page table bits. So drop all references to these, freeing up a software defined page table bit on architectures supporting it. Signed-off-by: Alistair Popple

[PATCH v6 23/26] mm: Remove pXX_devmap callers

2025-01-09 Thread Alistair Popple
->lru with page->pgmap. Signed-off-by: Alistair Popple --- arch/powerpc/mm/book3s64/hash_hugepage.c | 2 +- arch/powerpc/mm/book3s64/hash_pgtable.c | 3 +- arch/powerpc/mm/book3s64/hugetlbpage.c | 2 +- arch/powerpc/mm/book3s64/pgtable.c | 10 +- arch/powerpc/mm/book3s64

[PATCH v6 22/26] device/dax: Properly refcount device dax pages when mapping

2025-01-09 Thread Alistair Popple
t the pages normally as defined by vm_normal_page(). Signed-off-by: Alistair Popple --- drivers/dax/device.c | 15 +-- mm/memremap.c| 13 ++--- 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 6d74e62..fd

[PATCH v6 21/26] fs/dax: Properly refcount fs dax pages

2025-01-09 Thread Alistair Popple
to remove the pgmap refcounting that is currently done in mm/gup.c. Signed-off-by: Alistair Popple --- Changes since v2: Based on some questions from Dan I attempted to have the FS DAX page cache (ie. address space) hold a reference to the folio whilst it was mapped. However I came to the

[PATCH v6 18/26] mm/gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages

2025-01-09 Thread Alistair Popple
Longterm pinning of FS DAX pages should already be disallowed by various pXX_devmap checks. However a future change will cause these checks to be invalid for FS DAX pages so make folio_is_longterm_pinnable() return false for FS DAX pages. Signed-off-by: Alistair Popple Reviewed-by: John Hubbard

[PATCH v6 20/26] mm/mlock: Skip ZONE_DEVICE PMDs during mlock

2025-01-09 Thread Alistair Popple
At present mlock skips ptes mapping ZONE_DEVICE pages. A future change to remove pmd_devmap will allow pmd_trans_huge_lock() to return ZONE_DEVICE folios so make sure we continue to skip those. Signed-off-by: Alistair Popple Acked-by: David Hildenbrand --- mm/mlock.c | 2 ++ 1 file changed, 2

[PATCH v6 19/26] proc/task_mmu: Mark devdax and fsdax pages as always unpinned

2025-01-09 Thread Alistair Popple
never be pinned for DMA via FOLL_LONGTERM, so add an explicit check in pte_is_pinned() to reflect that. Signed-off-by: Alistair Popple --- Changes for v5: - After discussion with David remove the checks for DAX pages for smaps and pagemap walkers. This means DAX pages will now appear in

[PATCH v6 17/26] memremap: Add is_devdax_page() and is_fsdax_page() helpers

2025-01-09 Thread Alistair Popple
Add helpers to determine if a page or folio is a devdax or fsdax page or folio. Signed-off-by: Alistair Popple Acked-by: David Hildenbrand --- Changes for v5: - Renamed is_device_dax_page() to is_devdax_page() for consistency. --- include/linux/memremap.h | 22 ++ 1 file

[PATCH v6 14/26] rmap: Add support for PUD sized mappings to rmap

2025-01-09 Thread Alistair Popple
PUD-sized folios so we don't support that for now. Signed-off-by: Alistair Popple Acked-by: David Hildenbrand --- Changes for v6: - Minor comment formatting fix - Add an additional check for CONFIG_TRANSPARENT_HUGEPAGE to fix a build breakage when CONFIG_PGTABLE_HAS_HUGE_LEAVES

[PATCH v6 16/26] huge_memory: Add vmf_insert_folio_pmd()

2025-01-09 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pmd, which simply inserts a special devmap PMD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- Changes for v5: - Minor code cleanup suggested by David --- include/linux/huge_mm.h | 1 +- mm

[PATCH v6 15/26] huge_memory: Add vmf_insert_folio_pud()

2025-01-09 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pud, which simply inserts a special devmap PUD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- Changes for v5: - Removed is_huge_zero_pud() as it's unlikely to ever be implemented. -

[PATCH v6 13/26] mm/memory: Add vmf_insert_page_mkwrite()

2025-01-09 Thread Alistair Popple
-off-by: Alistair Popple --- Updates from v2: - Rename function to make it not DAX specific - Split the insert_page_into_pte_locked() change into a separate patch. Updates from v1: - Re-arrange code in insert_page_into_pte_locked() based on comments from Jan Kara. - Call mkdirty/mkyoung

[PATCH v6 12/26] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2025-01-09 Thread Alistair Popple
In preparation for using insert_page() for DAX, enhance insert_page_into_pte_locked() to handle establishing writable mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a PTE which bypasses the typical set_pte_range() in finish_fault. Signed-off-by: Alistair Popple Suggested-by

[PATCH v6 11/26] mm: Allow compound zone device pages

2025-01-09 Thread Alistair Popple
>pgmap. The page->pgmap field is common to all pages within a memory section. Therefore pgmap is the same for both head and tail pages and can be moved into the folio and we can use the standard scheme to find compound_head from a tail page. Signed-off-by: Alistair Popple Reviewed-by: Jas

[PATCH v6 10/26] mm/mm_init: Move p2pdma page refcount initialisation to p2pdma

2025-01-09 Thread Alistair Popple
refcount as required. P2PDMA uses vm_insert_page() to map the page, and that requires a non-zero reference count when initialising the page so set that when the page is first mapped. Signed-off-by: Alistair Popple Reviewed-by: Dan Williams --- Changes since v2: - Initialise the page refcount

[PATCH v6 08/26] fs/dax: Remove PAGE_MAPPING_DAX_SHARED mapping flag

2025-01-09 Thread Alistair Popple
ed. The page is considered shared when page->mapping == NULL and page->share > 0 or page->mapping != NULL, implying it is present in at least one address space. This also makes it easier for a future change to detect when a page is first mapped into an address space which requires spe

[PATCH v6 09/26] mm/gup: Remove redundant check for PCI P2PDMA page

2025-01-09 Thread Alistair Popple
PCI P2PDMA pages are not mapped with pXX_devmap PTEs therefore the check in __gup_device_huge() is redundant. Remove it. Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Reviewed-by: Dan Williams Acked-by: David Hildenbrand --- mm/gup.c | 5 - 1 file changed, 5 deletions

[PATCH v6 07/26] fs/dax: Ensure all pages are idle prior to filesystem unmount

2025-01-09 Thread Alistair Popple
ystem block to be freed will not wait for the remote access to complete. Therefore a busy block may be reallocated to a new file leading to corruption. Signed-off-by: Alistair Popple --- Changes for v5: - Don't wait for pages to be idle in non-DAX mappings --- fs/dax.c

[PATCH v6 06/26] fs/dax: Always remove DAX page-cache entries when breaking layouts

2025-01-09 Thread Alistair Popple
when the file-system calls dax_break_mapping() as part of its truncate operation. This ensures only idle pages can be removed from the FS DAX page-cache and makes it easy to detect if a file-system hasn't called dax_break_mapping() prior to a truncate operation. Signed-off-by:

[PATCH v6 05/26] fs/dax: Create a common implementation to break DAX layouts

2025-01-09 Thread Alistair Popple
: Alistair Popple --- Changes for v5: - Don't wait for idle pages on non-DAX mappings Changes for v4: - Fixed some build breakage due to missing symbol exports reported by John Hubbard (thanks!). --- fs/dax.c| 33 + fs/ext4/inode.c

[PATCH v6 04/26] fs/dax: Refactor wait for dax idle page

2025-01-09 Thread Alistair Popple
A FS DAX page is considered idle when its refcount drops to one. This is currently open-coded in all file systems supporting FS DAX. Move the idle detection to a common function to make future changes easier. Signed-off-by: Alistair Popple Reviewed-by: Jan Kara Reviewed-by: Christoph Hellwig

[PATCH v6 03/26] fs/dax: Don't skip locked entries when scanning entries

2025-01-09 Thread Alistair Popple
to make it clear that it may advance the iterator state. Signed-off-by: Alistair Popple Reviewed-by: Dan Williams --- fs/dax.c | 50 +- 1 file changed, 41 insertions(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 5133568..d010c10 1006

[PATCH v6 02/26] fs/dax: Return unmapped busy pages from dax_layout_busy_page_range()

2025-01-09 Thread Alistair Popple
user-space with mapping_mapped() and returns early if not, skipping the check for DMA busy pages. This is wrong as pages may still be undergoing DMA access even if they have subsequently been unmapped from user-space. Fix this by dropping the check for mapping_mapped(). Signed-off-by: Alistair

[PATCH v6 01/26] fuse: Fix dax truncate/punch_hole fault path

2025-01-09 Thread Alistair Popple
to fuse_dax_break_layouts() which will invalidate the entire file range to dax_layout_busy_page_range(). Signed-off-by: Alistair Popple Co-developed-by: Dan Williams Signed-off-by: Dan Williams Fixes: 6ae330cad6ef ("virtiofs: serialize truncate/punch_hole and dax fault path") Cc: V

[PATCH v6 00/26] fs/dax: Fix ZONE_DEVICE page reference counts

2025-01-09 Thread Alistair Popple
ainly allows further clean-up of the devmap managed functions, but I have left that as a future improvement. It also enables support for compound ZONE_DEVICE pages which is one of my primary motivators for doing this work. Signed-off-by: Alistair Popple Tested-by: Alison Schofield --- Cc: l...@asahi

Re: [PATCH v5 05/25] fs/dax: Create a common implementation to break DAX layouts

2025-01-08 Thread Alistair Popple
On Wed, Jan 08, 2025 at 04:14:20PM -0800, Dan Williams wrote: > Alistair Popple wrote: > > Prior to freeing a block, file systems supporting FS DAX must check > > that the associated pages are both unmapped from user-space and not > > undergoing DMA or other access from eg. g

Re: [PATCH v5 03/25] fs/dax: Don't skip locked entries when scanning entries

2025-01-08 Thread Alistair Popple
On Wed, Jan 08, 2025 at 02:50:36PM -0800, Dan Williams wrote: > Alistair Popple wrote: > > Several functions internal to FS DAX use the following pattern when > > trying to obtain an unlocked entry: > > > > xas_for_each(&xas, entry, end_idx) {

Re: [PATCH v5 01/25] fuse: Fix dax truncate/punch_hole fault path

2025-01-08 Thread Alistair Popple
On Wed, Jan 08, 2025 at 02:30:24PM -0800, Dan Williams wrote: > Alistair Popple wrote: > > FS DAX requires file systems to call into the DAX layout prior to > > unlinking inodes to ensure there is no ongoing DMA or other remote > > access to the direct mapped page. The fuse f

[PATCH v5 14/25] rmap: Add support for PUD sized mappings to rmap

2025-01-06 Thread Alistair Popple
PUD-sized folios so we don't support that for now. Signed-off-by: Alistair Popple --- Changes for v5: - Fixed accounting as suggested by David. Changes for v4: - New for v4, split out rmap changes as suggested by David. --- include/linux/rmap.h | 15 ++- mm/rmap.c

[PATCH v5 15/25] huge_memory: Add vmf_insert_folio_pud()

2025-01-06 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pud, which simply inserts a special devmap PUD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- Changes for v5: - Removed is_huge_zero_pud() as it's unlikely to ever be implemented. -

[PATCH v5 04/25] fs/dax: Refactor wait for dax idle page

2025-01-06 Thread Alistair Popple
A FS DAX page is considered idle when its refcount drops to one. This is currently open-coded in all file systems supporting FS DAX. Move the idle detection to a common function to make future changes easier. Signed-off-by: Alistair Popple Reviewed-by: Jan Kara Reviewed-by: Christoph Hellwig

[PATCH v5 03/25] fs/dax: Don't skip locked entries when scanning entries

2025-01-06 Thread Alistair Popple
to make it clear that it may advance the iterator state. Signed-off-by: Alistair Popple --- fs/dax.c | 50 +- 1 file changed, 41 insertions(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 5133568..d010c10 100644 --- a/fs/dax.c +++ b/

[PATCH v5 25/25] Revert "riscv: mm: Add support for ZONE_DEVICE"

2025-01-06 Thread Alistair Popple
DEVMAP PTEs are no longer required to support ZONE_DEVICE so remove them. Signed-off-by: Alistair Popple Suggested-by: Chunyan Zhang Reviewed-by: Björn Töpel --- arch/riscv/Kconfig| 1 - arch/riscv/include/asm/pgtable-64.h | 20 arch/riscv/include

[PATCH v5 24/25] mm: Remove devmap related functions and page table bits

2025-01-06 Thread Alistair Popple
Now that DAX and all other reference counts to ZONE_DEVICE pages are managed normally there is no need for the special devmap PTE/PMD/PUD page table bits. So drop all references to these, freeing up a software defined page table bit on architectures supporting it. Signed-off-by: Alistair Popple

[PATCH v5 23/25] mm: Remove pXX_devmap callers

2025-01-06 Thread Alistair Popple
->lru with page->pgmap. Signed-off-by: Alistair Popple --- arch/powerpc/mm/book3s64/hash_pgtable.c | 3 +- arch/powerpc/mm/book3s64/pgtable.c | 8 +- arch/powerpc/mm/book3s64/radix_pgtable.c | 5 +- arch/powerpc/mm/pgtable.c|

[PATCH v5 21/25] fs/dax: Properly refcount fs dax pages

2025-01-06 Thread Alistair Popple
to remove the pgmap refcounting that is currently done in mm/gup.c. Signed-off-by: Alistair Popple --- Changes since v2: Based on some questions from Dan I attempted to have the FS DAX page cache (ie. address space) hold a reference to the folio whilst it was mapped. However I came to the

[PATCH v5 22/25] device/dax: Properly refcount device dax pages when mapping

2025-01-06 Thread Alistair Popple
t the pages normally as defined by vm_normal_page(). Signed-off-by: Alistair Popple --- drivers/dax/device.c | 15 +-- mm/memremap.c| 13 ++--- 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 6d74e62..fd

[PATCH v5 20/25] mm/mlock: Skip ZONE_DEVICE PMDs during mlock

2025-01-06 Thread Alistair Popple
At present mlock skips ptes mapping ZONE_DEVICE pages. A future change to remove pmd_devmap will allow pmd_trans_huge_lock() to return ZONE_DEVICE folios so make sure we continue to skip those. Signed-off-by: Alistair Popple Acked-by: David Hildenbrand --- mm/mlock.c | 2 ++ 1 file changed, 2

[PATCH v5 19/25] proc/task_mmu: Mark devdax and fsdax pages as always unpinned

2025-01-06 Thread Alistair Popple
never be pinned for DMA via FOLL_LONGTERM, so add an explicit check in pte_is_pinned() to reflect that. Signed-off-by: Alistair Popple --- Changes for v5: - After discussion with David remove the checks for DAX pages for smaps and pagemap walkers. This means DAX pages will now appear in

[PATCH v5 18/25] mm/gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages

2025-01-06 Thread Alistair Popple
Longterm pinning of FS DAX pages should already be disallowed by various pXX_devmap checks. However a future change will cause these checks to be invalid for FS DAX pages so make folio_is_longterm_pinnable() return false for FS DAX pages. Signed-off-by: Alistair Popple Reviewed-by: John Hubbard

[PATCH v5 17/25] memremap: Add is_devdax_page() and is_fsdax_page() helpers

2025-01-06 Thread Alistair Popple
Add helpers to determine if a page or folio is a devdax or fsdax page or folio. Signed-off-by: Alistair Popple Acked-by: David Hildenbrand --- Changes for v5: - Renamed is_device_dax_page() to is_devdax_page() for consistency. --- include/linux/memremap.h | 22 ++ 1 file

[PATCH v5 16/25] huge_memory: Add vmf_insert_folio_pmd()

2025-01-06 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pmd, which simply inserts a special devmap PMD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- Changes for v5: - Minor code cleanup suggested by David --- include/linux/huge_mm.h | 1 +- mm

[PATCH v5 13/25] mm/memory: Add vmf_insert_page_mkwrite()

2025-01-06 Thread Alistair Popple
-off-by: Alistair Popple --- Updates from v2: - Rename function to make it not DAX specific - Split the insert_page_into_pte_locked() change into a separate patch. Updates from v1: - Re-arrange code in insert_page_into_pte_locked() based on comments from Jan Kara. - Call mkdirty/mkyoung

[PATCH v5 12/25] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2025-01-06 Thread Alistair Popple
In preparation for using insert_page() for DAX, enhance insert_page_into_pte_locked() to handle establishing writable mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a PTE which bypasses the typical set_pte_range() in finish_fault. Signed-off-by: Alistair Popple Suggested-by

[PATCH v5 11/25] mm: Allow compound zone device pages

2025-01-06 Thread Alistair Popple
>pgmap. The page->pgmap field is common to all pages within a memory section. Therefore pgmap is the same for both head and tail pages and can be moved into the folio and we can use the standard scheme to find compound_head from a tail page. Signed-off-by: Alistair Popple Reviewed-by: Jas

[PATCH v5 10/25] mm/mm_init: Move p2pdma page refcount initialisation to p2pdma

2025-01-06 Thread Alistair Popple
refcount as required. P2PDMA uses vm_insert_page() to map the page, and that requires a non-zero reference count when initialising the page so set that when the page is first mapped. Signed-off-by: Alistair Popple Reviewed-by: Dan Williams --- Changes since v2: - Initialise the page refcount

[PATCH v5 09/25] mm/gup: Remove redundant check for PCI P2PDMA page

2025-01-06 Thread Alistair Popple
PCI P2PDMA pages are not mapped with pXX_devmap PTEs therefore the check in __gup_device_huge() is redundant. Remove it. Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Reviewed-by: Dan Williams Acked-by: David Hildenbrand --- mm/gup.c | 5 - 1 file changed, 5 deletions

[PATCH v5 08/25] fs/dax: Remove PAGE_MAPPING_DAX_SHARED mapping flag

2025-01-06 Thread Alistair Popple
ed. The page is considered shared when page->mapping == NULL and page->share > 0 or page->mapping != NULL, implying it is present in at least one address space. This also makes it easier for a future change to detect when a page is first mapped into an address space which requires spe

[PATCH v5 07/25] fs/dax: Ensure all pages are idle prior to filesystem unmount

2025-01-06 Thread Alistair Popple
ystem block to be freed will not wait for the remote access to complete. Therefore a busy block may be reallocated to a new file leading to corruption. Signed-off-by: Alistair Popple --- Changes for v5: - Don't wait for pages to be idle in non-DAX mappings --- fs/dax.c

[PATCH v5 06/25] fs/dax: Always remove DAX page-cache entries when breaking layouts

2025-01-06 Thread Alistair Popple
when the file-system calls dax_break_mapping() as part of its truncate operation. This ensures only idle pages can be removed from the FS DAX page-cache and makes it easy to detect if a file-system hasn't called dax_break_mapping() prior to a truncate operation. Signed-off-by:

[PATCH v5 05/25] fs/dax: Create a common implementation to break DAX layouts

2025-01-06 Thread Alistair Popple
: Alistair Popple --- Changes for v5: - Don't wait for idle pages on non-DAX mappings Changes for v4: - Fixed some build breakage due to missing symbol exports reported by John Hubbard (thanks!). --- fs/dax.c| 33 + fs/ext4/inode.c

[PATCH v5 02/25] fs/dax: Return unmapped busy pages from dax_layout_busy_page_range()

2025-01-06 Thread Alistair Popple
user-space with mapping_mapped() and returns early if not, skipping the check for DMA busy pages. This is wrong as pages may still be undergoing DMA access even if they have subsequently been unmapped from user-space. Fix this by dropping the check for mapping_mapped(). Signed-off-by: Alistair

[PATCH v5 01/25] fuse: Fix dax truncate/punch_hole fault path

2025-01-06 Thread Alistair Popple
== 0 in fuse_dax_break_layouts() and pass the entire file range to dax_layout_busy_page_range(). Signed-off-by: Alistair Popple Fixes: 6ae330cad6ef ("virtiofs: serialize truncate/punch_hole and dax fault path") Cc: Vivek Goyal --- I am not at all familiar with the fuse file system d

[PATCH v5 00/25] fs/dax: Fix ZONE_DEVICE page reference counts

2025-01-06 Thread Alistair Popple
p of the devmap managed functions, but I have left that as a future improvement. It also enables support for compound ZONE_DEVICE pages which is one of my primary motivators for doing this work. Signed-off-by: Alistair Popple --- Cc: l...@asahilina.net Cc: zhang.l...@gmail.com Cc: gerald.schae...@l

Re: [PATCH v4 19/25] proc/task_mmu: Ignore ZONE_DEVICE pages

2025-01-05 Thread Alistair Popple
On Fri, Dec 20, 2024 at 07:32:52PM +0100, David Hildenbrand wrote: > On 19.12.24 00:11, Alistair Popple wrote: > > On Tue, Dec 17, 2024 at 11:31:25PM +0100, David Hildenbrand wrote: > > > On 17.12.24 06:13, Alistair Popple wrote: > > > > The procfs mmu files such as

Re: [PATCH v4 15/25] huge_memory: Add vmf_insert_folio_pud()

2025-01-05 Thread Alistair Popple
On Fri, Dec 20, 2024 at 07:52:43PM +0100, David Hildenbrand wrote: > On 17.12.24 06:12, Alistair Popple wrote: > > Currently DAX folio/page reference counts are managed differently to > > normal pages. To allow these to be managed the same as normal pages > > introduce vmf_i

Re: [PATCH v4 12/25] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2025-01-05 Thread Alistair Popple
On Fri, Dec 20, 2024 at 08:06:48PM +0100, David Hildenbrand wrote: > On 20.12.24 20:01, David Hildenbrand wrote: > > On 17.12.24 06:12, Alistair Popple wrote: > > > In preparation for using insert_page() for DAX, enhance > > > insert_page_into_pte_locked() to h

Re: [PATCH v4 19/25] proc/task_mmu: Ignore ZONE_DEVICE pages

2024-12-18 Thread Alistair Popple
On Tue, Dec 17, 2024 at 11:31:25PM +0100, David Hildenbrand wrote: > On 17.12.24 06:13, Alistair Popple wrote: > > The procfs mmu files such as smaps currently ignore device dax and fs > > dax pages because these pages are considered special. To maintain > > existing behaviour

Re: [PATCH v4 14/25] rmap: Add support for PUD sized mappings to rmap

2024-12-18 Thread Alistair Popple
On Tue, Dec 17, 2024 at 11:27:13PM +0100, David Hildenbrand wrote: > On 17.12.24 06:12, Alistair Popple wrote: > > The rmap doesn't currently support adding a PUD mapping of a > > folio. This patch adds support for entire PUD mappings of folios, > > primarily to allow for

Re: [PATCH v4 10/25] mm/mm_init: Move p2pdma page refcount initialisation to p2pdma

2024-12-18 Thread Alistair Popple
On Tue, Dec 17, 2024 at 11:14:42PM +0100, David Hildenbrand wrote: > On 17.12.24 06:12, Alistair Popple wrote: > > Currently ZONE_DEVICE page reference counts are initialised by core > > memory management code in __init_zone_device_page() as part of the > > memremap() call

[PATCH v4 07/25] fs/dax: Ensure all pages are idle prior to filesystem unmount

2024-12-16 Thread Alistair Popple
ystem block to be freed will not wait for the remote access to complete. Therefore a busy block may be reallocated to a new file leading to corruption. Signed-off-by: Alistair Popple --- fs/dax.c| 26 ++ fs/ext4/inode.c | 32 ++-

Re: [PATCH v3 00/25] fs/dax: Fix ZONE_DEVICE page reference counts

2024-12-16 Thread Alistair Popple
On Sun, Dec 15, 2024 at 10:26:55PM -0800, Andrew Morton wrote: > On Mon, 16 Dec 2024 11:55:30 +1100 Alistair Popple wrote: > > > The remainder are more -mm focussed. However they do depend on the fs/dax > > cleanups in the first half so the trick would be making sure Andr

[PATCH v4 25/25] Revert "riscv: mm: Add support for ZONE_DEVICE"

2024-12-16 Thread Alistair Popple
DEVMAP PTEs are no longer required to support ZONE_DEVICE so remove them. Signed-off-by: Alistair Popple Suggested-by: Chunyan Zhang Reviewed-by: Björn Töpel --- arch/riscv/Kconfig| 1 - arch/riscv/include/asm/pgtable-64.h | 20 arch/riscv/include

[PATCH v4 24/25] mm: Remove devmap related functions and page table bits

2024-12-16 Thread Alistair Popple
Now that DAX and all other reference counts to ZONE_DEVICE pages are managed normally there is no need for the special devmap PTE/PMD/PUD page table bits. So drop all references to these, freeing up a software defined page table bit on architectures supporting it. Signed-off-by: Alistair Popple

[PATCH v4 23/25] mm: Remove pXX_devmap callers

2024-12-16 Thread Alistair Popple
->lru with page->pgmap. Signed-off-by: Alistair Popple --- arch/powerpc/mm/book3s64/hash_pgtable.c | 3 +- arch/powerpc/mm/book3s64/pgtable.c | 8 +- arch/powerpc/mm/book3s64/radix_pgtable.c | 5 +- arch/powerpc/mm/pgtable.c|

[PATCH v4 22/25] device/dax: Properly refcount device dax pages when mapping

2024-12-16 Thread Alistair Popple
t the pages normally as defined by vm_normal_page(). Signed-off-by: Alistair Popple --- drivers/dax/device.c | 15 +-- mm/memremap.c| 13 ++--- 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 6d74e62..fd

[PATCH v4 21/25] fs/dax: Properly refcount fs dax pages

2024-12-16 Thread Alistair Popple
to remove the pgmap refcounting that is currently done in mm/gup.c. Signed-off-by: Alistair Popple --- Changes since v2: Based on some questions from Dan I attempted to have the FS DAX page cache (ie. address space) hold a reference to the folio whilst it was mapped. However I came to the

[PATCH v4 20/25] mm/mlock: Skip ZONE_DEVICE PMDs during mlock

2024-12-16 Thread Alistair Popple
At present mlock skips ptes mapping ZONE_DEVICE pages. A future change to remove pmd_devmap will allow pmd_trans_huge_lock() to return ZONE_DEVICE folios so make sure we continue to skip those. Signed-off-by: Alistair Popple --- mm/mlock.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a

[PATCH v4 19/25] proc/task_mmu: Ignore ZONE_DEVICE pages

2024-12-16 Thread Alistair Popple
The procfs mmu files such as smaps currently ignore device dax and fs dax pages because these pages are considered special. To maintain existing behaviour once these pages are treated as normal pages and returned from vm_normal_page() add tests to explicitly skip them. Signed-off-by: Alistair

[PATCH v4 18/25] gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages

2024-12-16 Thread Alistair Popple
Longterm pinning of FS DAX pages should already be disallowed by various pXX_devmap checks. However a future change will cause these checks to be invalid for FS DAX pages so make folio_is_longterm_pinnable() return false for FS DAX pages. Signed-off-by: Alistair Popple Reviewed-by: John Hubbard

[PATCH v4 17/25] memremap: Add is_device_dax_page() and is_fsdax_page() helpers

2024-12-16 Thread Alistair Popple
Add helpers to determine if a page or folio is a device dax or fs dax page or folio. Signed-off-by: Alistair Popple --- include/linux/memremap.h | 22 ++ 1 file changed, 22 insertions(+) diff --git a/include/linux/memremap.h b/include/linux/memremap.h index 0256a42..f2a8d13

[PATCH v4 16/25] huge_memory: Add vmf_insert_folio_pmd()

2024-12-16 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pmd, which simply inserts a special devmap PMD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- include/linux/huge_mm.h | 1 +- mm/huge_memory.c| 60

[PATCH v4 15/25] huge_memory: Add vmf_insert_folio_pud()

2024-12-16 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pud, which simply inserts a special devmap PUD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- include/linux/huge_mm.h | 11 +- mm/huge_memory.c| 96

[PATCH v4 14/25] rmap: Add support for PUD sized mappings to rmap

2024-12-16 Thread Alistair Popple
PUD-sized folios so we don't support that for now. Signed-off-by: Alistair Popple --- David - Thanks for your previous comments, I'm less familiar with the rmap code so I would appreciate you taking another look. In particular I haven't added a stat for PUD mapped folios a

[PATCH v4 12/25] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2024-12-16 Thread Alistair Popple
In preparation for using insert_page() for DAX, enhance insert_page_into_pte_locked() to handle establishing writable mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a PTE which bypasses the typical set_pte_range() in finish_fault. Signed-off-by: Alistair Popple Suggested-by

[PATCH v4 13/25] mm/memory: Add vmf_insert_page_mkwrite()

2024-12-16 Thread Alistair Popple
-off-by: Alistair Popple --- Updates from v2: - Rename function to make it not DAX specific - Split the insert_page_into_pte_locked() change into a separate patch. Updates from v1: - Re-arrange code in insert_page_into_pte_locked() based on comments from Jan Kara. - Call mkdirty/mkyoung

[PATCH v4 11/25] mm: Allow compound zone device pages

2024-12-16 Thread Alistair Popple
>pgmap. The page->pgmap field is common to all pages within a memory section. Therefore pgmap is the same for both head and tail pages and can be moved into the folio and we can use the standard scheme to find compound_head from a tail page. Signed-off-by: Alistair Popple Reviewed-by: Jas

[PATCH v4 10/25] mm/mm_init: Move p2pdma page refcount initialisation to p2pdma

2024-12-16 Thread Alistair Popple
refcount as required. P2PDMA uses vm_insert_page() to map the page, and that requires a non-zero reference count when initialising the page so set that when the page is first mapped. Signed-off-by: Alistair Popple Reviewed-by: Dan Williams --- Changes since v2: - Initialise the page refcount

[PATCH v4 09/25] mm/gup.c: Remove redundant check for PCI P2PDMA page

2024-12-16 Thread Alistair Popple
PCI P2PDMA pages are not mapped with pXX_devmap PTEs, therefore the check in __gup_device_huge() is redundant. Remove it. Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Reviewed-by: Dan Williams Acked-by: David Hildenbrand --- mm/gup.c | 5 - 1 file changed, 5 deletions

[PATCH v4 08/25] fs/dax: Remove PAGE_MAPPING_DAX_SHARED mapping flag

2024-12-16 Thread Alistair Popple
ed. The page is considered shared when page->mapping == NULL and page->share > 0 or page->mapping != NULL, implying it is present in at least one address space. This also makes it easier for a future change to detect when a page is first mapped into an address space which requires spe
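
The condition quoted in this snippet can be modelled in plain user-space C. This is only a sketch of a literal reading of the sentence above, not the code from the patch: `struct model_page` and `model_dax_page_is_shared()` are hypothetical names invented for illustration, and the real kernel `struct page` is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the two fields named in the snippet.
 * This is NOT the kernel's struct page.
 */
struct model_page {
	void *mapping;        /* NULL when not attached to an address space */
	unsigned long share;  /* share count described in the snippet */
};

/*
 * Literal reading of the quoted rule (an assumption, since the
 * snippet is truncated):
 *   (mapping == NULL && share > 0) || mapping != NULL
 */
static bool model_dax_page_is_shared(const struct model_page *page)
{
	if (page->mapping != NULL)
		return true;
	return page->share > 0;
}
```

Under this reading, only a page with neither a mapping nor a positive share count falls outside the condition, which would fit the snippet's remark about detecting when a page is first mapped into an address space.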

[PATCH v4 06/25] fs/dax: Always remove DAX page-cache entries when breaking layouts

2024-12-16 Thread Alistair Popple
when the file-system calls dax_break_mapping() as part of its truncate operation. This ensures only idle pages can be removed from the FS DAX page-cache and makes it easy to detect if a file-system hasn't called dax_break_mapping() prior to a truncate operation. Signed-off-by:

[PATCH v4 05/25] fs/dax: Create a common implementation to break DAX layouts

2024-12-16 Thread Alistair Popple
: Alistair Popple --- Changes for v4: - Fixed some build breakage due to missing symbol exports reported by John Hubbard (thanks!). --- fs/dax.c| 30 ++ fs/ext4/inode.c | 10 +- fs/fuse/dax.c | 29 + fs/xfs
