Re: [PATCH v9 12/26] powerpc/powernv/ioda1: M64 support on P7IOC

2016-05-03 Thread Alistair Popple
On Tue, 3 May 2016 15:41:31 Gavin Shan wrote: > This enables M64 window on P7IOC, which has been enabled on PHB3. Have we tested that this works with an adaptor? This looks to be enabling support for something that didn't previously work (64-bit BARs on P7IOC)? Regards, Alistair > Different fr

Re: [PATCH v9 12/26] powerpc/powernv/ioda1: M64 support on P7IOC

2016-05-04 Thread Alistair Popple
On Wed, 4 May 2016 16:48:53 Gavin Shan wrote: > On Wed, May 04, 2016 at 03:17:51PM +1000, Alistair Popple wrote: > >On Tue, 3 May 2016 15:41:31 Gavin Shan wrote: > >> This enables M64 window on P7IOC, which has been enabled on PHB3. > > > >Have we tested that this wor

Re: [PATCH v9 12/26] powerpc/powernv/ioda1: M64 support on P7IOC

2016-05-04 Thread Alistair Popple
Thanks for the clarifications Gavin. Aside from the WARN_ON() (which is not a major thing) everything looks good. Reviewed-By: Alistair Popple On Thu, 5 May 2016 10:40:33 Gavin Shan wrote: > On Thu, May 05, 2016 at 09:53:51AM +1000, Alistair Popple wrote: > >On Wed, 4 May 2016 16:48

Re: [PATCH kernel v4 11/11] powerpc/powernv/npu: Enable NVLink pass through

2016-05-05 Thread Alistair Popple
On Thu, 5 May 2016 15:49:18 Alexey Kardashevskiy wrote: > On 05/04/2016 12:08 AM, Alistair Popple wrote: > > Hi Alexey, > > > > On Fri, 29 Apr 2016 18:55:24 Alexey Kardashevskiy wrote: > >> IBM POWER8 NVlink systems come with Tesla K40-ish GPUs each of which >

Re: [PATCH kernel v4 10/11] powerpc/powernv/npu: Rework TCE Kill handling

2016-05-05 Thread Alistair Popple
On Thu, 5 May 2016 14:23:20 Alexey Kardashevskiy wrote: > On 05/03/2016 05:37 PM, Alistair Popple wrote: > > On Fri, 29 Apr 2016 18:55:23 Alexey Kardashevskiy wrote: > >> The pnv_ioda_pe struct keeps an array of peers. At the moment it is only > >> used to link

Re: [PATCH v9 15/22] powerpc/powernv: Functions to get/set PCI slot state

2016-05-10 Thread Alistair Popple
Gavin, On Tue, 3 May 2016 23:22:46 Gavin Shan wrote: > This exports 4 functions, which are based on the corresponding OPAL > APIs to get/set PCI slot status. Those functions are going to > be used by the PowerNV PCI hotplug driver: > >pnv_pci_get_device_tree() opal_get_device_tree() >pnv_pci_ge

[PATCH] powerpc/powernv: Initialise nest mmu

2016-08-14 Thread Alistair Popple
address of the partition table (ie. the PTCR) which needs to be programmed into the NMMU. This patch adds a call to OPAL to set the PTCR for the nest mmu in opal_init(). Signed-off-by: Alistair Popple --- This patch depends on a new OPAL call which has yet to be added to skiboot, although the

Re: [PATCH] powerpc/powernv: Initialise nest mmu

2016-08-15 Thread Alistair Popple
Balbir, > > + /* Update partition table control register on all Nest MMUs */ > > + opal_nmmu_set_ptcr(-1UL, __pa(partition_tb) | (PATB_SIZE_SHIFT - 12)); > > + > > Just wondering if > > 1. Instead of using -1 for all cpus, we should do > for_each_online_cpu() { > opal_

Re: [PATCH] powerpc/powernv: Initialise nest mmu

2016-09-13 Thread Alistair Popple
On Mon, 15 Aug 2016 04:51:59 PM Alistair Popple wrote: > POWER9 contains an off core mmu called the nest mmu (NMMU). This is > used by other hardware units on the chip to translate virtual > addresses into real addresses. The unit attempting an address > translation provides the maj

[PATCH] powerpc/powernv/pci: Return failure for some uses of dma_set_mask()

2017-07-25 Thread Alistair Popple
as documented in DMA-API-HOWTO.txt. Signed-off-by: Alistair Popple --- Ideally we should do the same thing for 64-bit mode as well however there are a lot more drivers requesting a dma mask of 64-bits so it's a much larger task to audit them all to see if they behave correctly when dma_set_

[PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb

2017-08-10 Thread Alistair Popple
. Signed-off-by: Alistair Popple --- Michael, This patch depends on http://patchwork.ozlabs.org/patch/796775/ - [v3,1/3] powerpc/mm: Add marker for contexts requiring global TLB invalidations. - Alistair arch/powerpc/platforms/powernv/npu-dma.c | 27 +-- arch/powerpc/platforms

[PATCH 1/2] powerpc/powernv/npu: Move tlb flush before launching ATSD

2017-08-10 Thread Alistair Popple
The nest mmu tlb flush needs to happen before the GPU translation shootdown is launched to avoid the GPU refilling its tlb with stale nmmu translations prior to the nmmu flush completing. Signed-off-by: Alistair Popple Cc: sta...@vger.kernel.org --- arch/powerpc/platforms/powernv/npu-dma.c | 12

Re: [PATCH v3 2/3] cxl: Mark context requiring global TLBIs

2017-08-21 Thread Alistair Popple
On Thu, 3 Aug 2017 05:22:31 PM Balbir Singh wrote: > On Thu, Aug 3, 2017 at 6:29 AM, Frederic Barrat > wrote: > > The PSL and XSL need to see all TLBIs pertinent to the memory contexts > > used on the adapter. For the hash memory model, it is done by making > > all TLBIs global as soon as the cxl

Re: [PATCH v3 1/3] powerpc/mm: Add marker for contexts requiring global TLB invalidations

2017-08-21 Thread Alistair Popple
For what it's worth this worked fine when testing with the NPU as well. Tested-by: Alistair Popple On Thu, 3 Aug 2017 05:16:35 PM Balbir Singh wrote: > On Thu, Aug 3, 2017 at 6:29 AM, Frederic Barrat > wrote: > > Introduce a new 'flags' attribute per context and def

[PATCH 1/2] powerpc/npu: Use flush_all_mm() instead of flush_tlb_mm()

2017-09-04 Thread Alistair Popple
With the optimisations introduced by commit a46cc7a908 ("powerpc/mm/radix: Improve TLB/PWC flushes"), flush_tlb_mm() no longer flushes the page walk cache with radix. Switch to using flush_all_mm() to ensure the pwc and tlb are properly flushed on the nmmu. Signed-off-by: Alist

[PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb

2017-09-04 Thread Alistair Popple
. Signed-off-by: Alistair Popple --- arch/powerpc/platforms/powernv/npu-dma.c | 28 +++- arch/powerpc/platforms/powernv/pci.h | 3 +++ 2 files changed, 26 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv

Re: [PATCH 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb

2017-09-05 Thread Alistair Popple
Hi, On Tue, 5 Sep 2017 10:10:03 AM Frederic Barrat wrote: > > > > + if (!nphb->npu.nmmu_flush) { > > + /* > > +* If we're not explicitly flushing ourselves we need to mark > > +* the thread for global flushes > > +*/ > > + npu_context->nmm

[PATCH v2 1/2] powerpc/npu: Use flush_all_mm() instead of flush_tlb_mm()

2017-09-05 Thread Alistair Popple
With the optimisations introduced by commit a46cc7a908 ("powerpc/mm/radix: Improve TLB/PWC flushes"), flush_tlb_mm() no longer flushes the page walk cache with radix. Switch to using flush_all_mm() to ensure the pwc and tlb are properly flushed on the nmmu. Signed-off-by: Alist

[PATCH v2 2/2] powerpc/powernv/npu: Don't explicitly flush nmmu tlb

2017-09-05 Thread Alistair Popple
. Signed-off-by: Alistair Popple --- Changes for v2: - Use mm_context_add_copro()/mm_context_remove_copro() instead of inc_mm_active_cpus()/dec_mm_active_cpus() arch/powerpc/platforms/powernv/npu-dma.c | 28 +++- arch/powerpc/platforms/powernv/pci.h | 3 +++ 2 files

Re: [PATCH v3 2/2] cxl: Enable global TLBIs for cxl contexts

2017-09-12 Thread Alistair Popple
On Fri, 8 Sep 2017 04:56:24 PM Nicholas Piggin wrote: > On Sun, 3 Sep 2017 20:15:13 +0200 > Frederic Barrat wrote: > > > The PSL and nMMU need to see all TLB invalidations for the memory > > contexts used on the adapter. For the hash memory model, it is done by > > making all TLBIs global as soo

Re: [PATCH v3 2/2] cxl: Enable global TLBIs for cxl contexts

2017-09-12 Thread Alistair Popple
I have tested the non-cxl specific parts (mm_context_add_copro/mm_context_remove_copro) with this series - https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=1681 - and it works well for npu. Tested-by: Alistair Popple On Sun, 3 Sep 2017 08:15:13 PM Frederic Barrat wrote: > The

Re: [PATCH v3 1/2] powerpc/mm: Export flush_all_mm()

2017-09-12 Thread Alistair Popple
ing we could do is flush the > + * entire LPID! Punt for now, as it's not being used. > + */ Do you think it is worth putting a WARN_ON_ONCE here if we're asserting this isn't used on hash? Otherwise looks good and is also needed for NPU. Reviewed-By: Alistair Popple

Re: [RFC PATCH 2/2] powerpc/powernv: implement NMI IPIs with OPAL_SIGNAL_SYSTEM_RESET

2017-09-13 Thread Alistair Popple
On Thu, 14 Sep 2017 04:32:28 PM Nicholas Piggin wrote: > On Thu, 14 Sep 2017 12:24:49 +1000 > Benjamin Herrenschmidt wrote: > > > On Wed, 2017-09-13 at 23:13 +1000, Nicholas Piggin wrote: > > > On Wed, 13 Sep 2017 02:05:53 +1000 > > > Nicholas Piggin wrote: > > > > > > > There are two complic

[PATCH 00/13] fs/dax: Fix FS DAX page reference counts

2024-06-26 Thread Alistair Popple
se patches. I am not intimately familiar with the FS DAX code so would appreciate some careful review there. In particular I have not given any thought at all to CONFIG_FS_DAX_LIMITED. Signed-off-by: Alistair Popple Alistair Popple (13): mm/gup.c: Remove redundant check for PCI P2PDMA page

[PATCH 01/13] mm/gup.c: Remove redundant check for PCI P2PDMA page

2024-06-26 Thread Alistair Popple
PCI P2PDMA pages are not mapped with pXX_devmap PTEs therefore the check in __gup_device_huge() is redundant. Remove it. Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Acked-by: David Hildenbrand --- mm/gup.c | 5 - 1 file changed, 5 deletions(-) diff --git a/mm/gup.c b/mm

[PATCH 02/13] pci/p2pdma: Don't initialise page refcount to one

2024-06-26 Thread Alistair Popple
MEMORY_DEVICE_PCI_P2PDMA pages so fix that up. Signed-off-by: Alistair Popple --- drivers/pci/p2pdma.c | 2 ++ mm/memremap.c| 8 mm/mm_init.c | 4 +++- 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 4f47a13..1e9ea32 100644 --- a

[PATCH 03/13] fs/dax: Refactor wait for dax idle page

2024-06-26 Thread Alistair Popple
A FS DAX page is considered idle when its refcount drops to one. This is currently open-coded in all file systems supporting FS DAX. Move the idle detection to a common function to make future changes easier. Signed-off-by: Alistair Popple Reviewed-by: Jan Kara --- fs/ext4/inode.c | 5

[PATCH 04/13] fs/dax: Add dax_page_free callback

2024-06-26 Thread Alistair Popple
happen when the page refcount drops to zero. In this case we can use the existing pgmap->ops->page_free() callback so wire that up for all devices that support FS DAX (nvdimm and virtio). Signed-off-by: Alistair Popple --- drivers/nvdimm/pmem.c | 1 + fs/dax.c | 6 ++ f

[PATCH 05/13] mm: Allow compound zone device pages

2024-06-26 Thread Alistair Popple
igned-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe --- In response to the RFC Matthew Wilcox pointed out that we could move the pgmap field to the folio. Morally I think that's where pgmap belongs, so it's a good idea that I just haven't had a chance to implement yet. I sus

[PATCH 06/13] mm/memory: Add dax_insert_pfn

2024-06-26 Thread Alistair Popple
: Alistair Popple --- include/linux/mm.h | 4 ++- mm/memory.c| 79 ++- 2 files changed, 76 insertions(+), 7 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 9a5652c..b84368b 100644 --- a/include/linux/mm.h +++ b/include/linux

[PATCH 07/13] huge_memory: Allow mappings of PUD sized pages

2024-06-26 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pud, which simply inserts a special devmap PUD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- include/linux/huge_mm.h | 4 ++- include/linux/rmap.h| 14 +- mm/huge_memory.c

[PATCH 08/13] huge_memory: Allow mappings of PMD sized pages

2024-06-26 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pmd, which simply inserts a special devmap PMD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- include/linux/huge_mm.h | 1 +- mm/huge_memory.c| 70

[PATCH 09/13] gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages

2024-06-26 Thread Alistair Popple
Longterm pinning of FS DAX pages should already be disallowed by various pXX_devmap checks. However a future change will cause these checks to be invalid for FS DAX pages so make folio_is_longterm_pinnable() return false for FS DAX pages. Signed-off-by: Alistair Popple --- include/linux

[PATCH 10/13] fs/dax: Properly refcount fs dax pages

2024-06-26 Thread Alistair Popple
to remove the pgmap refcounting that is currently done in mm/gup.c. Signed-off-by: Alistair Popple --- drivers/dax/device.c | 12 +- drivers/dax/super.c| 2 +- drivers/nvdimm/pmem.c | 8 +-- fs/dax.c | 193 +- fs/fuse

[PATCH 11/13] huge_memory: Remove dead vmf_insert_pXd code

2024-06-26 Thread Alistair Popple
Now that DAX is managing page reference counts the same as normal pages there are no callers for vmf_insert_pXd functions so remove them. Signed-off-by: Alistair Popple --- include/linux/huge_mm.h | 2 +- mm/huge_memory.c| 165 +- 2 files

[PATCH 12/13] mm: Remove pXX_devmap callers

2024-06-26 Thread Alistair Popple
->lru with page->pgmap. Signed-off-by: Alistair Popple --- arch/powerpc/mm/book3s64/hash_pgtable.c | 3 +- arch/powerpc/mm/book3s64/pgtable.c | 8 +- arch/powerpc/mm/book3s64/radix_pgtable.c | 5 +- arch/powerpc/mm/pgtable.c|

[PATCH 13/13] mm: Remove devmap related functions and page table bits

2024-06-26 Thread Alistair Popple
Now that DAX and all other reference counts to ZONE_DEVICE pages are managed normally there is no need for the special devmap PTE/PMD/PUD page table bits. So drop all references to these, freeing up a software defined page table bit on architectures supporting it. Signed-off-by: Alistair Popple

Re: [PATCH 00/13] fs/dax: Fix FS DAX page reference counts

2024-06-27 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: >> FS DAX pages have always maintained their own page reference counts >> without following the normal rules for page reference counting. In >> particular pages are considered free when the refcount hits one rather >>

Re: [PATCH 04/13] fs/dax: Add dax_page_free callback

2024-06-27 Thread Alistair Popple
Christoph Hellwig writes: > On Thu, Jun 27, 2024 at 10:54:19AM +1000, Alistair Popple wrote: >> When a fs dax page is freed it has to notify filesystems that the page >> has been unpinned/unmapped and is free. Currently this involves >> special code in the page f

Re: [PATCH 00/13] fs/dax: Fix FS DAX page reference counts

2024-06-27 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: >> >> Dan Williams writes: >> >> > Alistair Popple wrote: >> >> FS DAX pages have always maintained their own page reference counts >> >> without following the normal rules for page refere

Re: [PATCH 00/13] fs/dax: Fix FS DAX page reference counts

2024-07-01 Thread Alistair Popple
Dave Chinner writes: > On Thu, Jun 27, 2024 at 10:54:15AM +1000, Alistair Popple wrote: >> FS DAX pages have always maintained their own page reference counts >> without following the normal rules for page reference counting. In >> particular pages are considered free w

Re: [PATCH 09/13] gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages

2024-07-01 Thread Alistair Popple
David Hildenbrand writes: > On 27.06.24 02:54, Alistair Popple wrote: >> Longterm pinning of FS DAX pages should already be disallowed by >> various pXX_devmap checks. However a future change will cause these >> checks to be invalid for FS DAX pages so make >>

Re: [PATCH 07/13] huge_memory: Allow mappings of PUD sized pages

2024-07-02 Thread Alistair Popple
David Hildenbrand writes: > On 27.06.24 02:54, Alistair Popple wrote: >> Currently DAX folio/page reference counts are managed differently to >> normal pages. To allow these to be managed the same as normal pages >> introduce dax_insert_pfn_pud. This will map the entire P

Re: [PATCH 06/13] mm/memory: Add dax_insert_pfn

2024-07-02 Thread Alistair Popple
David Hildenbrand writes: > On 27.06.24 02:54, Alistair Popple wrote: >> Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This >> creates a special devmap PTE entry for the pfn but does not take a >> reference on the underlying struct page for the mappin

Re: [PATCH 07/13] huge_memory: Allow mappings of PUD sized pages

2024-07-02 Thread Alistair Popple
David Hildenbrand writes: > On 02.07.24 12:19, Alistair Popple wrote: >> David Hildenbrand writes: >> >>> On 27.06.24 02:54, Alistair Popple wrote: >>>> Currently DAX folio/page reference counts are managed differently to >>>> normal pages.

Re: [PATCH 11/13] huge_memory: Remove dead vmf_insert_pXd code

2024-07-08 Thread Alistair Popple
Peter Xu writes: > Hi, Alistair, > > On Thu, Jun 27, 2024 at 10:54:26AM +1000, Alistair Popple wrote: >> Now that DAX is managing page reference counts the same as normal >> pages there are no callers for vmf_insert_pXd functions so remove >> them. >>

Re: [PATCH 11/13] huge_memory: Remove dead vmf_insert_pXd code

2024-07-11 Thread Alistair Popple
Peter Xu writes: > On Tue, Jul 09, 2024 at 02:07:31PM +1000, Alistair Popple wrote: >> >> Peter Xu writes: >> >> > Hi, Alistair, >> > >> > On Thu, Jun 27, 2024 at 10:54:26AM +1000, Alistair Popple wrote: >> >> Now that DAX is m

Re: [PATCH 10/13] fs/dax: Properly refcount fs dax pages

2024-09-05 Thread Alistair Popple
Christoph Hellwig writes: >> diff --git a/drivers/dax/device.c b/drivers/dax/device.c >> index eb61598..b7a31ae 100644 >> --- a/drivers/dax/device.c >> +++ b/drivers/dax/device.c >> @@ -126,11 +126,11 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax >> *dev_dax, >> return V

Re: [PATCH 06/13] mm/memory: Add dax_insert_pfn

2024-09-05 Thread Alistair Popple
Jan Kara writes: > On Thu 27-06-24 10:54:21, Alistair Popple wrote: >> Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This >> creates a special devmap PTE entry for the pfn but does not take a >> reference on the underlying struct page for the mapping. T

[PATCH 00/12] fs/dax: Fix FS DAX page reference counts

2024-09-09 Thread Alistair Popple
ws further clean-up of the devmap managed functions, but I have left that as a future improvement. I am not intimately familiar with the FS DAX code so would appreciate some careful review there. In particular I have not given any thought at all to CONFIG_FS_DAX_LIMITED. Signed-off-by: Alist

[PATCH 01/12] mm/gup.c: Remove redundant check for PCI P2PDMA page

2024-09-09 Thread Alistair Popple
PCI P2PDMA pages are not mapped with pXX_devmap PTEs therefore the check in __gup_device_huge() is redundant. Remove it. Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Acked-by: David Hildenbrand --- mm/gup.c | 5 - 1 file changed, 5 deletions(-) diff --git a/mm/gup.c b/mm

[PATCH 02/12] pci/p2pdma: Don't initialise page refcount to one

2024-09-09 Thread Alistair Popple
MEMORY_DEVICE_PCI_P2PDMA pages so fix that up. Signed-off-by: Alistair Popple --- drivers/pci/p2pdma.c | 6 ++ mm/memremap.c| 17 + mm/mm_init.c | 22 ++ 3 files changed, 37 insertions(+), 8 deletions(-) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c

[PATCH 03/12] fs/dax: Refactor wait for dax idle page

2024-09-09 Thread Alistair Popple
A FS DAX page is considered idle when its refcount drops to one. This is currently open-coded in all file systems supporting FS DAX. Move the idle detection to a common function to make future changes easier. Signed-off-by: Alistair Popple Reviewed-by: Jan Kara Reviewed-by: Christoph Hellwig

[PATCH 04/12] mm: Allow compound zone device pages

2024-09-09 Thread Alistair Popple
>pgmap. The page->pgmap field is common to all pages within a memory section. Therefore pgmap is the same for both head and tail pages and can be moved into the folio and we can use the standard scheme to find compound_head from a tail page. Signed-off-by: Alistair Popple Reviewed-by: Jas

[PATCH 05/12] mm/memory: Add dax_insert_pfn

2024-09-09 Thread Alistair Popple
: Alistair Popple --- Updates from v1: - Re-arrange code in insert_page_into_pte_locked() based on comments from Jan Kara. - Call mkdirty/mkyoung for the mkwrite case, also suggested by Jan. --- include/linux/mm.h | 1 +- mm/memory.c| 83

[PATCH 06/12] huge_memory: Allow mappings of PUD sized pages

2024-09-09 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pud, which simply inserts a special devmap PUD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- include/linux/huge_mm.h | 4 ++- include/linux/rmap.h| 15 +++- mm/huge_memory.c| 93

[PATCH 07/12] huge_memory: Allow mappings of PMD sized pages

2024-09-09 Thread Alistair Popple
current mechanism, vmf_insert_pfn_pmd, which simply inserts a special devmap PMD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- include/linux/huge_mm.h | 1 +- mm/huge_memory.c| 57

[PATCH 08/12] gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages

2024-09-09 Thread Alistair Popple
Longterm pinning of FS DAX pages should already be disallowed by various pXX_devmap checks. However a future change will cause these checks to be invalid for FS DAX pages so make folio_is_longterm_pinnable() return false for FS DAX pages. Signed-off-by: Alistair Popple --- include/linux

[PATCH 10/12] fs/dax: Properly refcount fs dax pages

2024-09-09 Thread Alistair Popple
to remove the pgmap refcounting that is currently done in mm/gup.c. Signed-off-by: Alistair Popple --- drivers/dax/device.c | 12 +- drivers/dax/super.c| 2 +- drivers/nvdimm/pmem.c | 4 +- fs/dax.c | 192 ++ fs/fuse

[PATCH 11/12] mm: Remove pXX_devmap callers

2024-09-09 Thread Alistair Popple
->lru with page->pgmap. Signed-off-by: Alistair Popple --- arch/powerpc/mm/book3s64/hash_pgtable.c | 3 +- arch/powerpc/mm/book3s64/pgtable.c | 8 +- arch/powerpc/mm/book3s64/radix_pgtable.c | 5 +- arch/powerpc/mm/pgtable.c|

[PATCH 12/12] mm: Remove devmap related functions and page table bits

2024-09-09 Thread Alistair Popple
Now that DAX and all other reference counts to ZONE_DEVICE pages are managed normally there is no need for the special devmap PTE/PMD/PUD page table bits. So drop all references to these, freeing up a software defined page table bit on architectures supporting it. Signed-off-by: Alistair Popple

[PATCH 09/12] mm: Update vm_normal_page() callers to accept FS DAX pages

2024-09-09 Thread Alistair Popple
ned-off-by: Alistair Popple --- arch/x86/mm/pat/memtype.c | 4 +++- fs/proc/task_mmu.c| 16 mm/memcontrol-v1.c| 2 +- 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c index 1fa0bf6..eb84593 10

Re: [PATCH 04/12] mm: Allow compound zone device pages

2024-09-10 Thread Alistair Popple
Matthew Wilcox writes: > On Tue, Sep 10, 2024 at 02:14:29PM +1000, Alistair Popple wrote: >> @@ -337,6 +341,7 @@ struct folio { >> /* private: */ >> }; >> /* public: */ >> +struct dev_pagemap *pgmap;

Re: [PATCH 02/12] pci/p2pdma: Don't initialise page refcount to one

2024-09-10 Thread Alistair Popple
>> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c >> index 4f47a13..210b9f4 100644 >> --- a/drivers/pci/p2pdma.c >> +++ b/drivers/pci/p2pdma.c >> @@ -129,6 +129,12 @@ static int p2pmem_alloc_mmap(struct file *filp, struct >> kobject *kobj, >> } >> >> /* >> + * Initialis

[PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-16 Thread Alistair Popple
uently accessed. Prevent this by copying the dirty bit to the page when removing the pte to match what try_to_migrate_one() does. Signed-off-by: Alistair Popple Acked-by: Peter Xu Reported-by: Huang Ying Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collec

[PATCH v2 2/2] selftests/hmm-tests: Add test for dirty bits

2022-08-16 Thread Alistair Popple
We were not correctly copying PTE dirty bits to pages during migrate_vma_setup() calls. This could potentially lead to data loss, so add a test for this. Signed-off-by: Alistair Popple --- tools/testing/selftests/vm/hmm-tests.c | 124 ++- 1 file changed, 124 insertions

Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-16 Thread Alistair Popple
huang ying writes: > On Tue, Aug 16, 2022 at 3:39 PM Alistair Popple wrote: >> >> migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that >> installs migration entries directly if it can lock the migrating page. >> When removing a dirty pte the

Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-16 Thread Alistair Popple
Peter Xu writes: > On Tue, Aug 16, 2022 at 04:10:29PM +0800, huang ying wrote: >> > @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> > bool anon_exclusive; >> > pte_t swp_pte; >> > >> > + flush_cache_p

Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page

2022-08-16 Thread Alistair Popple
Peter Xu writes: > On Wed, Aug 17, 2022 at 11:49:03AM +1000, Alistair Popple wrote: >> >> Peter Xu writes: >> >> > On Tue, Aug 16, 2022 at 04:10:29PM +0800, huang ying wrote: >> >> > @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_

Re: [PATCH 09/12] mm: Update vm_normal_page() callers to accept FS DAX pages

2024-10-14 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: >> Currently if a PTE points to a FS DAX page vm_normal_page() will >> return NULL as these have their own special refcounting scheme. A >> future change will allow FS DAX pages to be refcounted the same as any

Re: [PATCH 02/12] pci/p2pdma: Don't initialise page refcount to one

2024-10-10 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: [...] >> diff --git a/mm/memremap.c b/mm/memremap.c >> index 40d4547..07bbe0e 100644 >> --- a/mm/memremap.c >> +++ b/mm/memremap.c >> @@ -488,15 +488,24 @@ void free_zone_device_folio(struct folio *fol

Re: [PATCH 02/12] pci/p2pdma: Don't initialise page refcount to one

2024-10-10 Thread Alistair Popple
Logan Gunthorpe writes: > On 2024-09-09 22:14, Alistair Popple wrote: >> The reference counts for ZONE_DEVICE private pages should be >> initialised by the driver when the page is actually allocated by the >> driver allocator, not when they are first created. This is curr

Re: [PATCH 06/12] huge_memory: Allow mappings of PUD sized pages

2024-10-13 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: >> Currently DAX folio/page reference counts are managed differently to >> normal pages. To allow these to be managed the same as normal pages >> introduce dax_insert_pfn_pud. This will map the entire PUD-sized folio >>

Re: [PATCH 08/12] gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages

2024-10-14 Thread Alistair Popple
Dan Williams writes: > Dan Williams wrote: >> Alistair Popple wrote: >> > Longterm pinning of FS DAX pages should already be disallowed by >> > various pXX_devmap checks. However a future change will cause these >> > checks to be invalid for FS DAX pages so

Re: [PATCH 11/12] mm: Remove pXX_devmap callers

2024-10-14 Thread Alistair Popple
Alexander Gordeev writes: > On Tue, Sep 10, 2024 at 02:14:36PM +1000, Alistair Popple wrote: > > Hi Alistair, > >> diff --git a/arch/powerpc/mm/book3s64/pgtable.c >> b/arch/powerpc/mm/book3s64/pgtable.c >> index 5a4a753..4537a29 100644 >> --- a/arch/powerpc

Re: [PATCH 07/12] huge_memory: Allow mappings of PMD sized pages

2024-10-14 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: >> Currently DAX folio/page reference counts are managed differently to >> normal pages. To allow these to be managed the same as normal pages >> introduce dax_insert_pfn_pmd. This will map the entire PMD-sized folio >>

Re: [PATCH 10/12] fs/dax: Properly refcount fs dax pages

2024-10-29 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: > [..] > >> >> > It follows that the DMA-idle condition still needs to look for the >> >> > case where the refcount is > 1 rather than 0 since refcount == 1 is the >> >> > page-mapp

Re: [PATCH 10/12] fs/dax: Properly refcount fs dax pages

2024-10-27 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: > [..] >>> I'm not really following this scenario, or at least how it relates to >> >> the comment above. If the page is pinned for DMA it will have taken a >> >> refcount on it and s

Re: [PATCH 07/12] huge_memory: Allow mappings of PMD sized pages

2024-10-23 Thread Alistair Popple
Alistair Popple writes: > Alistair Popple wrote: >> Dan Williams writes: [...] >>> + >>> + return VM_FAULT_NOPAGE; >>> +} >>> +EXPORT_SYMBOL_GPL(dax_insert_pfn_pmd); >> >> Like I mentioned before, let's make the exported function

Re: [PATCH 10/12] fs/dax: Properly refcount fs dax pages

2024-10-24 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: > [..] >> > >> > Was there a discussion I missed about why the conversion to typical >> > folios allows the page->share accounting to be dropped. >> >> The problem with keeping it is we now t

Re: [PATCH 10/12] fs/dax: Properly refcount fs dax pages

2024-10-24 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: [...] >> @@ -318,85 +323,58 @@ static unsigned long dax_end_pfn(void *entry) >> */ >> #define for_each_mapped_pfn(entry, pfn) \ >> for (pfn = dax_to_pfn(entry); \ >> -

Re: [PATCH v3 10/25] pci/p2pdma: Don't initialise page refcount to one

2024-11-24 Thread Alistair Popple
Bjorn Helgaas writes: > On Fri, Nov 22, 2024 at 12:40:31PM +1100, Alistair Popple wrote: >> The reference counts for ZONE_DEVICE private pages should be >> initialised by the driver when the page is actually allocated by the >> driver allocator, not when they are f

Re: [PATCH v4 12/25] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2025-01-05 Thread Alistair Popple
On Fri, Dec 20, 2024 at 08:06:48PM +0100, David Hildenbrand wrote: > On 20.12.24 20:01, David Hildenbrand wrote: > > On 17.12.24 06:12, Alistair Popple wrote: > > > In preparation for using insert_page() for DAX, enhance > > > insert_page_into_pte_locked() to h

Re: [PATCH v4 19/25] proc/task_mmu: Ignore ZONE_DEVICE pages

2025-01-05 Thread Alistair Popple
On Fri, Dec 20, 2024 at 07:32:52PM +0100, David Hildenbrand wrote: > On 19.12.24 00:11, Alistair Popple wrote: > > On Tue, Dec 17, 2024 at 11:31:25PM +0100, David Hildenbrand wrote: > > > On 17.12.24 06:13, Alistair Popple wrote: > > > > The procfs mmu files such as

[PATCH v5 10/25] mm/mm_init: Move p2pdma page refcount initialisation to p2pdma

2025-01-06 Thread Alistair Popple
refcount as required. P2PDMA uses vm_insert_page() to map the page, and that requires a non-zero reference count when initialising the page so set that when the page is first mapped. Signed-off-by: Alistair Popple Reviewed-by: Dan Williams --- Changes since v2: - Initialise the page refcount

[PATCH v5 08/25] fs/dax: Remove PAGE_MAPPING_DAX_SHARED mapping flag

2025-01-06 Thread Alistair Popple
ed. The page is considered shared when page->mapping == NULL and page->share > 0 or page->mapping != NULL, implying it is present in at least one address space. This also makes it easier for a future change to detect when a page is first mapped into an address space which requires spe

[PATCH v5 11/25] mm: Allow compound zone device pages

2025-01-06 Thread Alistair Popple
>pgmap. The page->pgmap field is common to all pages within a memory section. Therefore pgmap is the same for both head and tail pages and can be moved into the folio and we can use the standard scheme to find compound_head from a tail page. Signed-off-by: Alistair Popple Reviewed-by: Jas

[PATCH v5 05/25] fs/dax: Create a common implementation to break DAX layouts

2025-01-06 Thread Alistair Popple
: Alistair Popple --- Changes for v5: - Don't wait for idle pages on non-DAX mappings Changes for v4: - Fixed some build breakage due to missing symbol exports reported by John Hubbard (thanks!). --- fs/dax.c| 33 + fs/ext4/inode.c

[PATCH v5 04/25] fs/dax: Refactor wait for dax idle page

2025-01-06 Thread Alistair Popple
A FS DAX page is considered idle when its refcount drops to one. This is currently open-coded in all file systems supporting FS DAX. Move the idle detection to a common function to make future changes easier. Signed-off-by: Alistair Popple Reviewed-by: Jan Kara Reviewed-by: Christoph Hellwig

[PATCH v5 12/25] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2025-01-06 Thread Alistair Popple
In preparation for using insert_page() for DAX, enhance insert_page_into_pte_locked() to handle establishing writable mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a PTE which bypasses the typical set_pte_range() in finish_fault. Signed-off-by: Alistair Popple Suggested-by

[PATCH v5 00/25] fs/dax: Fix ZONE_DEVICE page reference counts

2025-01-06 Thread Alistair Popple
p of the devmap managed functions, but I have left that as a future improvement. It also enables support for compound ZONE_DEVICE pages which is one of my primary motivators for doing this work. Signed-off-by: Alistair Popple --- Cc: l...@asahilina.net Cc: zhang.l...@gmail.com Cc: gerald.schae...@l

[PATCH v5 13/25] mm/memory: Add vmf_insert_page_mkwrite()

2025-01-06 Thread Alistair Popple
-off-by: Alistair Popple --- Updates from v2: - Rename function to make it not DAX specific - Split the insert_page_into_pte_locked() change into a separate patch. Updates from v1: - Re-arrange code in insert_page_into_pte_locked() based on comments from Jan Kara. - Call mkdirty/mkyoung

[PATCH v5 03/25] fs/dax: Don't skip locked entries when scanning entries

2025-01-06 Thread Alistair Popple
to make it clear that it may advance the iterator state. Signed-off-by: Alistair Popple --- fs/dax.c | 50 +- 1 file changed, 41 insertions(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 5133568..d010c10 100644 --- a/fs/dax.c +++ b/

[PATCH v5 02/25] fs/dax: Return unmapped busy pages from dax_layout_busy_page_range()

2025-01-06 Thread Alistair Popple
user-space with mapping_mapped() and returns early if not, skipping the check for DMA busy pages. This is wrong as pages may still be undergoing DMA access even if they have subsequently been unmapped from user-space. Fix this by dropping the check for mapping_mapped(). Signed-off-by: Alistair

[PATCH v5 07/25] fs/dax: Ensure all pages are idle prior to filesystem unmount

2025-01-06 Thread Alistair Popple
ystem block to be freed will not wait for the remote access to complete. Therefore a busy block may be reallocated to a new file leading to corruption. Signed-off-by: Alistair Popple --- Changes for v5: - Don't wait for pages to be idle in non-DAX mappings --- fs/dax.c

[PATCH v5 06/25] fs/dax: Always remove DAX page-cache entries when breaking layouts

2025-01-06 Thread Alistair Popple
when the file-system calls dax_break_mapping() as part of its truncate operation. This ensures only idle pages can be removed from the FS DAX page-cache and makes it easy to detect if a file-system hasn't called dax_break_mapping() prior to a truncate operation. Signed-off-by:

[PATCH v5 24/25] mm: Remove devmap related functions and page table bits

2025-01-06 Thread Alistair Popple
Now that DAX and all other reference counts to ZONE_DEVICE pages are managed normally there is no need for the special devmap PTE/PMD/PUD page table bits. So drop all references to these, freeing up a software defined page table bit on architectures supporting it. Signed-off-by: Alistair Popple

[PATCH v5 01/25] fuse: Fix dax truncate/punch_hole fault path

2025-01-06 Thread Alistair Popple
== 0 in fuse_dax_break_layouts() and pass the entire file range to dax_layout_busy_page_range(). Signed-off-by: Alistair Popple Fixes: 6ae330cad6ef ("virtiofs: serialize truncate/punch_hole and dax fault path") Cc: Vivek Goyal --- I am not at all familiar with the fuse file system d

[PATCH v5 09/25] mm/gup: Remove redundant check for PCI P2PDMA page

2025-01-06 Thread Alistair Popple
PCI P2PDMA pages are not mapped with pXX_devmap PTEs, therefore the check in __gup_device_huge() is redundant. Remove it. Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Reviewed-by: Dan Williams Acked-by: David Hildenbrand --- mm/gup.c | 5 - 1 file changed, 5 deletions

[PATCH v5 14/25] rmap: Add support for PUD sized mappings to rmap

2025-01-06 Thread Alistair Popple
PUD-sized folios so we don't support that for now. Signed-off-by: Alistair Popple --- Changes for v5: - Fixed accounting as suggested by David. Changes for v4: - New for v4, split out rmap changes as suggested by David. --- include/linux/rmap.h | 15 ++- mm/rmap.c
