Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd

2023-08-22 Thread Hugh Dickins
On Tue, 22 Aug 2023, Matthew Wilcox wrote: > On Tue, Aug 22, 2023 at 11:34:19AM -0700, Hugh Dickins wrote: > > (Yes, the locking is a bit confusing: but mainly for the unrelated reason, > > that with the split locking configs, we never quite know whether this lock > > is the

Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd

2023-08-22 Thread Hugh Dickins
On Tue, 22 Aug 2023, Jann Horn wrote: > On Tue, Aug 22, 2023 at 4:51 AM Hugh Dickins wrote: > > On Mon, 21 Aug 2023, Jann Horn wrote: > > > On Mon, Aug 21, 2023 at 9:51 PM Hugh Dickins wrote: > > > > Just for this case, take the pmd_lock() two steps earlier: no

Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd

2023-08-21 Thread Hugh Dickins
On Mon, 21 Aug 2023, Jann Horn wrote: > On Mon, Aug 21, 2023 at 9:51 PM Hugh Dickins wrote: > > Jann Horn demonstrated how userfaultfd ioctl UFFDIO_COPY into a private > > shmem mapping can add valid PTEs to page table collapse_pte_mapped_thp() > > thought it had emptied:

[PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd

2023-08-21 Thread Hugh Dickins
z0FxiRC4d3VTu_a9h=rg5fw-kyd5rg5xo_rdbm0ltt...@mail.gmail.com/ Fixes: 1043173eb5eb ("mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()") Signed-off-by: Hugh Dickins --- mm/khugepaged.c | 38 +- 1 file changed, 29 insertions(+), 9 deletions(-)

Re: [BUG] Re: [PATCH v3 10/13] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-08-21 Thread Hugh Dickins
On Mon, 14 Aug 2023, Jann Horn wrote: > On Wed, Jul 12, 2023 at 6:42 AM Hugh Dickins wrote: > > Bring collapse_and_free_pmd() back into collapse_pte_mapped_thp(). > > It does need mmap_read_lock(), but it does not need mmap_write_lock(), > > nor vma_start_write() nor i_mmap l

Re: [BUG] Re: [PATCH v3 10/13] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-08-15 Thread Hugh Dickins
On Tue, 15 Aug 2023, David Hildenbrand wrote: > On 15.08.23 08:34, Hugh Dickins wrote: > > On Mon, 14 Aug 2023, Jann Horn wrote: > >> > >> /* step 4: remove page table */ > >> + if (strcmp(current->comm, "DELAYME") == 0) { > >

Re: [BUG] Re: [PATCH v3 10/13] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-08-14 Thread Hugh Dickins
On Mon, 14 Aug 2023, Jann Horn wrote: > On Wed, Jul 12, 2023 at 6:42 AM Hugh Dickins wrote: > > Bring collapse_and_free_pmd() back into collapse_pte_mapped_thp(). > > It does need mmap_read_lock(), but it does not need mmap_write_lock(), > > nor vma_start_write() nor i_mmap l

[PATCH v3 10/13 fix2] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock(): fix2

2023-08-05 Thread Hugh Dickins
Use ptep_clear() instead of pte_clear(): when CONFIG_PAGE_TABLE_CHECK=y, ptep_clear() adds some accounting, missing which would cause a BUG later. Signed-off-by: Hugh Dickins Reported-by: Qi Zheng Closes: https://lore.kernel.org/linux-mm/0df84f9f-e9b0-80b1-4c9e-95abc1a73...@bytedance.com

Re: [PATCH v3 10/13] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-08-05 Thread Hugh Dickins
On Thu, 3 Aug 2023, Qi Zheng wrote: > On 2023/7/12 12:42, Hugh Dickins wrote: > > Bring collapse_and_free_pmd() back into collapse_pte_mapped_thp(). > > It does need mmap_read_lock(), but it does not need mmap_write_lock(), > > nor vma_start_write() nor i_mmap lock nor anon_

Re: [PATCH mm-unstable v7 00/31] Split ptdesc from struct page

2023-07-24 Thread Hugh Dickins
On Mon, 24 Jul 2023, Vishal Moola (Oracle) wrote: > The MM subsystem is trying to shrink struct page. This patchset > introduces a memory descriptor for page table tracking - struct ptdesc. > > This patchset introduces ptdesc, splits ptdesc from struct page, and > converts many callers of page ta

[PATCH v3 11/13 fix] mm/khugepaged: delete khugepaged_collapse_pte_mapped_thps(): fix

2023-07-23 Thread Hugh Dickins
Though not yet detected by syzbot, this commit was making the same mistake with mmap_locked as the previous commit: fix that. Signed-off-by: Hugh Dickins --- mm/khugepaged.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index

[PATCH v3 10/13 fix] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock(): fix

2023-07-23 Thread Hugh Dickins
ler.appspotmail.com Closes: https://lore.kernel.org/linux-mm/e4b0f0060123c...@google.com/ Signed-off-by: Hugh Dickins --- mm/khugepaged.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 6bad69c0e4bd..1c773db26e88 100644 --- a/mm/khugepaged.

[PATCH v3 07/13 fix] s390: add pte_free_defer() for pgtables sharing page: fix

2023-07-23 Thread Hugh Dickins
Claudio finds warning on mm_has_pgste() more useful than on mm_alloc_pgste(). Signed-off-by: Hugh Dickins --- arch/s390/mm/pgalloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c index 760b4ace475e..d7374add7820 100644 --- a

[PATCH v3 04/13 fix] powerpc: assert_pte_locked() use pte_offset_map_nolock(): fix

2023-07-23 Thread Hugh Dickins
assert_pte_locked(). BUG if pte_offset_map_nolock() fails. Signed-off-by: Hugh Dickins --- arch/powerpc/mm/pgtable.c | 8 1 file changed, 8 insertions(+) diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 16b061af86d7..a3dcdb2d5b4b 100644 --- a/arch/powerpc/mm/pgtable.c

Re: [PATCH v3 04/13] powerpc: assert_pte_locked() use pte_offset_map_nolock()

2023-07-18 Thread Hugh Dickins
On Tue, 18 Jul 2023, Aneesh Kumar K.V wrote: > Hugh Dickins writes: > > > Instead of pte_lockptr(), use the recently added pte_offset_map_nolock() > > in assert_pte_locked(). BUG if pte_offset_map_nolock() fails: this is > > stricter than the previous implementation, whi

[PATCH mm 12/13] mm: delete mmap_write_trylock() and vma_try_start_write()

2023-07-11 Thread Hugh Dickins
mmap_write_trylock() and vma_try_start_write() were added just for khugepaged, but now it has no use for them: delete. Signed-off-by: Hugh Dickins --- This is the version which applies to mm-unstable or linux-next. include/linux/mm.h| 17 - include/linux/mmap_lock.h

[PATCH v3 13/13] mm/pgtable: notes on pte_offset_map[_lock]()

2023-07-11 Thread Hugh Dickins
Add a block of comments on pte_offset_map_lock(), pte_offset_map() and pte_offset_map_nolock() to mm/pgtable-generic.c, to help explain them. Signed-off-by: Hugh Dickins --- mm/pgtable-generic.c | 44 1 file changed, 44 insertions(+) diff --git a/mm

[PATCH v3 12/13] mm: delete mmap_write_trylock() and vma_try_start_write()

2023-07-11 Thread Hugh Dickins
mmap_write_trylock() and vma_try_start_write() were added just for khugepaged, but now it has no use for them: delete. Signed-off-by: Hugh Dickins --- include/linux/mm.h| 17 - include/linux/mmap_lock.h | 10 -- 2 files changed, 27 deletions(-) diff --git a

[PATCH v3 11/13] mm/khugepaged: delete khugepaged_collapse_pte_mapped_thps()

2023-07-11 Thread Hugh Dickins
recollapsed. Call collapse_pte_mapped_thp() directly in this case (why was it deferred before? I assume an issue with needing mmap_lock for write, but now it's only needed for read). Signed-off-by: Hugh Dickins --- mm/khugepaged.c | 125 +++--- 1 file ch

[PATCH v3 10/13] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-07-11 Thread Hugh Dickins
". But with those entries now cleared, "step 4" (after dropping ptl to do pmd_lock) is kept safe by the huge page lock, which stops new PTEs from being faulted in. Signed-off-by: Hugh Dickins --- mm/khugepaged.c | 172 ++ 1 file

[PATCH v3 09/13] mm/khugepaged: retract_page_tables() without mmap or vma lock

2023-07-11 Thread Hugh Dickins
nhanced to replace_page_tables(), which inserts the final huge pmd without mmap lock: going through an invalid state instead of pmd_none() followed by fault. But that enhancement does raise some more questions: leave it until a later release. Signed-off-by: Hugh Dickins --- mm/khugepag

[PATCH v3 08/13] mm/pgtable: add pte_free_defer() for pgtable as page

2023-07-11 Thread Hugh Dickins
table (none of whose pte_free()s use the mm arg which was passed to it). Signed-off-by: Hugh Dickins --- include/linux/mm_types.h | 4 include/linux/pgtable.h | 2 ++ mm/pgtable-generic.c | 20 3 files changed, 26 insertions(+) diff --git a/include/linux/mm_type

[PATCH v3 07/13] s390: add pte_free_defer() for pgtables sharing page

2023-07-11 Thread Hugh Dickins
_tlb_remove_table(), where we might not have a stable mm any more. Signed-off-by: Hugh Dickins Reviewed-by: Gerald Schaefer --- arch/s390/include/asm/pgalloc.h | 4 ++ arch/s390/mm/pgalloc.c | 80 +-- 2 files changed, 72 insertions(+), 12 deletions(-) diff

[PATCH v3 06/13] sparc: add pte_free_defer() for pte_t *pgtable_t

2023-07-11 Thread Hugh Dickins
ble_t. sparc32 supports pagetables sharing a page, but does not support THP; sparc64 supports THP, but does not support pagetables sharing a page. So the sparc-specific pte_free_defer() is as simple as the generic one, except for converting between pte_t *pgtable_t and struct page *. Signed-off-by:

[PATCH v3 05/13] powerpc: add pte_free_defer() for pgtables sharing page

2023-07-11 Thread Hugh Dickins
lling pte_fragment_free() directly; and there call_rcu() to pte_free_now() when last fragment is freed and the page is PageActive. Suggested-by: Jason Gunthorpe Signed-off-by: Hugh Dickins --- arch/powerpc/include/asm/pgalloc.h | 4 arch/powerpc/mm/pgtable-frag.c

[PATCH v3 04/13] powerpc: assert_pte_locked() use pte_offset_map_nolock()

2023-07-11 Thread Hugh Dickins
know, if an assert_pte_locked() caller can be racing such transitions? This mod might cause new crashes: which either expose my ignorance, or indicate issues to be fixed, or limit the usage of assert_pte_locked(). Signed-off-by: Hugh Dickins --- arch/powerpc/mm/pgtable.c | 16 ++

[PATCH v3 03/13] arm: adjust_pte() use pte_offset_map_nolock()

2023-07-11 Thread Hugh Dickins
-by: Hugh Dickins --- arch/arm/mm/fault-armv.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c index ca5302b0b7ee..7cb125497976 100644 --- a/arch/arm/mm/fault-armv.c +++ b/arch/arm/mm/fault-armv.c @@ -117,11 +117,10 @@ static

[PATCH v3 02/13] mm/pgtable: add PAE safety to __pte_offset_map()

2023-07-11 Thread Hugh Dickins
ssumptions). Signed-off-by: Hugh Dickins --- include/linux/pgtable.h | 4 mm/pgtable-generic.c| 29 + 2 files changed, 33 insertions(+) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 5134edcec668..7f2db400f653 100644 --- a/include/linux

[PATCH v3 01/13] mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s

2023-07-11 Thread Hugh Dickins
-by: Hugh Dickins --- include/linux/pgtable.h | 4 ++-- mm/pgtable-generic.c| 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 5063b482e34f..5134edcec668 100644 --- a/include/linux/pgtable.h +++ b/include/linux

[PATCH v3 00/13] mm: free retracted page table by RCU

2023-07-11 Thread Hugh Dickins
Here is v3 of the series of patches to mm (and a few architectures), based on v6.5-rc1 which includes the preceding two series (thank you!): in which khugepaged takes advantage of pte_offset_map[_lock]() allowing for pmd transitions. Differences from v1 and v2 are noted patch by patch below. This

[PATCH] mm: lock newly mapped VMA with corrected ordering

2023-07-08 Thread Hugh Dickins
("mm: lock newly mapped VMA which can be modified after it becomes visible") Cc: sta...@vger.kernel.org Signed-off-by: Hugh Dickins --- mm/mmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index 84c71431a527..3eda23c9ebe7 100644 ---

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-06 Thread Hugh Dickins
On Thu, 6 Jul 2023, Gerald Schaefer wrote: > > Since none of my remarks on the comments seem valid or strictly necessary > any more, and I also could not find functional issues, I think you can add > this patch as new version for 07/12. And I can now give you this: > > Reviewed-by: Gerald Schaefe

Re: [PATCH] powerpc/mm/book3s64/hash/4k: Add pmd_same callback for 4K page size

2023-07-05 Thread Hugh Dickins
x13c/0x520 > get_arg_page+0x80/0x1d0 > copy_string_kernel+0xc8/0x250 > kernel_execve+0x11c/0x270 > run_init_process+0xe4/0x10c > kernel_init+0xbc/0x1a0 > ret_from_kernel_user_thread+0x14/0x1c > > Cc: Hugh Dickins > Reported-by: Michael Ellerman > Signed-off-by:

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-05 Thread Hugh Dickins
On Wed, 5 Jul 2023, Gerald Schaefer wrote: > On Tue, 4 Jul 2023 10:03:57 -0700 (PDT) > Hugh Dickins wrote: > > On Tue, 4 Jul 2023, Gerald Schaefer wrote: > > > On Sat, 1 Jul 2023 21:32:38 -0700 (PDT) > > > Hugh Dickins wrote: > > > >

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-05 Thread Hugh Dickins
On Wed, 5 Jul 2023, Alexander Gordeev wrote: > On Sat, Jul 01, 2023 at 09:32:38PM -0700, Hugh Dickins wrote: > > On Thu, 29 Jun 2023, Hugh Dickins wrote: > > Hi Hugh, > > ... > > > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > > +void pte_free_defer(str

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-04 Thread Hugh Dickins
On Tue, 4 Jul 2023, Gerald Schaefer wrote: > On Sat, 1 Jul 2023 21:32:38 -0700 (PDT) > Hugh Dickins wrote: > > On Thu, 29 Jun 2023, Hugh Dickins wrote: > > > > > > I've grown to dislike the (ab)use of pt_frag_refcount even more, to the > > > extent

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-04 Thread Hugh Dickins
On Tue, 4 Jul 2023, Alexander Gordeev wrote: > On Sat, Jul 01, 2023 at 09:32:38PM -0700, Hugh Dickins wrote: > > On Thu, 29 Jun 2023, Hugh Dickins wrote: > > Hi Hugh, > > ... > > No, not quite the same rules as before: I came to realize that using > > list_add

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-01 Thread Hugh Dickins
On Thu, 29 Jun 2023, Hugh Dickins wrote: > > I've grown to dislike the (ab)use of pt_frag_refcount even more, to the > extent that I've not even tried to verify it; but I think I do get the > point now, that we need further info than just PPHHAA to know whether > the p

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-06-30 Thread Hugh Dickins
On Fri, 30 Jun 2023, Claudio Imbrenda wrote: > On Fri, 30 Jun 2023 08:28:54 -0700 (PDT) > Hugh Dickins wrote: > > On Fri, 30 Jun 2023, Claudio Imbrenda wrote: > > > On Tue, 20 Jun 2023 00:51:19 -0700 (PDT) > > > Hugh Dickins wrote: > > > > > > [.

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-06-30 Thread Hugh Dickins
On Fri, 30 Jun 2023, Claudio Imbrenda wrote: > On Tue, 20 Jun 2023 00:51:19 -0700 (PDT) > Hugh Dickins wrote: > > [...] > > > +void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable) > > +{ > > + unsigned int bit, mask; > > + struct page *page; >

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-06-29 Thread Hugh Dickins
On Thu, 29 Jun 2023, Gerald Schaefer wrote: > On Thu, 29 Jun 2023 12:22:24 -0300 > Jason Gunthorpe wrote: > > On Wed, Jun 28, 2023 at 10:08:08PM -0700, Hugh Dickins wrote: > > > On Wed, 28 Jun 2023, Gerald Schaefer wrote: > > > > > > > > As disc

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-06-28 Thread Hugh Dickins
On Wed, 28 Jun 2023, Gerald Schaefer wrote: > > As discussed in the other thread, we would rather go with less complexity, > possibly switching to an approach w/o the list and fragment re-use in the > future. For now, as a first step in that direction, we can try with not > adding fragments back o

Re: [PATCH v2 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-27 Thread Hugh Dickins
On Tue, 27 Jun 2023, Jason Gunthorpe wrote: > On Wed, Jun 21, 2023 at 07:36:11PM -0700, Hugh Dickins wrote: > > [PATCH v3 05/12] powerpc: add pte_free_defer() for pgtables sharing page ... > Yes, this makes sense to me, very simple.. > > I always forget these details but atomic

Re: [PATCH v6 00/33] Split ptdesc from struct page

2023-06-27 Thread Hugh Dickins
On Tue, 27 Jun 2023, Matthew Wilcox wrote: > On Mon, Jun 26, 2023 at 09:44:08PM -0700, Hugh Dickins wrote: > > On Mon, 26 Jun 2023, Vishal Moola (Oracle) wrote: > > > > > The MM subsystem is trying to shrink struct page. This patchset > > > introduces a memory d

Re: [PATCH v6 00/33] Split ptdesc from struct page

2023-06-27 Thread Hugh Dickins
On Tue, 27 Jun 2023, David Hildenbrand wrote: > On 27.06.23 06:44, Hugh Dickins wrote: > > On Mon, 26 Jun 2023, Vishal Moola (Oracle) wrote: > > > >> The MM subsystem is trying to shrink struct page. This patchset > >> introduces a memory descriptor for pa

Re: [PATCH v6 00/33] Split ptdesc from struct page

2023-06-26 Thread Hugh Dickins
On Mon, 26 Jun 2023, Vishal Moola (Oracle) wrote: > The MM subsystem is trying to shrink struct page. This patchset > introduces a memory descriptor for page table tracking - struct ptdesc. ... > 39 files changed, 686 insertions(+), 455 deletions(-) I don't see the point of this patchset: to me

Re: [PATCH v2 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-21 Thread Hugh Dickins
On Tue, 20 Jun 2023, Jason Gunthorpe wrote: > On Tue, Jun 20, 2023 at 12:54:25PM -0700, Hugh Dickins wrote: > > On Tue, 20 Jun 2023, Jason Gunthorpe wrote: > > > On Tue, Jun 20, 2023 at 12:47:54AM -0700, Hugh Dickins wrote: > > > > Add powerpc-specific pte_free_d

Re: [PATCH v2 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-20 Thread Hugh Dickins
On Tue, 20 Jun 2023, Jason Gunthorpe wrote: > On Tue, Jun 20, 2023 at 12:47:54AM -0700, Hugh Dickins wrote: > > Add powerpc-specific pte_free_defer(), to call pte_free() via call_rcu(). > > pte_free_defer() will be called inside khugepaged's retract_page_tables() > > lo

[PATCH mm 10/12] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-06-20 Thread Hugh Dickins
". But with those entries now cleared, "step 4" (after dropping ptl to do pmd_lock) is kept safe by the huge page lock, which stops new PTEs from being faulted in. Signed-off-by: Hugh Dickins --- This is the version which applies to mm-everything or linux-next. mm/khugepaged.c |

[PATCH v2 12/12] mm: delete mmap_write_trylock() and vma_try_start_write()

2023-06-20 Thread Hugh Dickins
mmap_write_trylock() and vma_try_start_write() were added just for khugepaged, but now it has no use for them: delete. Signed-off-by: Hugh Dickins --- include/linux/mm.h| 17 - include/linux/mmap_lock.h | 10 -- 2 files changed, 27 deletions(-) diff --git a

[PATCH v2 11/12] mm/khugepaged: delete khugepaged_collapse_pte_mapped_thps()

2023-06-20 Thread Hugh Dickins
recollapsed. Call collapse_pte_mapped_thp() directly in this case (why was it deferred before? I assume an issue with needing mmap_lock for write, but now it's only needed for read). Signed-off-by: Hugh Dickins --- mm/khugepaged.c | 125 +++- 1 file ch

[PATCH v2 10/12] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-06-20 Thread Hugh Dickins
". But with those entries now cleared, "step 4" (after dropping ptl to do pmd_lock) is kept safe by the huge page lock, which stops new PTEs from being faulted in. Signed-off-by: Hugh Dickins --- mm/khugepaged.c | 172 ++-- 1 file

[PATCH v2 09/12] mm/khugepaged: retract_page_tables() without mmap or vma lock

2023-06-20 Thread Hugh Dickins
nhanced to replace_page_tables(), which inserts the final huge pmd without mmap lock: going through an invalid state instead of pmd_none() followed by fault. But that enhancement does raise some more questions: leave it until a later release. Signed-off-by: Hugh Dickins --- mm/khugepag

[PATCH v2 08/12] mm/pgtable: add pte_free_defer() for pgtable as page

2023-06-20 Thread Hugh Dickins
table (none of whose pte_free()s use the mm arg which was passed to it). Signed-off-by: Hugh Dickins --- include/linux/mm_types.h | 4 include/linux/pgtable.h | 2 ++ mm/pgtable-generic.c | 20 3 files changed, 26 insertions(+) diff --git a/include/linux/mm_type

[PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-06-20 Thread Hugh Dickins
, use a static global mm_pgtable_list_lock instead: then a soon-to-follow commit will split it per-mm as before (probably by using a SLAB_TYPESAFE_BY_RCU structure for the list head and its lock); and update the commentary on the pgtable_list. Signed-off-by: Hugh Dickins --- arch/s390/inc

[PATCH v2 06/12] sparc: add pte_free_defer() for pte_t *pgtable_t

2023-06-20 Thread Hugh Dickins
ble_t. sparc32 supports pagetables sharing a page, but does not support THP; sparc64 supports THP, but does not support pagetables sharing a page. So the sparc-specific pte_free_defer() is as simple as the generic one, except for converting between pte_t *pgtable_t and struct page *. Signed-off-by:

[PATCH v2 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-20 Thread Hugh Dickins
ng if more deferrals arrived during its grace period. Signed-off-by: Hugh Dickins --- arch/powerpc/include/asm/pgalloc.h | 4 +++ arch/powerpc/mm/pgtable-frag.c | 51 ++ 2 files changed, 55 insertions(+) diff --git a/arch/powerpc/include/asm/pgalloc.h b/arch/po

[PATCH v2 04/12] powerpc: assert_pte_locked() use pte_offset_map_nolock()

2023-06-20 Thread Hugh Dickins
know, if an assert_pte_locked() caller can be racing such transitions? This mod might cause new crashes: which either expose my ignorance, or indicate issues to be fixed, or limit the usage of assert_pte_locked(). Signed-off-by: Hugh Dickins --- arch/powerpc/mm/pgtable.c | 16 ++

[PATCH v2 03/12] arm: adjust_pte() use pte_offset_map_nolock()

2023-06-20 Thread Hugh Dickins
-by: Hugh Dickins --- arch/arm/mm/fault-armv.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c index ca5302b0b7ee..7cb125497976 100644 --- a/arch/arm/mm/fault-armv.c +++ b/arch/arm/mm/fault-armv.c @@ -117,11 +117,10 @@ static

[PATCH v2 02/12] mm/pgtable: add PAE safety to __pte_offset_map()

2023-06-20 Thread Hugh Dickins
ssumptions). Signed-off-by: Hugh Dickins --- include/linux/pgtable.h | 4 mm/pgtable-generic.c| 29 + 2 files changed, 33 insertions(+) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 8b0fc7fdc46f..525f1782b466 100644 --- a/include/linux

[PATCH v2 01/12] mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s

2023-06-20 Thread Hugh Dickins
-by: Hugh Dickins --- include/linux/pgtable.h | 4 ++-- mm/pgtable-generic.c| 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index a1326e61d7ee..8b0fc7fdc46f 100644 --- a/include/linux/pgtable.h +++ b/include/linux

[PATCH v2 00/12] mm: free retracted page table by RCU

2023-06-20 Thread Hugh Dickins
Here is v2 third series of patches to mm (and a few architectures), based on v6.4-rc5 with the preceding two series applied: in which khugepaged takes advantage of pte_offset_map[_lock]() allowing for pmd transitions. Differences from v1 are noted patch by patch below This follows on from the v2 "

[PATCH v2 07/23 replacement] mips: add pte_unmap() to balance pte_offset_map()

2023-06-15 Thread Hugh Dickins
: Nathan Chancellor Signed-off-by: Hugh Dickins --- Andrew, please replace my mips patch, and its build warning fix patch, in mm-unstable by this less ambitious but working replacement - thanks. arch/mips/mm/tlb-r4k.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff

Re: [PATCH v2 07/23] mips: update_mmu_cache() can replace __update_tlb()

2023-06-15 Thread Hugh Dickins
On Thu, 15 Jun 2023, Nathan Chancellor wrote: > On Wed, Jun 14, 2023 at 10:43:30PM -0700, Hugh Dickins wrote: > > > > I do hope that you find the first fixes the breakage; but if not, then > > I hate to be the bearer of bad news but the first patch did not fix the >

Re: [PATCH v4 04/34] pgtable: Create struct ptdesc

2023-06-15 Thread Hugh Dickins
On Mon, 12 Jun 2023, Vishal Moola (Oracle) wrote: > Currently, page table information is stored within struct page. As part > of simplifying struct page, create struct ptdesc for page table > information. > > Signed-off-by: Vishal Moola (Oracle) Vishal, as I think you have already guessed, your

Re: [PATCH v2 07/23] mips: update_mmu_cache() can replace __update_tlb()

2023-06-14 Thread Hugh Dickins
On Wed, 14 Jun 2023, Hugh Dickins wrote: > On Wed, 14 Jun 2023, Nathan Chancellor wrote: > > > > I just bisected a crash while powering down a MIPS machine in QEMU to > > this change as commit 8044511d3893 ("mips: update_mmu_cache() can > > replace __update_tlb()

Re: [PATCH v2 07/23] mips: update_mmu_cache() can replace __update_tlb()

2023-06-14 Thread Hugh Dickins
On Wed, 14 Jun 2023, Nathan Chancellor wrote: > Hi Hugh, > > On Thu, Jun 08, 2023 at 12:17:24PM -0700, Hugh Dickins wrote: > > Don't make update_mmu_cache() a wrapper around __update_tlb(): call it > > directly, and use the ptep (or pmdp) provided by the caller,

[PATCH v2 07/23 fix] mips: update_mmu_cache() can replace __update_tlb(): fix

2023-06-09 Thread Hugh Dickins
in advance! Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202306091304.cnvispk0-...@intel.com/ Signed-off-by: Hugh Dickins --- arch/mips/mm/tlb-r4k.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/mips/mm/tlb-r4k.c b/arch/m

[PATCH v2 23/23] xtensa: add pte_unmap() to balance pte_offset_map()

2023-06-08 Thread Hugh Dickins
To keep balance in future, remember to pte_unmap() after a successful pte_offset_map(). And act as if get_pte_for_vaddr() really needs a map there, to read the pteval before "unmapping", to be sure page table is not removed. Signed-off-by: Hugh Dickins --- arch/xtensa/mm/tlb.c |

[PATCH v2 22/23] x86: sme_populate_pgd() use pte_offset_kernel()

2023-06-08 Thread Hugh Dickins
sme_populate_pgd() is an __init function for sme_encrypt_kernel(): it should use pte_offset_kernel() instead of pte_offset_map(), to avoid the question of whether a pte_unmap() will be needed to balance. Signed-off-by: Hugh Dickins --- arch/x86/mm/mem_encrypt_identity.c | 2 +- 1 file changed

[PATCH v2 21/23] x86: Allow get_locked_pte() to fail

2023-06-08 Thread Hugh Dickins
In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Signed-off-by: Hugh Dickins --- arch/x86/kernel/ldt.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/ldt.c b

[PATCH v2 20/23] sparc: iounit and iommu use pte_offset_kernel()

2023-06-08 Thread Hugh Dickins
iounit_alloc() and sbus_iommu_alloc() are working from pmd_off_k(), so should use pte_offset_kernel() instead of pte_offset_map(), to avoid the question of whether a pte_unmap() will be needed to balance. Signed-off-by: Hugh Dickins --- arch/sparc/mm/io-unit.c | 2 +- arch/sparc/mm/iommu.c

[PATCH v2 19/23] sparc: allow pte_offset_map() to fail

2023-06-08 Thread Hugh Dickins
In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Signed-off-by: Hugh Dickins --- arch/sparc/kernel/signal32.c | 2 ++ arch/sparc/mm/fault_64.c | 3 +++ arch/sparc/mm/tlb.c | 2 ++ 3 files

[PATCH v2 18/23] sparc/hugetlb: pte_alloc_huge() pte_offset_huge()

2023-06-08 Thread Hugh Dickins
pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Signed-off-by: Hugh Dickins --- arch/sparc/mm/hugetlbpage.c | 4 ++-- 1 file

[PATCH v2 17/23] sh/hugetlb: pte_alloc_huge() pte_offset_huge()

2023-06-08 Thread Hugh Dickins
pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Signed-off-by: Hugh Dickins --- arch/sh/mm/hugetlbpage.c | 4 ++-- 1 file

[PATCH v2 16/23] s390: gmap use pte_unmap_unlock() not spin_unlock()

2023-06-08 Thread Hugh Dickins
pte_alloc_map_lock() expects to be followed by pte_unmap_unlock(): to keep balance in future, pass ptep as well as ptl to gmap_pte_op_end(), and use pte_unmap_unlock() instead of direct spin_unlock() (even though ptep ends up unused inside the macro). Signed-off-by: Hugh Dickins Acked-by

[PATCH v2 15/23] s390: allow pte_offset_map_lock() to fail

2023-06-08 Thread Hugh Dickins
In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Add comment on mm's contract with s390 above __zap_zero_pages(), and fix old comment there: must be called after THP was disabled. Signed-off-by:

[PATCH v2 14/23] riscv/hugetlb: pte_alloc_huge() pte_offset_huge()

2023-06-08 Thread Hugh Dickins
pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Signed-off-by: Hugh Dickins Reviewed-by: Alexandre Ghiti Acked-by: Palmer

[PATCH v2 13/23] powerpc/hugetlb: pte_alloc_huge()

2023-06-08 Thread Hugh Dickins
d. Signed-off-by: Hugh Dickins --- arch/powerpc/mm/hugetlbpage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index b900933507da..f7c683b672c1 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hug

[PATCH v2 12/23] powerpc: allow pte_offset_map[_lock]() to fail

2023-06-08 Thread Hugh Dickins
In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Balance successful pte_offset_map() with pte_unmap() where omitted. Signed-off-by: Hugh Dickins --- arch/powerpc/mm/book3s64/hash_tlb.c | 4 arch

[PATCH v2 11/23] powerpc: kvmppc_unmap_free_pmd() pte_offset_kernel()

2023-06-08 Thread Hugh Dickins
kvmppc races between page table and huge entry, of the kind which we are expecting to address in pte_offset_map() - this might want to be revisited in future. Signed-off-by: Hugh Dickins --- arch/powerpc/kvm/book3s_64_mmu_radix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a

[PATCH v2 10/23] parisc/hugetlb: pte_alloc_huge() pte_offset_huge()

2023-06-08 Thread Hugh Dickins
pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Signed-off-by: Hugh Dickins --- arch/parisc/mm/hugetlbpage.c | 4 ++-- 1 file

[PATCH v2 09/23] parisc: unmap_uncached_pte() use pte_offset_kernel()

2023-06-08 Thread Hugh Dickins
unmap_uncached_pte() is working from pgd_offset_k(vaddr), so it should use pte_offset_kernel() instead of pte_offset_map(), to avoid the question of whether a pte_unmap() will be needed to balance. Signed-off-by: Hugh Dickins --- arch/parisc/kernel/pci-dma.c | 2 +- 1 file changed, 1 insertion

[PATCH v2 08/23] parisc: add pte_unmap() to balance get_ptep()

2023-06-08 Thread Hugh Dickins
To keep balance in future, remember to pte_unmap() after a successful get_ptep(). And act as if flush_cache_pages() really needs a map there, to read the pfn before "unmapping", to be sure page table is not removed. Signed-off-by: Hugh Dickins --- arch/parisc/kernel/ca

[PATCH v2 07/23] mips: update_mmu_cache() can replace __update_tlb()

2023-06-08 Thread Hugh Dickins
provided by the caller is actually the pmdp, instead of testing pmd_huge(): or test pmd_huge() too and warn if it disagrees? This is "hazardous" territory: needs review and testing. Signed-off-by: Hugh Dickins --- arch/mips/include/asm/pgtable.h | 15 +++ arch/mips/mm/tlb-r3k.c

[PATCH v2 06/23] microblaze: allow pte_offset_map() to fail

2023-06-08 Thread Hugh Dickins
In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Signed-off-by: Hugh Dickins --- arch/microblaze/kernel/signal.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/microblaze

[PATCH v2 05/23] m68k: allow pte_offset_map[_lock]() to fail

2023-06-08 Thread Hugh Dickins
In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Restructure cf_tlb_miss() with a pte_unmap() (previously omitted) at label out, followed by one local_irq_restore() for all. Signed-off-by: Hugh Dickins

[PATCH v2 04/23] ia64/hugetlb: pte_alloc_huge() pte_offset_huge()

2023-06-08 Thread Hugh Dickins
pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Signed-off-by: Hugh Dickins --- arch/ia64/mm/hugetlbpage.c | 4 ++-- 1 file

[PATCH v2 03/23] arm64/hugetlb: pte_alloc_huge() pte_offset_huge()

2023-06-08 Thread Hugh Dickins
pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Signed-off-by: Hugh Dickins Acked-by: Catalin Marinas --- arch/arm64/mm

[PATCH v2 02/23] arm64: allow pte_offset_map() to fail

2023-06-08 Thread Hugh Dickins
In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Signed-off-by: Hugh Dickins Acked-by: Catalin Marinas --- arch/arm64/mm/fault.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm64/mm

[PATCH v2 01/23] arm: allow pte_offset_map[_lock]() to fail

2023-06-08 Thread Hugh Dickins
In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Signed-off-by: Hugh Dickins --- arch/arm/lib/uaccess_with_memcpy.c | 3 +++ arch/arm/mm/fault-armv.c | 5 - arch/arm/mm/fault.c

[PATCH v2 00/23] arch: allow pte_offset_map[_lock]() to fail

2023-06-08 Thread Hugh Dickins
Here is v2 series of patches to various architectures, based on v6.4-rc5: preparing for v2 of changes following in mm, affecting pte_offset_map() and pte_offset_map_lock(). There are very few differences from v1: noted patch by patch below. v1 was "arch: allow pte_offset_map[_lock]() to fail" htt

Re: [PATCH 07/12] s390: add pte_free_defer(), with use of mmdrop_async()

2023-06-07 Thread Hugh Dickins
On Tue, 6 Jun 2023, Gerald Schaefer wrote: > On Mon, 5 Jun 2023 22:11:52 -0700 (PDT) > Hugh Dickins wrote: > > On Thu, 1 Jun 2023 15:57:51 +0200 > > Gerald Schaefer wrote: > > > > > > Yes, we have 2 pagetables in one 4K page, which could result in same >

Re: [PATCH 07/12] s390: add pte_free_defer(), with use of mmdrop_async()

2023-06-07 Thread Hugh Dickins
On Tue, 6 Jun 2023, Jason Gunthorpe wrote: > On Mon, Jun 05, 2023 at 10:11:52PM -0700, Hugh Dickins wrote: > > > "deposited" pagetable fragments, over in arch/s390/mm/pgtable.c: use > > the first two longs of the page table itself for threading the list. > > It

Re: [PATCH 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-06 Thread Hugh Dickins
On Tue, 6 Jun 2023, Jason Gunthorpe wrote: > On Tue, Jun 06, 2023 at 03:03:31PM -0400, Peter Xu wrote: > > On Tue, Jun 06, 2023 at 03:23:30PM -0300, Jason Gunthorpe wrote: > > > On Mon, Jun 05, 2023 at 08:40:01PM -0700, Hugh Dickins wrote: > > > > > > > diff

Re: [PATCH 00/12] mm: free retracted page table by RCU

2023-06-05 Thread Hugh Dickins
On Fri, 2 Jun 2023, Jann Horn wrote: > On Fri, Jun 2, 2023 at 6:37 AM Hugh Dickins wrote: > > > The most obvious vital thing (in the split ptlock case) is that it > > remains a struct page with a usable ptl spinlock embedded in it. > > > > The question becomes mo

Re: [PATCH 09/12] mm/khugepaged: retract_page_tables() without mmap or vma lock

2023-06-05 Thread Hugh Dickins
On Wed, 31 May 2023, Jann Horn wrote: > On Mon, May 29, 2023 at 8:25 AM Hugh Dickins wrote: > > +static void retract_page_tables(struct address_space *mapping, pgoff_t > > pgoff) ... > > +* Note that vma->anon_vma check is racy: it can be set > > af

Re: [PATCH 07/12] s390: add pte_free_defer(), with use of mmdrop_async()

2023-06-05 Thread Hugh Dickins
On Sun, 28 May 2023, Hugh Dickins wrote: > Add s390-specific pte_free_defer(), to call pte_free() via call_rcu(). > pte_free_defer() will be called inside khugepaged's retract_page_tables() > loop, where allocating extra memory cannot be relied upon. This precedes > the generic

Re: [PATCH 06/12] sparc: add pte_free_defer() for pgtables sharing page

2023-06-05 Thread Hugh Dickins
On Sun, 28 May 2023, Hugh Dickins wrote: > Add sparc-specific pte_free_defer(), to call pte_free() via call_rcu(). > pte_free_defer() will be called inside khugepaged's retract_page_tables() > loop, where allocating extra memory cannot be relied upon. This precedes > the generic

Re: [PATCH 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-05 Thread Hugh Dickins
On Fri, 2 Jun 2023, Jason Gunthorpe wrote: > On Mon, May 29, 2023 at 03:02:02PM +0100, Matthew Wilcox wrote: > > On Sun, May 28, 2023 at 11:20:21PM -0700, Hugh Dickins wrote: > > > +void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable) > > > +
