Re: [PATCH] mm/hugetlb: bring gigantic page allocation under hugepages_supported()

2025-01-22 Thread Gerald Schaefer
On Tue, 21 Jan 2025 20:34:19 +0530 Sourabh Jain wrote: > Despite having kernel arguments to enable gigantic hugepages, this > provides a way for the architecture to disable gigantic hugepages on the > fly, similar to what we do for hugepages. > > Components like fadump (PowerPC-specific) need th

Re: [PATCH 05/12] mm/memory: Add dax_insert_pfn

2024-10-01 Thread Gerald Schaefer
On Sun, 22 Sep 2024 03:41:57 +0200 Dan Williams wrote: > [ add s390 folks to comment on CONFIG_FS_DAX_LIMITED ] [...] > > @@ -2516,6 +2545,44 @@ static vm_fault_t __vm_insert_mixed(struct > > vm_area_struct *vma, > > return VM_FAULT_NOPAGE; > > } > > > > +vm_fault_t dax_insert_pfn(struc

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-07 Thread Gerald Schaefer
On Wed, 5 Jul 2023 17:52:40 -0700 (PDT) Hugh Dickins wrote: > On Wed, 5 Jul 2023, Alexander Gordeev wrote: > > On Sat, Jul 01, 2023 at 09:32:38PM -0700, Hugh Dickins wrote: > > > On Thu, 29 Jun 2023, Hugh Dickins wrote: > > > > Hi Hugh, > > > > ... > > > > > +#ifdef CONFIG_TRANSPARENT_HU

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-06 Thread Gerald Schaefer
On Wed, 5 Jul 2023 18:20:21 -0700 (PDT) Hugh Dickins wrote: > On Wed, 5 Jul 2023, Gerald Schaefer wrote: > > On Tue, 4 Jul 2023 10:03:57 -0700 (PDT) > > Hugh Dickins wrote: > > > On Tue, 4 Jul 2023, Gerald Schaefer wrote: > > > > On Sat, 1 Jul 2023 21:32

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-05 Thread Gerald Schaefer
On Tue, 4 Jul 2023 10:03:57 -0700 (PDT) Hugh Dickins wrote: > On Tue, 4 Jul 2023, Gerald Schaefer wrote: > > On Sat, 1 Jul 2023 21:32:38 -0700 (PDT) > > Hugh Dickins wrote: > > > On Thu, 29 Jun 2023, Hugh Dickins wrote: > > > > > > > > I

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-04 Thread Gerald Schaefer
On Sat, 1 Jul 2023 21:32:38 -0700 (PDT) Hugh Dickins wrote: > On Thu, 29 Jun 2023, Hugh Dickins wrote: > > > > I've grown to dislike the (ab)use of pt_frag_refcount even more, to the > > extent that I've not even tried to verify it; but I think I do get the > > point now, that we need further in

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-07-03 Thread Gerald Schaefer
On Thu, 29 Jun 2023 23:00:07 -0700 (PDT) Hugh Dickins wrote: > On Thu, 29 Jun 2023, Gerald Schaefer wrote: > > On Thu, 29 Jun 2023 12:22:24 -0300 > > Jason Gunthorpe wrote: > > > On Wed, Jun 28, 2023 at 10:08:08PM -0700, Hugh Dickins wrote: > > > > On Wed

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-06-29 Thread Gerald Schaefer
On Thu, 29 Jun 2023 12:22:24 -0300 Jason Gunthorpe wrote: > On Wed, Jun 28, 2023 at 10:08:08PM -0700, Hugh Dickins wrote: > > On Wed, 28 Jun 2023, Gerald Schaefer wrote: > > > > > > As discussed in the other thread, we would rather go with less complexity, >

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-06-29 Thread Gerald Schaefer
On Thu, 29 Jun 2023 15:59:07 +0200 Alexander Gordeev wrote: > On Wed, Jun 28, 2023 at 09:16:24PM +0200, Gerald Schaefer wrote: > > On Tue, 20 Jun 2023 00:51:19 -0700 (PDT) > > Hugh Dickins wrote: > > Hi Gerald, Hugh! > > ... > > @@ -407,6 +445,88 @@ voi

Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-06-28 Thread Gerald Schaefer
alf. Not adding back unallocated fragments to the list in pte_free_defer() can result in wasting some amount of memory for pagetables, depending on how long the allocated fragment will stay in use. In practice, this effect is expected to be insignificant, and not justify a far more complex approac

Re: [PATCH 07/12] s390: add pte_free_defer(), with use of mmdrop_async()

2023-06-08 Thread Gerald Schaefer
On Wed, 7 Jun 2023 20:35:05 -0700 (PDT) Hugh Dickins wrote: > On Tue, 6 Jun 2023, Gerald Schaefer wrote: > > On Mon, 5 Jun 2023 22:11:52 -0700 (PDT) > > Hugh Dickins wrote: > > > On Thu, 1 Jun 2023 15:57:51 +0200 > > > Gerald Schaefer wrote: > >

Re: [PATCH 07/12] s390: add pte_free_defer(), with use of mmdrop_async()

2023-06-06 Thread Gerald Schaefer
k & 0x03U) > - list_add_tail(&page->lru, &mm->context.pgtable_list); > - else > - list_del(&page->lru); > + if (mask & 0x03U) { > + listed = (struct list_head *)table; > + list_add_tail(listed, &mm->context.pgtable_list); > + } else { > + /* > + * Get address of the other page table sharing the page. > + * There are sure to be MUCH better ways to do all this! > + * But I'm rushing, and trying to keep to the obvious. > + */ > + listed = (struct list_head *)(table + PTRS_PER_PTE); > + if (virt_to_page(listed) != page) { > + /* sizeof(*listed) is twice sizeof(*table) */ > + listed -= PTRS_PER_PTE; > + } Same as above. > + list_del(listed); > + set_pte((pte_t *)&listed->next, __pte(_PAGE_INVALID)); > + set_pte((pte_t *)&listed->prev, __pte(_PAGE_INVALID)); > + } > spin_unlock_bh(&mm->context.lock); > table = (unsigned long *) ((unsigned long) table | (0x01U << bit)); > tlb_remove_table(tlb, table); Reviewed-by: Gerald Schaefer

Re: [PATCH 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-01 Thread Gerald Schaefer
On Mon, 29 May 2023 07:36:40 -0700 (PDT) Hugh Dickins wrote: > On Mon, 29 May 2023, Matthew Wilcox wrote: > > On Sun, May 28, 2023 at 11:20:21PM -0700, Hugh Dickins wrote: > > > +void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable) > > > +{ > > > + struct page *page; > > > + > > > + pag

Re: [PATCH v3 03/34] s390: Use pt_frag_refcount for pagetables

2023-06-01 Thread Gerald Schaefer
On Wed, 31 May 2023 14:30:01 -0700 "Vishal Moola (Oracle)" wrote: > s390 currently uses _refcount to identify fragmented page tables. > The page table struct already has a member pt_frag_refcount used by > powerpc, so have s390 use that instead of the _refcount field as well. > This improves the

Re: [PATCH] mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()

2023-03-06 Thread Gerald Schaefer
On Mon, 6 Mar 2023 17:06:44 + Catalin Marinas wrote: > On Mon, Mar 06, 2023 at 05:15:48PM +0100, Gerald Schaefer wrote: > > diff --git a/arch/arm64/include/asm/pgtable.h > > b/arch/arm64/include/asm/pgtable.h > > index b6ba466e2e8a..0bd18de9fd97 100644 > > -

[PATCH] mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()

2023-03-06 Thread Gerald Schaefer
private flush_tlb_fix_spurious_fault() implementations need to be made aware of the new parameter. Reviewed-by: Alexander Gordeev Signed-off-by: Gerald Schaefer --- arch/arm64/include/asm/pgtable.h | 2 +- arch/mips/include/asm/pgtable.h | 3 ++- arch/powerpc/include/asm

Re: [f2fs-dev] [PATCH 06/10] hugetlbfs: Convert remove_inode_hugepages() to use filemap_get_folios()

2022-06-10 Thread Gerald Schaefer
On Fri, 10 Jun 2022 17:52:05 +0200 Sumanth Korikkar wrote: [...] > > * Bisected the crash to this commit. > > To reproduce: > * clone libhugetlbfs: > * Execute, PATH=$PATH:"obj64/" LD_LIBRARY_PATH=../obj64/ > alloc-instantiate-race shared > > Crashes on both s390 and x86. FWIW, not really

Re: [PATCH 3/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when unmapping

2022-05-03 Thread Gerald Schaefer
On Tue, 3 May 2022 10:19:46 +0800 Baolin Wang wrote: > > > On 5/2/2022 10:02 PM, Gerald Schaefer wrote: > > On Sat, 30 Apr 2022 11:22:33 +0800 > > Baolin Wang wrote: > > > >> > >> > >> On 4/30/2022 4:02 AM, Gerald Schaefer wrote: >

Re: [PATCH 3/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when unmapping

2022-05-02 Thread Gerald Schaefer
On Sat, 30 Apr 2022 11:22:33 +0800 Baolin Wang wrote: > > > On 4/30/2022 4:02 AM, Gerald Schaefer wrote: > > On Fri, 29 Apr 2022 16:14:43 +0800 > > Baolin Wang wrote: > > > >> On some architectures (like ARM64), it can support CONT-PTE/PMD size > >

Re: [PATCH 3/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when unmapping

2022-04-29 Thread Gerald Schaefer
On Fri, 29 Apr 2022 16:14:43 +0800 Baolin Wang wrote: > On some architectures (like ARM64), it can support CONT-PTE/PMD size > hugetlb, which means it can support not only PMD/PUD size hugetlb: > 2M and 1G, but also CONT-PTE/PMD size: 64K and 32M if a 4K page > size specified. > > When unmapping

Re: [PATCH v2 6/8] s390/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-03-30 Thread Gerald Schaefer
5|5|55566|66| > * |0123456789012345678901234567890123456789012345678901|23456|78901|23| > * > * Bits 0-51 store the offset. > + * Bit 52 (E) is used to remember PG_anon_exclusive. > * Bits 57-61 store the type. > * Bit 62 (S) is used for softdirty tracking. > - * Bits 52, 55 and 56 (X) are unused. > + * Bits 55 and 56 (X) are unused. > */ > > #define __SWP_OFFSET_MASK((1UL << 52) - 1) Thanks David! Reviewed-by: Gerald Schaefer
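The bit assignments quoted above can be tried out in a small userspace sketch. It assumes s390's convention of numbering bits from the most significant end (so "bit 52" is `1 << 11` in C terms) and takes only the 52-bit offset mask from the quoted patch; every `SKETCH_*` and `sketch_*` name is hypothetical, and the bits the quoted comment does not describe are simply left at zero here. This is an illustration of the layout, not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace illustration only; all names are hypothetical.
 * s390 counts bits from the most significant end, so
 * "bit n" of a 64-bit entry is (1 << (63 - n)) in C.
 */
#define SKETCH_BIT(n)         (1ULL << (63 - (n)))

#define SKETCH_OFFSET_MASK    ((1ULL << 52) - 1) /* bits 0-51: 52 offset bits */
#define SKETCH_OFFSET_SHIFT   12                 /* 12 lower bits sit below the offset */
#define SKETCH_TYPE_MASK      ((1ULL << 5) - 1)  /* bits 57-61: five type bits */
#define SKETCH_TYPE_SHIFT     2                  /* bits 62 and 63 sit below the type */
#define SKETCH_EXCLUSIVE      SKETCH_BIT(52)     /* E: remembers PG_anon_exclusive */
#define SKETCH_SOFT_DIRTY     SKETCH_BIT(62)     /* S: softdirty tracking */

/* Pack a swap type and offset; all other bits stay zero in this sketch. */
static uint64_t sketch_mk_swap_pte(uint64_t type, uint64_t offset)
{
	assert(type <= SKETCH_TYPE_MASK);
	assert(offset <= SKETCH_OFFSET_MASK);
	return (offset << SKETCH_OFFSET_SHIFT) | (type << SKETCH_TYPE_SHIFT);
}

/* Recover the offset stored in bits 0-51. */
static uint64_t sketch_swp_offset(uint64_t pte)
{
	return (pte >> SKETCH_OFFSET_SHIFT) & SKETCH_OFFSET_MASK;
}

/* Recover the type stored in bits 57-61. */
static uint64_t sketch_swp_type(uint64_t pte)
{
	return (pte >> SKETCH_TYPE_SHIFT) & SKETCH_TYPE_MASK;
}
```

Because bit 52 lies outside both the offset field (bits 0-51) and the type field (bits 57-61), setting `SKETCH_EXCLUSIVE` on a packed value leaves both decoded fields unchanged in this model.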

Re: [PATCH v2 5/8] s390/pgtable: cleanup description of swp pte layout

2022-03-30 Thread Gerald Schaefer
34455|5|55566|66| > * |0123456789012345678901234567890123456789012345678901|23456|78901|23| > + * > + * Bits 0-51 store the offset. > + * Bits 57-61 store the type. > + * Bit 62 (S) is used for softdirty tracking. > + * Bits 52, 55 and 56 (X) are unused. > */ > > #define __SWP_OFFSET_MASK((1UL << 52) - 1) Thanks David! Reviewed-by: Gerald Schaefer

Re: [PATCH v1 5/7] s390/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-03-16 Thread Gerald Schaefer
On Wed, 16 Mar 2022 14:01:07 +0100 Christian Borntraeger wrote: > > > Am 16.03.22 um 11:56 schrieb Gerald Schaefer: > > On Tue, 15 Mar 2022 18:12:16 +0100 > > David Hildenbrand wrote: > > > >> On 15.03.22 17:58, David Hildenbrand wrote: > >>>

Re: [PATCH v1 5/7] s390/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-03-16 Thread Gerald Schaefer
On Tue, 15 Mar 2022 18:12:16 +0100 David Hildenbrand wrote: > On 15.03.22 17:58, David Hildenbrand wrote: > > > >>> This would mean that it is not OK to have bit 52 not zero for swap PTEs. > >>> But if I read the POP correctly, all bits except for the DAT-protection > >>> would be ignored for in

Re: [PATCH v1 5/7] s390/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-03-15 Thread Gerald Schaefer
On Tue, 15 Mar 2022 15:18:35 +0100 David Hildenbrand wrote: > Let's steal one bit from the offset. While at it, document the meaning > of bit 62 for swap ptes. You define _PAGE_SWP_EXCLUSIVE as _PAGE_LARGE, which is bit 52, and this is not part of the swap pte offset IIUC. So stealing any bit mi

Re: DPAA2 triggers, [PATCH] dma debug: report -EEXIST errors in add_dma_entry

2021-10-07 Thread Gerald Schaefer
On Thu, 7 Oct 2021 12:59:32 +0200 Karsten Graul wrote: [...] > > > >>> BTW, there is already a WARN in the add_dma_entry() path, related > >>> to cacheline overlap and -EEXIST: > >>> > >>> add_dma_entry() -> active_cacheline_insert() -> -EEXIST -> > >>> active_cacheline_inc_overlap() > >>> > >>>

[PATCH] dma-debug: fix sg checks in debug_dma_map_sg()

2021-10-06 Thread Gerald Schaefer
dividual sg elements. Link: https://lore.kernel.org/lkml/20210705185252.4074653-1-gerald.schae...@linux.ibm.com Fixes: 884d05970bfb ("dma-debug: use sg_dma_len accessor") Signed-off-by: Gerald Schaefer --- kernel/dma/debug.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-)

Re: DPAA2 triggers, [PATCH] dma debug: report -EEXIST errors in add_dma_entry

2021-10-06 Thread Gerald Schaefer
On Wed, 6 Oct 2021 15:23:36 +0100 Robin Murphy wrote: > On 2021-10-06 14:10, Gerald Schaefer wrote: > > On Fri, 1 Oct 2021 14:52:56 +0200 > > Gerald Schaefer wrote: > > > >> On Thu, 30 Sep 2021 15:37:33 +0200 > >> Karsten Graul wrote: > >>

Re: DPAA2 triggers, [PATCH] dma debug: report -EEXIST errors in add_dma_entry

2021-10-06 Thread Gerald Schaefer
On Wed, 6 Oct 2021 15:10:43 +0200 Gerald Schaefer wrote: > On Fri, 1 Oct 2021 14:52:56 +0200 > Gerald Schaefer wrote: > > > On Thu, 30 Sep 2021 15:37:33 +0200 > > Karsten Graul wrote: > > > > > On 14/09/2021 17:45, Ioana Ciornei wrote: > > > >

Re: DPAA2 triggers, [PATCH] dma debug: report -EEXIST errors in add_dma_entry

2021-10-06 Thread Gerald Schaefer
On Fri, 1 Oct 2021 14:52:56 +0200 Gerald Schaefer wrote: > On Thu, 30 Sep 2021 15:37:33 +0200 > Karsten Graul wrote: > > > On 14/09/2021 17:45, Ioana Ciornei wrote: > > > On Wed, Sep 08, 2021 at 10:33:26PM -0500, Jeremy Linton wrote: > > >>

Re: DPAA2 triggers, [PATCH] dma debug: report -EEXIST errors in add_dma_entry

2021-10-01 Thread Gerald Schaefer
On Thu, 30 Sep 2021 15:37:33 +0200 Karsten Graul wrote: > On 14/09/2021 17:45, Ioana Ciornei wrote: > > On Wed, Sep 08, 2021 at 10:33:26PM -0500, Jeremy Linton wrote: > >> +DPAA2, netdev maintainers > >> Hi, > >> > >> On 5/18/21 7:54 AM, Hamza Mahfooz wrote: > >>> Since, overlapping mappings are

Re: can we finally kill off CONFIG_FS_DAX_LIMITED

2021-08-24 Thread Gerald Schaefer
On Tue, 24 Aug 2021 07:53:22 -0700 Dan Williams wrote: > On Tue, Aug 24, 2021 at 7:10 AM Joao Martins > wrote: > > > > > > > > On 8/23/21 9:21 PM, Dan Williams wrote: > > > On Mon, Aug 23, 2021 at 12:47 PM Gerald Schaefer > > > wrote:

Re: can we finally kill off CONFIG_FS_DAX_LIMITED

2021-08-23 Thread Gerald Schaefer
On Mon, 23 Aug 2021 16:05:46 +0200 Gerald Schaefer wrote: > On Fri, 20 Aug 2021 07:43:40 +0200 > Christoph Hellwig wrote: > > > Hi all, > > > > looking at the recent ZONE_DEVICE related changes we still have a > > horrible maze of different code paths. I a

Re: can we finally kill off CONFIG_FS_DAX_LIMITED

2021-08-23 Thread Gerald Schaefer
On Fri, 20 Aug 2021 07:43:40 +0200 Christoph Hellwig wrote: > Hi all, > > looking at the recent ZONE_DEVICE related changes we still have a > horrible maze of different code paths. I already suggested to > depend on ARCH_HAS_PTE_SPECIAL for ZONE_DEVICE there, which all modern > architectures ha

Re: can we finally kill off CONFIG_FS_DAX_LIMITED

2021-08-20 Thread Gerald Schaefer
On Fri, 20 Aug 2021 10:42:14 -0700 Dan Williams wrote: > [ fix Gerald's email ] > > On Fri, Aug 20, 2021 at 8:41 AM Dan Williams wrote: > > > > [ add Gerald and Joao ] > > > > On Thu, Aug 19, 2021 at 10:44 PM Christoph Hellwig wrote: > > > > > > Hi all, > > > > > > looking at the recent ZONE_D

Re: [RFC PATCH 1/1] dma-debug: fix check_for_illegal_area() in debug_dma_map_sg()

2021-07-06 Thread Gerald Schaefer
On Tue, 6 Jul 2021 10:22:40 +0100 Robin Murphy wrote: > On 2021-07-05 19:52, Gerald Schaefer wrote: > > The following warning occurred sporadically on s390: > > DMA-API: nvme 0006:00:00.0: device driver maps memory from kernel text or > > rodata [addr=48cc5e2f] [le

[RFC PATCH 1/1] dma-debug: fix check_for_illegal_area() in debug_dma_map_sg()

2021-07-05 Thread Gerald Schaefer
en(s). Also put the call to check_for_illegal_area() in a separate loop, iterating over all the individual sg elements ("nents" instead of "mapped_ents"). Fixes: 884d05970bfb ("dma-debug: use sg_dma_len accessor") Tested-by: Niklas Schnelle Signed-off-by:

[RFC PATCH 0/1] dma-debug: fix check_for_illegal_area() in debug_dma_map_sg()

2021-07-05 Thread Gerald Schaefer
t iterates over mapped_ents instead of nents. So it would not check all physical sg elements if any were combined in DMA address space. Gerald Schaefer (1): dma-debug: fix check_for_illegal_area() in debug_dma_map_sg() kernel/dma/debug.c | 10 ++ 1 file changed, 6 insertions(+), 4
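The point about loop bounds can be illustrated with a toy userspace model. It is only a sketch under stated assumptions: `struct sketch_sg` and the `sketch_*` helpers are hypothetical stand-ins and not the kernel's scatterlist API, and the "forbidden" range stands in for whatever region a per-element check would reject. The model assumes that mapping may merge physical elements, so the mapped count can be smaller than the physical count.

```c
#include <stddef.h>

/* Hypothetical stand-in for one physical scatterlist element. */
struct sketch_sg {
	unsigned long phys;	/* physical start address */
	unsigned long len;	/* length in bytes */
};

/* Return 1 if [phys, phys + len) overlaps the forbidden range [lo, hi). */
static int sketch_overlaps(const struct sketch_sg *sg,
			   unsigned long lo, unsigned long hi)
{
	return sg->phys < hi && sg->phys + sg->len > lo;
}

/* Count offending elements among the first `count` physical entries. */
static size_t sketch_count_illegal(const struct sketch_sg *sgl, size_t count,
				   unsigned long lo, unsigned long hi)
{
	size_t i, hits = 0;

	for (i = 0; i < count; i++)
		hits += sketch_overlaps(&sgl[i], lo, hi);
	return hits;
}
```

With three physical elements of which two were merged into one mapped entry, calling `sketch_count_illegal()` with the mapped count (2) never looks at the third element, while calling it with the physical count (3) does.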

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-06 Thread Gerald Schaefer
On Thu, 1 Apr 2021 13:10:49 -0700 Yang Shi wrote: [...] > > > > > > Yes, it could be. The old behavior of migration was to return -ENOMEM > > > if THP migration is not supported then split THP. That behavior was > > > not very friendly to some usecases, for example, memory policy and > > > migrat

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-03-31 Thread Gerald Schaefer
On Tue, 30 Mar 2021 09:51:46 -0700 Yang Shi wrote: > On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer > wrote: > > > > On Mon, 29 Mar 2021 11:33:06 -0700 > > Yang Shi wrote: > > > > > > > > When the THP NUMA fault support was added THP migrati

Re: [PATCH 5/6] mm: migrate: don't split THP for misplaced NUMA page

2021-03-30 Thread Gerald Schaefer
On Mon, 29 Mar 2021 11:33:11 -0700 Yang Shi wrote: > The old behavior didn't split THP if migration is failed due to lack of > memory on the target node. But the THP migration does split THP, so keep > the old behavior for misplaced NUMA page migration. > > Signed-off-by: Yang Shi > --- > mm/

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-03-30 Thread Gerald Schaefer
On Mon, 29 Mar 2021 11:33:06 -0700 Yang Shi wrote: > > When the THP NUMA fault support was added THP migration was not supported yet. > So the ad hoc THP migration was implemented in NUMA fault handling. Since > v4.14 > THP migration has been supported so it doesn't make too much sense to stil

Re: Freeing page tables through RCU

2021-02-26 Thread Gerald Schaefer
On Thu, 25 Feb 2021 20:58:20 + Matthew Wilcox wrote: > In order to walk the page tables without the mmap semaphore, it must > be possible to prevent them from being freed and reused (eg if munmap() > races with viewing /proc/$pid/smaps). > > There is various commentary within the mm on how t

Re: [RFC PATCH 1/5] hugetlb: add hugetlb helpers for soft dirty support

2021-02-24 Thread Gerald Schaefer
On Wed, 24 Feb 2021 17:46:08 +0100 Gerald Schaefer wrote: [...] > Then we fundamentally changed the way how we deal with that "hugetlb code > is treating pmds as ptes" issue. Instead of caring about that in all > huge_pte_xxx primitives, huge_ptep_get() will now return a nic

Re: [RFC PATCH 1/5] hugetlb: add hugetlb helpers for soft dirty support

2021-02-24 Thread Gerald Schaefer
On Wed, 17 Feb 2021 11:24:15 -0500 Peter Xu wrote: > On Wed, Feb 10, 2021 at 04:03:18PM -0800, Mike Kravetz wrote: > > Add interfaces to set and clear soft dirty in hugetlb ptes. Make > > hugetlb interfaces needed for /proc clear_refs available outside > > hugetlb.c. > > > > arch/s390 has its

Re: [RFC] linux-next panic in hugepage_subpool_put_pages()

2021-02-23 Thread Gerald Schaefer
On Tue, 23 Feb 2021 15:57:40 +0100 Gerald Schaefer wrote: [...] > What I do not understand is how __free_huge_page() would be called at all > in the call trace below (set_max_huge_pages -> alloc_pool_huge_page -> > __free_huge_page -> hugepage_subpool_put_pages). From the co

[RFC] linux-next panic in hugepage_subpool_put_pages()

2021-02-23 Thread Gerald Schaefer
Hi, LTP triggered a panic on s390 in hugepage_subpool_put_pages() with linux-next 5.12.0-20210222, see below. It crashes on the spin_lock(&spool->lock) at the beginning, because the passed-in *spool points to 004e, which is not addressable memory. It rather looks like some flags and n

Re: [PATCH V2 0/2] mm/debug_vm_pgtable: Some minor updates

2020-12-09 Thread Gerald Schaefer
ux-mm/20201123142237.GF17833@gaia/ > > > > This series is based on v5.10-rc6 and has been tested on arm64 and x86 but > > has only been build tested on riscv, s390, arc etc. It would be great if > > folks could test this on these platforms as well. Thank you. > > > >

[RFC PATCH 1/1] mm/hugetlb: clear compound_nr before freeing gigantic pages

2020-12-08 Thread Gerald Schaefer
rder is cleared in destroy_compound_gigantic_page(), and compound_nr is set to 1U << order == 1 for order 0 in set_compound_order(page, 0). Fix this by explicitly clearing compound_nr for first tail page after calling set_compound_order(page, 0). Cc: # 5.9+ Fixes: 1378a5ee451a ("mm:

[RFC PATCH 0/1] "Bad page state" while freeing gigantic pages

2020-12-08 Thread Gerald Schaefer
ic pages using free_contig_range(). So a "page[1].mapping = NULL" might also be an option, instead of the "page[1].compound_nr = 0" in my patch, but that looks even more ugly, since it would clear more than needed. Gerald Schaefer (1): mm/hugetlb: clear compound_nr before freeing gigantic pages mm/hugetlb.c | 1 + 1 file changed, 1 insertion(+) -- 2.17.1

[PATCH] mm/userfaultfd: do not access vma->vm_mm after calling handle_userfault()

2020-11-10 Thread Gerald Schaefer
00962d6980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ 962d6a00: fb fb fc fc fc fc fc fc fc fc 00 00 00 00 00 00 962d6a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 == Fixes: 6b251fc96cf2

Re: [PATCH 08/13] s390/pci: Remove races against pte updates

2020-10-09 Thread Gerald Schaefer
c: linux...@kvack.org > Cc: linux-arm-ker...@lists.infradead.org > Cc: linux-samsung-...@vger.kernel.org > Cc: linux-me...@vger.kernel.org > Cc: Niklas Schnelle > Cc: Gerald Schaefer > Cc: linux-s...@vger.kernel.org > --- > arch/s390/pci/pci_mmio.c | 98 +++

Re: [PATCH 08/13] s390/pci: Remove races against pte updates

2020-10-08 Thread Gerald Schaefer
c: linux...@kvack.org > Cc: linux-arm-ker...@lists.infradead.org > Cc: linux-samsung-...@vger.kernel.org > Cc: linux-me...@vger.kernel.org > Cc: Niklas Schnelle > Cc: Gerald Schaefer > Cc: linux-s...@vger.kernel.org > --- > arch/s390/pci/pci_mmio.c | 98 +++

Re: BUG: Bad page state in process dirtyc0w_child

2020-09-24 Thread Gerald Schaefer
On Thu, 24 Sep 2020 00:02:26 +0200 Gerald Schaefer wrote: > On Wed, 23 Sep 2020 14:50:36 -0700 > Linus Torvalds wrote: > > > On Wed, Sep 23, 2020 at 2:33 PM Gerald Schaefer > > wrote: > > > > > > Thanks, very nice walk-through, need some time

Re: BUG: Bad page state in process dirtyc0w_child

2020-09-23 Thread Gerald Schaefer
On Wed, 23 Sep 2020 14:50:36 -0700 Linus Torvalds wrote: > On Wed, Sep 23, 2020 at 2:33 PM Gerald Schaefer > wrote: > > > > Thanks, very nice walk-through, need some time to digest this. The TLB > > aspect is interesting, and we do have our own __tlb_remove_page_size(),

Re: BUG: Bad page state in process dirtyc0w_child

2020-09-23 Thread Gerald Schaefer
On Wed, 23 Sep 2020 13:00:45 -0700 Linus Torvalds wrote: [...] > > Ooh. One thing that is *very* different about s390 is that it frees > the page directly, and doesn't batch things up to happen after the TLB > flush. > > Maybe THAT is the difference? Not that I can tell why it should > matter,

Re: BUG: Bad page state in process dirtyc0w_child

2020-09-23 Thread Gerald Schaefer
On Tue, 22 Sep 2020 19:03:50 +0200 Gerald Schaefer wrote: > On Wed, 16 Sep 2020 16:28:06 +0200 > Heiko Carstens wrote: > > > On Sat, Sep 12, 2020 at 09:54:12PM -0400, Qian Cai wrote: > > > Occasionally, running this LTP test will trigger an error below on > >

Re: BUG: Bad page state in process dirtyc0w_child

2020-09-22 Thread Gerald Schaefer
On Wed, 16 Sep 2020 16:28:06 +0200 Heiko Carstens wrote: > On Sat, Sep 12, 2020 at 09:54:12PM -0400, Qian Cai wrote: > > Occasionally, running this LTP test will trigger an error below on > > s390: > > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/security/dirtyc0w/dirtyc

Re: Ways to deprecate /sys/devices/system/memory/memoryX/phys_device ?

2020-09-22 Thread Gerald Schaefer
On Thu, 10 Sep 2020 12:20:34 +0200 David Hildenbrand wrote: > Hi everybody, > > I was just exploring how /sys/devices/system/memory/memoryX/phys_device > is/was used. It's one of these interfaces that most probably never > should have been added but now we are stuck with it. > > "phys_device" w

Re: [RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-10 Thread Gerald Schaefer
On Thu, 10 Sep 2020 11:33:17 -0700 Linus Torvalds wrote: > On Thu, Sep 10, 2020 at 11:13 AM Jason Gunthorpe wrote: > > > > So.. To change away from the stack option I think we'd have to pass > > the READ_ONCE value to pXX_offset() as an extra argument instead of it > > derefing the pointer inter

Re: [RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-10 Thread Gerald Schaefer
On Thu, 10 Sep 2020 10:02:33 -0300 Jason Gunthorpe wrote: > On Thu, Sep 10, 2020 at 11:39:25AM +0200, Alexander Gordeev wrote: > > > As Gerald mentioned, it is very difficult to explain in a clear way. > > Hopefully, one could make sense out of it. > > I would say the page table API requires thi

Re: [RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-10 Thread Gerald Schaefer
On Thu, 10 Sep 2020 12:10:26 -0300 Jason Gunthorpe wrote: > On Thu, Sep 10, 2020 at 03:28:03PM +0200, Gerald Schaefer wrote: > > On Thu, 10 Sep 2020 10:02:33 -0300 > > Jason Gunthorpe wrote: > > > > > On Thu, Sep 10, 2020 at 11:39:25AM +0200, Alexander Gord

Re: [RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-10 Thread Gerald Schaefer
On Thu, 10 Sep 2020 10:02:33 -0300 Jason Gunthorpe wrote: > On Thu, Sep 10, 2020 at 11:39:25AM +0200, Alexander Gordeev wrote: > > > As Gerald mentioned, it is very difficult to explain in a clear way. > > Hopefully, one could make sense out of it. > > I would say the page table API requires t

Re: [RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-10 Thread Gerald Schaefer
On Wed, 9 Sep 2020 15:03:24 -0300 Jason Gunthorpe wrote: > On Wed, Sep 09, 2020 at 07:25:34PM +0200, Gerald Schaefer wrote: > > I actually had to draw myself a picture to get some hold of > > this, or rather a walk-through with a certain pud-crossing > > range in a folded

Re: [RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-09 Thread Gerald Schaefer
On Wed, 9 Sep 2020 09:18:46 -0700 Dave Hansen wrote: > On 9/9/20 5:29 AM, Gerald Schaefer wrote: > > This only works well as long there are real pagetable pointers involved, > > that can also be used for iteration. For gup_fast, or any other future > > pagetable walkers usin

Re: [RFC PATCH v2 0/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-09 Thread Gerald Schaefer
On Tue, 8 Sep 2020 19:36:50 +0200 Gerald Schaefer wrote: [..] > > It seems now that the generalization is very well accepted so far, > apart from some apparent issues on arm. Also, merging 2 + 3 and > putting them first seems to be acceptable, so we could do that for > v3,

Re: [RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-09 Thread Gerald Schaefer
On Tue, 8 Sep 2020 07:30:50 -0700 Dave Hansen wrote: > On 9/7/20 11:00 AM, Gerald Schaefer wrote: > > Commit 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast > > code") introduced a subtle but severe bug on s390 with gup_fast, due to > > dynamic

Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes

2020-09-09 Thread Gerald Schaefer
On Wed, 9 Sep 2020 13:38:25 +0530 Anshuman Khandual wrote: > > > On 09/04/2020 08:56 PM, Gerald Schaefer wrote: > > On Fri, 4 Sep 2020 12:18:05 +0530 > > Anshuman Khandual wrote: > > > >> > >> > >> On 09/02/2020 05:12 PM, Aneesh Kumar

Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes

2020-09-09 Thread Gerald Schaefer
On Wed, 09 Sep 2020 11:38:39 +0530 "Aneesh Kumar K.V" wrote: > Gerald Schaefer writes: > > > On Fri, 4 Sep 2020 18:01:15 +0200 > > Gerald Schaefer wrote: > > > > [...] > >> > >> BTW2, a quick test with this change (so far) made the

Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes

2020-09-09 Thread Gerald Schaefer
On Wed, 9 Sep 2020 13:45:48 +0530 Anshuman Khandual wrote: [...] > > > > That would more match the "pte_t pointer" usage for hugetlb code, > > i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned, > > but I think the root cause is the pte_t pointer. > > Ideally, the pte_t pointer u

Re: [RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-08 Thread Gerald Schaefer
On Tue, 8 Sep 2020 14:40:10 +0200 Christophe Leroy wrote: > > > Le 08/09/2020 à 14:09, Christian Borntraeger a écrit : > > > > > > On 08.09.20 07:06, Christophe Leroy wrote: > >> > >> > >> Le 07/09/2020 à 20:00, Gerald Schaefer a écr

Re: [RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-08 Thread Gerald Schaefer
On Tue, 8 Sep 2020 07:30:50 -0700 Dave Hansen wrote: > On 9/7/20 11:00 AM, Gerald Schaefer wrote: > > Commit 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast > > code") introduced a subtle but severe bug on s390 with gup_fast, due to > > dynamic

Re: [RFC PATCH v2 0/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-08 Thread Gerald Schaefer
On Tue, 8 Sep 2020 07:22:39 +0200 Christophe Leroy wrote: > > > Le 07/09/2020 à 22:12, Mike Rapoport a écrit : > > On Mon, Sep 07, 2020 at 08:00:55PM +0200, Gerald Schaefer wrote: > >> This is v2 of an RFC previously discussed here: > >> https://lore.kern

Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes

2020-09-08 Thread Gerald Schaefer
On Fri, 4 Sep 2020 18:01:15 +0200 Gerald Schaefer wrote: [...] > > BTW2, a quick test with this change (so far) made the issues on s390 > go away: > > @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void) > spin_unlock(ptl); > > #ifn

[RFC PATCH v2 0/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-07 Thread Gerald Schaefer
This is v2 of an RFC previously discussed here: https://lore.kernel.org/lkml/20200828140314.8556-1-gerald.schae...@linux.ibm.com/ Patch 1 is a fix for a regression in gup_fast on s390, after our conversion to common gup_fast code. It will introduce special helper functions pXd_addr_end_folded(), w

[RFC PATCH v2 2/3] mm: make pXd_addr_end() functions page-table entry aware

2020-09-07 Thread Gerald Schaefer
Signed-off-by: Gerald Schaefer --- arch/arm/include/asm/pgtable-2level.h| 2 +- arch/arm/mm/idmap.c | 6 ++-- arch/arm/mm/mmu.c| 8 ++--- arch/arm64/kernel/hibernate.c| 16 ++ arch/arm64/kvm/mmu.c | 16

[RFC PATCH v2 1/3] mm/gup: fix gup_fast with dynamic page table folding

2020-09-07 Thread Gerald Schaefer
change for other architectures introduced. Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code") Cc: # 5.2+ Reviewed-by: Gerald Schaefer Signed-off-by: Alexander Gordeev Signed-off-by: Gerald Schaefer --- arch/s390/include/asm/pgtable.h | 4

[RFC PATCH v2 3/3] mm: make generic pXd_addr_end() macros inline functions

2020-09-07 Thread Gerald Schaefer
subtle bugs. Signed-off-by: Alexander Gordeev Signed-off-by: Gerald Schaefer --- include/linux/pgtable.h | 36 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 67ebc22cf83d..d9e7d16c2263
