Re: [PATCH v4 08/14] mm/gup: grab head page refcount once for group of subpages

2021-08-27 Thread Joao Martins
On 8/27/21 5:25 PM, Jason Gunthorpe wrote: > On Fri, Aug 27, 2021 at 03:58:13PM +0100, Joao Martins wrote: > >> #if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && >> defined(CONFIG_TRANSPARENT_HUGEPAGE) >> static int __gup_device_huge(unsigned long pfn, unsigned long addr, >>

Re: [PATCH v4 08/14] mm/gup: grab head page refcount once for group of subpages

2021-08-27 Thread Jason Gunthorpe
On Fri, Aug 27, 2021 at 03:58:13PM +0100, Joao Martins wrote: > #if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && > defined(CONFIG_TRANSPARENT_HUGEPAGE) > static int __gup_device_huge(unsigned long pfn, unsigned long addr, >unsigned long end, unsigned int flags, >

Re: [PATCH v4 04/14] mm/memremap: add ZONE_DEVICE support for compound pages

2021-08-27 Thread Joao Martins
On 8/27/21 4:33 PM, Christoph Hellwig wrote: > On Fri, Aug 27, 2021 at 03:58:09PM +0100, Joao Martins wrote: >> + * @geometry: structural definition of how the vmemmap metadata is >> populated. >> + * A zero or 1 defaults to using base pages as the memmap metadata >> + * representation. A bigger

Re: [PATCH v4 04/14] mm/memremap: add ZONE_DEVICE support for compound pages

2021-08-27 Thread Christoph Hellwig
On Fri, Aug 27, 2021 at 03:58:09PM +0100, Joao Martins wrote: > + * @geometry: structural definition of how the vmemmap metadata is populated. > + * A zero or 1 defaults to using base pages as the memmap metadata > + * representation. A bigger value will set up compound struct pages > + * rep

[PATCH v4 06/14] device-dax: ensure dev_dax->pgmap is valid for dynamic devices

2021-08-27 Thread Joao Martins
Right now, only static dax regions have a valid @pgmap pointer in their struct dev_dax; dynamic dax devices do not. In preparation for device-dax compound devmap support, make sure that the dev_dax pgmap field is set after it has been allocated and initialized. dynamic dax device have the @pgmap

[PATCH v4 14/14] mm/sparse-vmemmap: improve memory savings for compound pud geometry

2021-08-27 Thread Joao Martins
Currently, for compound PUD mappings, the implementation consumes 40MB per TB but it can be optimized to 16MB per TB with the approach detailed below. Right now basepages are used to populate the PUD tail pages, and it picks the address of the previous page of the subsection that precedes the memm
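
The 40MB and 16MB per TB figures can be reproduced with a little arithmetic. The sketch below is only one way to arrive at them, and it rests on assumptions the excerpt above does not spell out: 4 KiB base pages, 1 GiB compound pages, 2 unique vmemmap data pages per compound page, and 8 page-table pages per compound page before the optimization versus 2 after it. The helper names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SIZE 4096ULL   /* assumption: 4 KiB base pages */
#define GIB (1ULL << 30)
#define TIB (1ULL << 40)

/*
 * Bytes of memory spent backing the vmemmap for one TiB of device memory
 * mapped with 1 GiB compound pages. Both arguments are assumed per-compound-
 * page counts (hypothetical inputs, not values read from the kernel).
 */
static uint64_t vmemmap_bytes_per_tib(uint64_t data_pages, uint64_t pgtable_pages)
{
    uint64_t compound_pages_per_tib = TIB / GIB;   /* 1024 compound pages */

    assert(data_pages > 0);
    return compound_pages_per_tib * (data_pages + pgtable_pages) * BASE_PAGE_SIZE;
}

/* Convert a byte count to whole MiB. */
static uint64_t to_mib(uint64_t bytes)
{
    return bytes >> 20;
}
```

Under these assumptions, `to_mib(vmemmap_bytes_per_tib(2, 8))` gives 40 and `to_mib(vmemmap_bytes_per_tib(2, 2))` gives 16, matching the two figures quoted in the snippet.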

[PATCH v4 13/14] mm/page_alloc: reuse tail struct pages for compound devmaps

2021-08-27 Thread Joao Martins
Currently memmap_init_zone_device() ends up initializing 32768 pages when it only needs to initialize 128 given tail page reuse. That number is worse with 1GB compound page geometries, 262144 instead of 128. Update memmap_init_zone_device() to skip redundant initialization, detailed below. When a
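
The page counts quoted above are consistent with simple size ratios. The sketch below assumes 4 KiB base pages and a 64-byte struct page (neither is stated in the excerpt), and it reads 128 as the number of struct pages held in two vmemmap pages, which is an interpretation rather than something the snippet confirms. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SIZE   4096ULL  /* assumption: 4 KiB base pages */
#define STRUCT_PAGE_SIZE 64ULL    /* assumption: 64-byte struct page */

/* Number of struct pages needed to describe a physical range of `bytes`. */
static uint64_t struct_pages_covering(uint64_t bytes)
{
    assert(bytes % BASE_PAGE_SIZE == 0);
    return bytes / BASE_PAGE_SIZE;
}

/* Number of struct pages stored in `n` base pages worth of vmemmap. */
static uint64_t struct_pages_in_vmemmap_pages(uint64_t n)
{
    return n * (BASE_PAGE_SIZE / STRUCT_PAGE_SIZE);
}
```

With these assumptions, a 1 GiB range needs 262144 struct pages, a 128 MiB range needs 32768, and two vmemmap pages hold 128 struct pages.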

[PATCH v4 12/14] mm/sparse-vmemmap: populate compound devmaps

2021-08-27 Thread Joao Martins
A compound devmap is a dev_pagemap with @geometry > PAGE_SIZE and it means that pages are mapped at a given huge page alignment and use compound pages as opposed to order-0 pages. Take advantage of the fact that most tail pages look the same (except the first two) to minimize struct page

[PATCH v4 11/14] mm/hugetlb_vmemmap: move comment block to Documentation/vm

2021-08-27 Thread Joao Martins
In preparation for device-dax using the hugetlbfs compound page tail deduplication technique, move the comment block explanation into a common place in Documentation/vm. Cc: Muchun Song Cc: Mike Kravetz Suggested-by: Dan Williams Signed-off-by: Joao Martins Reviewed-by: Muchun Song Reviewed-b

[PATCH v4 04/14] mm/memremap: add ZONE_DEVICE support for compound pages

2021-08-27 Thread Joao Martins
Add a new @geometry property for struct dev_pagemap which specifies that a devmap is composed of a set of compound pages of size @geometry, instead of base pages. When a compound page geometry is requested, all but the first page are initialised as tail pages instead of order-0 pages. For certain

[PATCH v4 05/14] device-dax: use ALIGN() for determining pgoff

2021-08-27 Thread Joao Martins
Rather than calculating @pgoff manually, switch to ALIGN() instead. Suggested-by: Dan Williams Signed-off-by: Joao Martins Reviewed-by: Dan Williams --- drivers/dax/device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/dax/device.c b/drivers/dax/device.c inde

[PATCH v4 07/14] device-dax: compound devmap support

2021-08-27 Thread Joao Martins
Use the newly added compound devmap facility which maps the assigned dax ranges as compound pages at a page size of @align. Currently, this means that region/namespace bootstrap takes considerably less time, given that considerably fewer pages are initialized. On setups with 128G NVDIMMs the

[PATCH v4 08/14] mm/gup: grab head page refcount once for group of subpages

2021-08-27 Thread Joao Martins
Use try_grab_compound_head() for device-dax GUP when configured with a compound devmap. Rather than incrementing the refcount for each page, do one atomic addition for all the pages to be pinned. Performance measured by gup_benchmark improves considerably for get_user_pages_fast() and pin_user_pages_
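
The difference between per-page and batched reference counting can be illustrated in userspace with C11 atomics. This is a toy sketch, not the kernel implementation: `struct toy_head`, `grab_per_page()` and `grab_batched()` are hypothetical names, and the real try_grab_compound_head() performs additional checks that are not modeled here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a compound head page; not the kernel's struct page. */
struct toy_head {
    atomic_int refcount;
};

/* Per-page scheme: one atomic increment for every subpage being pinned.
 * Returns the number of atomic operations issued. */
static int grab_per_page(struct toy_head *head, int nr_pages)
{
    int atomic_ops = 0;

    assert(nr_pages >= 0);
    for (int i = 0; i < nr_pages; i++) {
        atomic_fetch_add(&head->refcount, 1);
        atomic_ops++;
    }
    return atomic_ops;
}

/* Batched scheme: a single atomic addition covering all subpages.
 * Returns the number of atomic operations issued. */
static int grab_batched(struct toy_head *head, int nr_pages)
{
    assert(nr_pages >= 0);
    atomic_fetch_add(&head->refcount, nr_pages);
    return 1;
}
```

Both functions leave the refcount at the same final value; the batched variant simply reaches it with one atomic operation instead of `nr_pages` of them, which is the saving the patch description refers to.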

[PATCH v4 00/14] mm, sparse-vmemmap: Introduce compound devmaps for device-dax

2021-08-27 Thread Joao Martins
dress it and that is also applicable to THP. But will submit that as a follow up of this. Patches apply on top of linux-next tag next-20210827 (commit 5e63226c7228). Comments and suggestions very much appreciated! Older Changelog, v2 -> v3[3]: * Collect Mike's Ack on patch 2 (Mike)

[PATCH v4 09/14] mm/sparse-vmemmap: add a pgmap argument to section activation

2021-08-27 Thread Joao Martins
In support of using compound pages for devmap mappings, plumb the pgmap down to the vmemmap_populate implementation. Note that while altmap is retrievable from pgmap the memory hotplug code passes altmap without pgmap[*], so both need to be independently plumbed. So in addition to @altmap, pass @p

[PATCH v4 03/14] mm/page_alloc: refactor memmap_init_zone_device() page init

2021-08-27 Thread Joao Martins
Move struct page init to a helper function __init_zone_device_page(). This is in preparation for sharing the storage for / deduplicating compound page metadata. Signed-off-by: Joao Martins Reviewed-by: Dan Williams --- mm/page_alloc.c | 74 +++-- 1

[PATCH v4 10/14] mm/sparse-vmemmap: refactor core of vmemmap_populate_basepages() to helper

2021-08-27 Thread Joao Martins
In preparation for describing a memmap with compound pages, move the actual pte population logic into a separate function vmemmap_populate_address() and have vmemmap_populate_basepages() walk through all base pages it needs to populate. Signed-off-by: Joao Martins --- mm/sparse-vmemmap.c | 51 ++

[PATCH v4 02/14] mm/page_alloc: split prep_compound_page into head and tail subparts

2021-08-27 Thread Joao Martins
Split the utility function prep_compound_page() into head and tail counterparts, and use them accordingly. This is in preparation for sharing the storage for / deduplicating compound page metadata. Signed-off-by: Joao Martins Acked-by: Mike Kravetz Reviewed-by: Dan Williams Reviewed-by: Muchun

[PATCH v4 01/14] memory-failure: fetch compound_head after pgmap_pfn_valid()

2021-08-27 Thread Joao Martins
memory_failure_dev_pagemap() at the moment assumes base pages (e.g. dax_lock_page()). For devmap with compound pages, fetch the compound_head in case a tail page memory failure is being handled. Currently this is a nop, but with the advent of compound pages in dev_pagemap it allows memory_failure_de