Dan Williams <dan.j.willi...@intel.com> writes:
> The "sub-section memory hotplug" facility allows memremap_pages() users
> like libnvdimm to compensate for hardware platforms like x86 that have a
> section size larger than their hardware memory mapping granularity. The
> compensation that sub-section support affords is being tolerant of
> physical memory resources shifting by units smaller (64MiB on x86) than
> the memory-hotplug section size (128MiB), where the platform
> physical-memory mapping granularity is limited by the number and
> capability of address-decode-registers in the memory controller.
>
> While the sub-section support allows memremap_pages() to operate on
> sub-section (2MiB) granularity, the Power architecture may still
> require 16MiB alignment on "!radix_enabled()" platforms.
>
> In order for libnvdimm to be able to detect and manage this per-arch
> limitation, introduce memremap_compat_align() as a common minimum
> alignment across all driver-facing memory-mapping interfaces, and let
> Power override it to 16MiB in the "!radix_enabled()" case.
>
> The assumption / requirement for 16MiB to be a viable
> memremap_compat_align() value is that Power does not have platforms
> where its equivalent of address-decode-registers ever hardware remaps a
> persistent memory resource on smaller than 16MiB boundaries. Note that I
> tried my best to not add a new Kconfig symbol, but header include
> entanglements defeated the #ifndef memremap_compat_align design pattern
> and the need to export it defeats the __weak design pattern for arch
> overrides.
>
> Based on an initial patch by Aneesh.
>
> Link: http://lore.kernel.org/r/capcyv4gbgnp95apyabcsocea50tqj9b5h__83vgngjq3oug...@mail.gmail.com
> Reported-by: Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com>
> Reported-by: Jeff Moyer <jmo...@redhat.com>
> Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org>
> Cc: Paul Mackerras <pau...@samba.org>
> Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com>
> Signed-off-by: Dan Williams <dan.j.willi...@intel.com>
> ---
>  arch/powerpc/Kconfig      |  1 +
>  arch/powerpc/mm/ioremap.c | 21 +++++++++++++++++++++
>  drivers/nvdimm/pfn_devs.c |  2 +-
>  include/linux/memremap.h  |  8 ++++++++
>  include/linux/mmzone.h    |  1 +
>  lib/Kconfig               |  3 +++
>  mm/memremap.c             | 23 +++++++++++++++++++++++
>  7 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 497b7d0b2d7e..e6ffe905e2b9 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -122,6 +122,7 @@ config PPC
>  	select ARCH_HAS_GCOV_PROFILE_ALL
>  	select ARCH_HAS_KCOV
>  	select ARCH_HAS_HUGEPD			if HUGETLB_PAGE
> +	select ARCH_HAS_MEMREMAP_COMPAT_ALIGN
>  	select ARCH_HAS_MMIOWB			if PPC64
>  	select ARCH_HAS_PHYS_TO_DMA
>  	select ARCH_HAS_PMEM_API
> diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
> index fc669643ce6a..b1a0aebe8c48 100644
> --- a/arch/powerpc/mm/ioremap.c
> +++ b/arch/powerpc/mm/ioremap.c
> @@ -2,6 +2,7 @@
>
>  #include <linux/io.h>
>  #include <linux/slab.h>
> +#include <linux/mmzone.h>
>  #include <linux/vmalloc.h>
>  #include <asm/io-workarounds.h>
>
> @@ -97,3 +98,23 @@ void __iomem *do_ioremap(phys_addr_t pa, phys_addr_t offset, unsigned long size,
>
>  	return NULL;
>  }
> +
> +#ifdef CONFIG_ZONE_DEVICE
> +/*
> + * Override the generic version in mm/memremap.c.
> + *
> + * With hash translation, the direct-map range is mapped with just one
> + * page size selected by htab_init_page_sizes(). Consult
> + * mmu_psize_defs[] to determine the minimum page size alignment.
> +*/
> +unsigned long memremap_compat_align(void)
> +{
> +	unsigned int shift = mmu_psize_defs[mmu_linear_psize].shift;
> +
> +	if (radix_enabled())
> +		return SUBSECTION_SIZE;
> +	return max(SUBSECTION_SIZE, 1UL << shift);
> +
> +}
> +EXPORT_SYMBOL_GPL(memremap_compat_align);
> +#endif

LGTM.

Acked-by: Michael Ellerman <m...@ellerman.id.au> (powerpc)

cheers