Re: [PATCH v4 24/26] arch_numa: switch over to numa_memblks

2024-08-06 Thread Arnd Bergmann
On Wed, Aug 7, 2024, at 08:41, Mike Rapoport wrote: > From: "Mike Rapoport (Microsoft)" > > Until now arch_numa was directly translating firmware NUMA information > to memblock. I get a link time warning from this: WARNING: modpost: vmlinux: section mismatch in reference: numa_set_cpumask+0

[PATCH v4 26/26] docs: move numa=fake description to kernel-parameters.txt

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" NUMA emulation can now be enabled on arm64 and riscv in addition to x86. Move description of numa=fake parameters from x86 documentation to admin-guide/kernel-parameters.txt Suggested-by: Zi Yan Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jonathan C

[PATCH v4 25/26] mm: make range-to-target_node lookup facility a part of numa_memblks

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" The x86 implementation of range-to-target_node lookup (i.e. phys_to_target_node() and memory_add_physaddr_to_nid()) relies on numa_memblks. Since numa_memblks are now part of the generic code, move these functions from x86 to mm/numa_memblks.c and select CONFIG_

[PATCH v4 24/26] arch_numa: switch over to numa_memblks

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Until now arch_numa was directly translating firmware NUMA information to memblock. Using numa_memblks as an intermediate step has a few advantages: * alignment with more battle tested x86 implementation * availability of NUMA emulation * maintaining node inform

[PATCH v4 23/26] of, numa: return -EINVAL when no numa-node-id is found

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Currently of_numa_parse_memory_nodes() returns 0 if no "memory" node in device tree contains "numa-node-id" property. This makes of_numa_init() return "success" even though no NUMA nodes were actually parsed and set up. arch_numa works around this by returning an

[PATCH v4 22/26] mm: numa_memblks: use memblock_{start,end}_of_DRAM() when sanitizing meminfo

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" numa_cleanup_meminfo() moves blocks outside system RAM to numa_reserved_meminfo and it uses 0 and PFN_PHYS(max_pfn) to determine the memory boundaries. Replace the memory range boundaries with more portable memblock_start_of_DRAM() and memblock_end_of_DRAM(). S

[PATCH v4 21/26] mm: numa_memblks: make several functions and variables static

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Make functions and variables that are exclusively used by numa_memblks static. Move numa_nodemask_from_meminfo() before its callers to avoid forward declaration. Signed-off-by: Mike Rapoport (Microsoft) Tested-by: Zi Yan # for x86_64 and arm64 Reviewed-by: Jo

[PATCH v4 20/26] mm: numa_memblks: introduce numa_memblks_init

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Move most of x86::numa_init() to numa_memblks so that the latter will be more self-contained. With this numa_memblk data structures should not be exposed to the architecture specific code. Signed-off-by: Mike Rapoport (Microsoft) Tested-by: Zi Yan # for x86_6

[PATCH v4 19/26] mm: introduce numa_emulation

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Move numa_emulation code from arch/x86 to mm/numa_emulation.c This code will be later reused by arch_numa. No functional changes. Signed-off-by: Mike Rapoport (Microsoft) Tested-by: Zi Yan # for x86_64 and arm64 Reviewed-by: Jonathan Cameron Tested-by: Jona

[PATCH v4 18/26] mm: move numa_distance and related code from x86 to numa_memblks

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Move code dealing with numa_distance array from arch/x86 to mm/numa_memblks.c This code will be later reused by arch_numa. No functional changes. Signed-off-by: Mike Rapoport (Microsoft) Tested-by: Zi Yan # for x86_64 and arm64 Reviewed-by: Jonathan Cameron

[PATCH v4 17/26] mm: introduce numa_memblks

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Move code dealing with numa_memblks from arch/x86 to mm/ and add Kconfig options to let x86 select it in its Kconfig. This code will be later reused by arch_numa. No functional changes. Signed-off-by: Mike Rapoport (Microsoft) Tested-by: Zi Yan # for x86_64

[PATCH v4 16/26] x86/numa: numa_{add,remove}_cpu: make cpu parameter unsigned

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" CPU id cannot be negative. Making it unsigned also aligns with declarations in include/asm-generic/numa.h used by arm64 and riscv and allows sharing numa emulation code with these architectures. Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jonathan Ca

[PATCH v4 15/26] x86/numa_emu: use a helper function to get MAX_DMA32_PFN

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" This is required to make numa emulation code architecture independent so that it can be moved to generic code in following commits. Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jonathan Cameron Tested-by: Zi Yan # for x86_64 and arm64 Tested-by: Jona

[PATCH v4 14/26] x86/numa_emu: split __apicid_to_node update to a helper function

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" This is required to make numa emulation code architecture independent so that it can be moved to generic code in following commits. Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jonathan Cameron Tested-by: Zi Yan # for x86_64 and arm64 Tested-by: Jona

[PATCH v4 13/26] x86/numa_emu: simplify allocation of phys_dist

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" By the time numa_emulation() is called, all physical memory is already mapped in the direct map and there is no need to define limits for memblock allocation. Replace memblock_phys_alloc_range() with memblock_alloc(). Signed-off-by: Mike Rapoport (Microsoft) R

[PATCH v4 12/26] x86/numa: move FAKE_NODE_* defines to numa_emu

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" The definitions of FAKE_NODE_MIN_SIZE and FAKE_NODE_MIN_HASH_MASK are only used by numa emulation code, make them local to arch/x86/mm/numa_emulation.c Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jonathan Cameron Tested-by: Zi Yan # for x86_64 and a

[PATCH v4 11/26] x86/numa: use get_pfn_range_for_nid to verify that node spans memory

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Instead of looping over numa_meminfo array to detect node's start and end addresses use get_pfn_range_for_nid(). This is shorter and makes it easier to lift numa_memblks to generic code. Signed-off-by: Mike Rapoport (Microsoft) Tested-by: Zi Yan # for x86_64

[PATCH v4 10/26] x86/numa: simplify numa_distance allocation

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Allocation of numa_distance uses memblock_phys_alloc_range() to limit allocation to be below the last mapped page. But NUMA initialization runs after the direct map is populated and there is also code in setup_arch() that adjusts memblock limit to reflect how m

[PATCH v4 09/26] arch, mm: pull out allocation of NODE_DATA to generic code

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Architectures that support NUMA duplicate the code that allocates NODE_DATA on the node-local memory with slight variations in reporting of the addresses where the memory was allocated. Use x86 version as the basis for the generic alloc_node_data() function and

[PATCH v4 08/26] mm: drop CONFIG_HAVE_ARCH_NODEDATA_EXTENSION

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" There are no users of HAVE_ARCH_NODEDATA_EXTENSION left, so arch_alloc_nodedata() and arch_refresh_nodedata() are not needed anymore. Replace the call to arch_alloc_nodedata() in free_area_init() with a new helper alloc_offline_node_data(), remove arch_refresh_n

[PATCH v4 07/26] arch, mm: move definition of node_data to generic code

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Every architecture that supports NUMA defines node_data in the same way: struct pglist_data *node_data[MAX_NUMNODES]; No reason to keep multiple copies of this definition and its forward declarations, especially when such forward declaration is the only

[PATCH v4 06/26] MIPS: loongson64: drop HAVE_ARCH_NODEDATA_EXTENSION

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Commit f8f9f21c7848 ("MIPS: Fix build error for loongson64 and sgi-ip27") added HAVE_ARCH_NODEDATA_EXTENSION to loongson64 to silence a compilation error that happened because loongson64 didn't define array of pg_data_t as node_data like most other architectures

[PATCH v4 05/26] MIPS: loongson64: rename __node_data to node_data

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Make definition of node_data match other architectures. This will allow pulling declaration of node_data to the generic mm code in the following commit. Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jiaxun Yang Reviewed-by: David Hildenbrand Reviewed-

[PATCH v4 04/26] MIPS: sgi-ip27: drop HAVE_ARCH_NODEDATA_EXTENSION

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Commit f8f9f21c7848 ("MIPS: Fix build error for loongson64 and sgi-ip27") added HAVE_ARCH_NODEDATA_EXTENSION to sgi-ip27 to silence a compilation error that happened because sgi-ip27 didn't define array of pg_data_t as node_data like most other architectures did.

[PATCH v4 03/26] MIPS: sgi-ip27: ensure node_possible_map only contains valid nodes

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" For SGI IP27 machines node_possible_map is statically set to NODE_MASK_ALL and it is not updated during NUMA initialization. Ensure that it only contains nodes present in the system. Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jonathan Cameron Teste

[PATCH v4 02/26] MIPS: sgi-ip27: make NODE_DATA() the same as on all other architectures

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" sgi-ip27 is the only system that defines NODE_DATA() differently than the rest of NUMA machines. Add node_data array of struct pglist pointers that will point to __node_data[node]->pglist and redefine NODE_DATA() to use node_data array. This will allow pulling

[PATCH v4 01/26] mm: move kernel/numa.c to mm/

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" The stub functions in kernel/numa.c belong to mm/ rather than to kernel/ Signed-off-by: Mike Rapoport (Microsoft) Acked-by: David Hildenbrand Reviewed-by: Jonathan Cameron Tested-by: Zi Yan # for x86_64 and arm64 Tested-by: Jonathan Cameron [arm64 + CXL via

[PATCH v4 00/26] mm: introduce numa_memblks

2024-08-06 Thread Mike Rapoport
From: "Mike Rapoport (Microsoft)" Hi, Following the discussion about handling of CXL fixed memory windows on arm64 [1] I decided to bite the bullet and move numa_memblks from x86 to the generic code so they will be available on arm64/riscv and maybe on loongarch sometime later. While it could b

Re: [PATCH v13] mm: report per-page metadata information

2024-08-06 Thread Pasha Tatashin
On Tue, Aug 6, 2024 at 5:37 PM Pasha Tatashin wrote: > > On Tue, Aug 6, 2024 at 4:53 PM Ira Weiny wrote: > > > > On Tue, Aug 06, 2024 at 01:59:54PM -0400, Pasha Tatashin wrote: > > > On Mon, Aug 5, 2024 at 7:06 PM Dan Williams > > > wrote: > > > > > > > > Pasha Tatashin wrote: > > > > [..] > >

Re: [PATCH v13] mm: report per-page metadata information

2024-08-06 Thread Pasha Tatashin
On Tue, Aug 6, 2024 at 4:53 PM Ira Weiny wrote: > > On Tue, Aug 06, 2024 at 01:59:54PM -0400, Pasha Tatashin wrote: > > On Mon, Aug 5, 2024 at 7:06 PM Dan Williams > > wrote: > > > > > > Pasha Tatashin wrote: > > > [..] > > > > Thank you for the heads up. Can you please attach a full config file

Re: [PATCH v13] mm: report per-page metadata information

2024-08-06 Thread Pasha Tatashin
On Mon, Aug 5, 2024 at 7:06 PM Dan Williams wrote: > > Pasha Tatashin wrote: > [..] > > Thank you for the heads up. Can you please attach a full config file, > > also was anyone able to reproduce this problem in qemu with emulated > > nvdimm? > > Yes, I can reproduce the crash just by trying to re

Re: [PATCH v3 24/26] arch_numa: switch over to numa_memblks

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Until now arch_numa was directly translating firmware NUMA information to memblock. Using numa_memblks as an intermediate step has a few advantages: * alignment with more battle tested x86 implementation * availability o

Re: [PATCH v3 25/26] mm: make range-to-target_node lookup facility a part of numa_memblks

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" The x86 implementation of range-to-target_node lookup (i.e. phys_to_target_node() and memory_add_physaddr_to_nid()) relies on numa_memblks. Since numa_memblks are now part of the generic code, move these functions from x

Re: [PATCH v3 26/26] docs: move numa=fake description to kernel-parameters.txt

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" NUMA emulation can now be enabled on arm64 and riscv in addition to x86. Move description of numa=fake parameters from x86 documentation to admin-guide/kernel-parameters.txt Suggested-by: Zi Yan Signed-off-by: Mike Rap

Re: [PATCH v3 23/26] of, numa: return -EINVAL when no numa-node-id is found

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Currently of_numa_parse_memory_nodes() returns 0 if no "memory" node in device tree contains "numa-node-id" property. This makes of_numa_init() return "success" even though no NUMA nodes were actually parsed and set up. a

Re: [PATCH v3 22/26] mm: numa_memblks: use memblock_{start,end}_of_DRAM() when sanitizing meminfo

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" numa_cleanup_meminfo() moves blocks outside system RAM to numa_reserved_meminfo and it uses 0 and PFN_PHYS(max_pfn) to determine the memory boundaries. Replace the memory range boundaries with more portable memblock_star

Re: [PATCH v3 19/26] mm: introduce numa_emulation

2024-08-06 Thread David Hildenbrand
On 06.08.24 15:20, David Hildenbrand wrote: On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Move numa_emulation code from arch/x86 to mm/numa_emulation.c This code will be later reused by arch_numa. I'm confused why documentation lists for "numa=fake=" [KNL, ARM64,

Re: [PATCH v3 21/26] mm: numa_memblks: make several functions and variables static

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Make functions and variables that are exclusively used by numa_memblks static. Move numa_nodemask_from_meminfo() before its callers to avoid forward declaration. Signed-off-by: Mike Rapoport (Microsoft) Tested-by: Zi Y

Re: [PATCH v3 20/26] mm: numa_memblks: introduce numa_memblks_init

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Move most of x86::numa_init() to numa_memblks so that the latter will be more self-contained. With this numa_memblk data structures should not be exposed to the architecture specific code. Signed-off-by: Mike Rapoport (

Re: [PATCH v3 19/26] mm: introduce numa_emulation

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Move numa_emulation code from arch/x86 to mm/numa_emulation.c This code will be later reused by arch_numa. I'm confused why documentation lists for "numa=fake=" [KNL, ARM64, RISCV, X86, EARLY] -- Cheers, David / dhil

Re: [PATCH v3 19/26] mm: introduce numa_emulation

2024-08-06 Thread David Hildenbrand
On 05.08.24 22:09, Dan Williams wrote: Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Move numa_emulation codfrom arch/x86 to mm/numa_emulation.c s/codfrom/code from/ I am surprised that numa-emulation stayed x86 only for so long. I think it is useful facility for debugging NUMA sca

Re: [PATCH v3 18/26] mm: move numa_distance and related code from x86 to numa_memblks

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Move code dealing with numa_distance array from arch/x86 to mm/numa_memblks.c This code will be later reused by arch_numa. No functional changes. Signed-off-by: Mike Rapoport (Microsoft) Tested-by: Zi Yan # for x86_6

Re: [PATCH v3 17/26] mm: introduce numa_memblks

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Move code dealing with numa_memblks from arch/x86 to mm/ and add Kconfig options to let x86 select it in its Kconfig. This code will be later reused by arch_numa. No functional changes. Signed-off-by: Mike Rapoport (Mi

Re: [PATCH v3 16/26] x86/numa: numa_{add,remove}_cpu: make cpu parameter unsigned

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" CPU id cannot be negative. Making it unsigned also aligns with declarations in include/asm-generic/numa.h used by arm64 and riscv and allows sharing numa emulation code with these architectures. Signed-off-by: Mike Rapo

Re: [PATCH v3 15/26] x86/numa_emu: use a helper function to get MAX_DMA32_PFN

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" This is required to make numa emulation code architecture independent so that it can be moved to generic code in following commits. Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jonathan Cameron Tested-by: Zi Y

Re: [PATCH v3 14/26] x86/numa_emu: split __apicid_to_node update to a helper function

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" This is required to make numa emulation code architecture independent so that it can be moved to generic code in following commits. Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jonathan Cameron Tested-by: Zi Y

Re: [PATCH v3 13/26] x86/numa_emu: simplify allocation of phys_dist

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" By the time numa_emulation() is called, all physical memory is already mapped in the direct map and there is no need to define limits for memblock allocation. Replace memblock_phys_alloc_range() with memblock_alloc(). S

Re: [PATCH v3 12/26] x86/numa: move FAKE_NODE_* defines to numa_emu

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" The definitions of FAKE_NODE_MIN_SIZE and FAKE_NODE_MIN_HASH_MASK are only used by numa emulation code, make them local to arch/x86/mm/numa_emulation.c Signed-off-by: Mike Rapoport (Microsoft) Reviewed-by: Jonathan Came

Re: [PATCH v3 11/26] x86/numa: use get_pfn_range_for_nid to verify that node spans memory

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Instead of looping over numa_meminfo array to detect node's start and end addresses use get_pfn_range_for_nid(). This is shorter and makes it easier to lift numa_memblks to generic code. Signed-off-by: Mike Rapoport (Mi

Re: [PATCH v3 10/26] x86/numa: simplify numa_distance allocation

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Allocation of numa_distance uses memblock_phys_alloc_range() to limit allocation to be below the last mapped page. But NUMA initialization runs after the direct map is populated and there is also code in setup_arch() th

Re: [PATCH v3 06/26] MIPS: loongson64: drop HAVE_ARCH_NODEDATA_EXTENSION

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Commit f8f9f21c7848 ("MIPS: Fix build error for loongson64 and sgi-ip27") added HAVE_ARCH_NODEDATA_EXTENSION to loongson64 to silence a compilation error that happened because loongson64 didn't define array of pg_data_t a

Re: [PATCH v3 04/26] MIPS: sgi-ip27: drop HAVE_ARCH_NODEDATA_EXTENSION

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" Commit f8f9f21c7848 ("MIPS: Fix build error for loongson64 and sgi-ip27") added HAVE_ARCH_NODEDATA_EXTENSION to sgi-ip27 to silence a compilation error that happened because sgi-ip27 didn't define array of pg_data_t as no

Re: [PATCH v3 03/26] MIPS: sgi-ip27: ensure node_possible_map only contains valid nodes

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" For SGI IP27 machines node_possible_map is statically set to NODE_MASK_ALL and it is not updated during NUMA initialization. Ensure that it only contains nodes present in the system. Signed-off-by: Mike Rapoport (Micros

Re: [PATCH v3 02/26] MIPS: sgi-ip27: make NODE_DATA() the same as on all other architectures

2024-08-06 Thread David Hildenbrand
On 01.08.24 08:08, Mike Rapoport wrote: From: "Mike Rapoport (Microsoft)" sgi-ip27 is the only system that defines NODE_DATA() differently than the rest of NUMA machines. Add node_data array of struct pglist pointers that will point to __node_data[node]->pglist and redefine NODE_DATA() to use