On 6/4/25 7:00 PM, David Hildenbrand wrote:
On 04.06.25 15:17, Donet Tom wrote:
On 6/4/25 3:15 PM, David Hildenbrand wrote:
On 04.06.25 05:07, Andrew Morton wrote:
On Wed, 28 May 2025 12:18:00 -0500 Donet Tom <donet...@linux.ibm.com> wrote:
During node device initialization, `memory blocks` are registered under
each NUMA node. The `memory blocks` to be registered are identified using
the node's start and end PFNs, which are obtained from the node's pg_data
It's quite unconventional to omit the [0/N] changelog. This omission
somewhat messed up my processes so I added a one-liner to this.
Yeah, I was assuming that I simply did not get cc'ed on the cover
letter, but there is actually none.
Donet, please add that in the future. git format-patch can do this using
--cover-letter.
Sure, I will add a cover letter in the next revision.
...
Test Results on My system with 32TB RAM
=======================================
1. Boot time with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled.
Without this patch
------------------
Startup finished in 1min 16.528s (kernel)
With this patch
---------------
Startup finished in 17.236s (kernel) - 78% Improvement
Well someone is in for a nice surprise.
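As a quick check on the arithmetic behind the first test result, the relative improvement can be computed from the two quoted kernel startup times (1min 16.528s = 76.528s, and 17.236s). This is only a sketch; the helper name `improvement_pct` is made up for illustration and does not come from the patch.

```c
/*
 * Relative reduction in boot time, as a percentage of the baseline.
 * Hypothetical helper for illustration only; not part of the patch.
 * Both arguments are in seconds, and before_s must be non-zero.
 */
static double improvement_pct(double before_s, double after_s)
{
    return (before_s - after_s) / before_s * 100.0;
}
```

Calling `improvement_pct(76.528, 17.236)` gives roughly 77.5, in line with the approximately 78% quoted above.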
2. Boot time with CONFIG_DEFERRED_STRUCT_PAGE_INIT disabled.
Without this patch
------------------
Startup finished in 28.320s (kernel)
what. CONFIG_DEFERRED_STRUCT_PAGE_INIT is supposed to make bootup
faster.
Right, that's weird. Especially since it is still slower after these
changes.
CONFIG_DEFERRED_STRUCT_PAGE_INIT should be initializing in parallel
which ... should be faster.
@Donet, how many CPUs and nodes does your system have? Can you
identify what is taking longer than without
CONFIG_DEFERRED_STRUCT_PAGE_INIT?
My system has,
CPU - 1528
Holy cow.
Pure speculation: are we parallelizing *too much*? :)
That's ~95 CPUs per node on average.
yes
Staring at deferred_init_memmap(), we do have
max_threads = deferred_page_init_max_threads(cpumask);
And that calls cpumask_weight(), essentially using all CPUs on the node.
... not sure what exactly happens if there are no CPUs for a node.
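For readers following along, here is a small userspace sketch of the thread-count choice being discussed. It is not the kernel code: the names `node_cpu_weight` and `node_max_threads` are hypothetical, the byte-per-CPU mask is a simplified stand-in for a cpumask, and the clamp to one thread for a node without CPUs is an assumption made for this model, not something confirmed in this thread.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for cpumask_weight(): count the CPUs set in a
 * per-node mask, modeled here as one byte per CPU (non-zero = present).
 */
static int node_cpu_weight(const unsigned char *mask, size_t ncpus)
{
    int weight = 0;

    for (size_t i = 0; i < ncpus; i++)
        if (mask[i])
            weight++;
    return weight;
}

/*
 * Model of "use all CPUs on the node" for deferred page init.
 * ASSUMPTION: a node with no CPUs still gets one worker thread.
 */
static int node_max_threads(const unsigned char *mask, size_t ncpus)
{
    int weight = node_cpu_weight(mask, ncpus);

    return weight > 0 ? weight : 1;
}
```

Under this model, a node with about 95 CPUs would be initialized by about 95 threads, which is the degree of parallelism the speculation above is about.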
Okay.
I'm still debugging what's happening. I'll update you once I find something.
Node - 16
Are any of these memory-less?
No, there are no memory-less nodes. All nodes have around 2 TB of memory.
Memory - 31TB