Hi Anatoly,
Thanks. My recollection is that all of the NUMA configuration flags
were set to 'n'.
Regards,
Nick
On 17/09/2020 13:57, Burakov, Anatoly wrote:
On 17-Sep-20 1:29 PM, Nick Connolly wrote:
Hi Anatoly,
Thanks for the response. You are asking a good question - here's
what I know:
The issue arose on a single socket system, running WSL2 (full Linux
kernel running as a lightweight VM under Windows).
The default kernel in this environment is built with CONFIG_NUMA=n,
which means get_mempolicy() returns an error.
As a result, the check that the allocated memory is associated with
the correct socket fails.
The change is to skip the allocation check if check_numa() indicates
that NUMA-aware memory is not supported.
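
For anyone wanting to reproduce the symptom outside of DPDK, here is a
minimal sketch of such a probe. It is not the EAL code itself: the
function names are hypothetical, it calls the raw syscall so that it
does not need libnuma, and it assumes that a kernel built without NUMA
support fails get_mempolicy() with ENOSYS.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: interpret a get_mempolicy() result.
 * Assumption: a kernel without NUMA support fails with ENOSYS. */
static bool mempolicy_unsupported(long rc, int err)
{
    return rc < 0 && err == ENOSYS;
}

/* Hypothetical probe: ask the kernel for the calling thread's policy
 * via the raw syscall, passing no output buffers. Any outcome other
 * than ENOSYS is treated as "NUMA policy is available". */
static bool numa_policy_supported(void)
{
    errno = 0;
    long rc = syscall(SYS_get_mempolicy, NULL, NULL, 0UL, NULL, 0UL);
    return !mempolicy_unsupported(rc, errno);
}
```

On the WSL2 kernel described above, numa_policy_supported() would be
expected to return false, which is the condition under which the
allocation check is skipped.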
Researching the meaning of CONFIG_NUMA, I found
https://cateee.net/lkddb/web-lkddb/NUMA.html which says:
Enable NUMA (Non-Uniform Memory Access) support.
The kernel will try to allocate memory used by a CPU on the local
memory controller of the CPU and add some more NUMA awareness to the
kernel.
Clearly CONFIG_NUMA enables memory awareness, but there's no
indication in the description whether information about the NUMA
physical architecture is 'hidden', or whether it is still exposed
through /sys/devices/system/node* (which is used by the rte
initialisation code to determine how many sockets there are).
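
One way to answer this on a given kernel is to count the node entries
that sysfs exposes. The sketch below is only an illustration and is not
how the rte initialisation code does it: both function names are
hypothetical, and it assumes that nodes appear as directories named
node<N> under a base path such as /sys/devices/system/node.

```c
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: true for names of the form "node<digits>". */
static int is_node_entry(const char *name)
{
    if (strncmp(name, "node", 4) != 0 || name[4] == '\0')
        return 0;
    for (const char *p = name + 4; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: count node<N> entries under base.
 * Returns -1 if the directory cannot be opened. */
static int count_numa_nodes(const char *base)
{
    DIR *dir = opendir(base);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (is_node_entry(ent->d_name))
            count++;
    }
    closedir(dir);
    return count;
}
```

Calling count_numa_nodes("/sys/devices/system/node") on a multi-socket
machine booted with a CONFIG_NUMA=n kernel would show whether the
physical topology is still visible.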
Unfortunately, I don't have ready access to a multi-socket Linux
system to test this on, so I took the conservative approach: assume
that a kernel with CONFIG_NUMA disabled may still report more than
one node, and generate a debug message if that occurs.
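
The decision described above can be sketched as a small pure function.
This is a simplified illustration rather than the patch itself; the
enum and the function name are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical outcomes for the post-allocation socket check. */
enum socket_check_action {
    VERIFY_SOCKET,       /* NUMA supported: verify as before */
    SKIP_VERIFY,         /* no NUMA support, single node: skip quietly */
    SKIP_VERIFY_AND_LOG  /* no NUMA support, but several nodes reported */
};

/* Skip the check when NUMA-aware memory is unavailable, and flag the
 * unexpected case where more than one socket is still reported. */
static enum socket_check_action
choose_socket_check(bool numa_supported, unsigned int socket_count)
{
    if (numa_supported)
        return VERIFY_SOCKET;
    return socket_count > 1 ? SKIP_VERIFY_AND_LOG : SKIP_VERIFY;
}
```

If CONFIG_NUMA=n turns out to hide the topology entirely, the
SKIP_VERIFY_AND_LOG branch can never be reached and the socket count
test can be dropped.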
Do you know whether CONFIG_NUMA turns off all knowledge about the
hardware architecture? If it does, then I agree that the test for
rte_socket_count() serves no purpose and should be removed.
I have a system with a custom-compiled kernel. I can recompile it
without this flag and test this. I'll report back with results :)