Hi David,

First of all, I sincerely apologize for the late reply.

I have looked into this issue carefully and have some findings that may be useful.

On Wed, Dec 21, 2022 at 22:57, David Marchand wrote:
Hello Min,

On Wed, Dec 21, 2022 at 11:49 AM David Marchand
<david.march...@redhat.com> wrote:
Trying to allocate memory on the first detected numa node has less
chance to find some memory actually available rather than on the main
lcore numa node (especially when the DPDK application is started only
on one numa node).

Signed-off-by: David Marchand <david.march...@redhat.com>
I see a failure in the loongarch CI.

Running binary with
argv[]:'/home/zhoumin/dpdk/build/app/test/dpdk-test'
'--file-prefix=eal_flags_c_opt_autotest' '--proc-type=secondary'
'--lcores' '0-1,2@(5-7),(3-5)@(0,2),(0,6),7'
Error - process did not run ok with valid corelist value
Test Failed

The logs don't give the full picture (though it is not the LoongArch CI's fault).

I tried to read back on past mail exchanges about the loongarch
server, but I did not find the info.
I suspect cores 5 to 7 belong to different numa nodes; can you confirm?

Cores 5 to 7 belong to the same NUMA node (node 1) on the Loongson-3C5000LL CPU on which the LoongArch DPDK CI runs.


I'll post a new revision to account for this case.


The LoongArch DPDK CI uses cores 0-7 to run all the DPDK unit tests by adding '-l 0-7' to the meson test args. In the above test case, the argument '--lcores' '0-1,2@(5-7),(3-5)@(0,2),(0,6),7' makes lcores 0 and 6 run on either core 0 or core 6 (their cpuset is [0,6]). The EAL logs make this clearer when I set the EAL log level to debug:
EAL: Main lcore 0 is ready (tid=fff3ee18f0;cpuset=[0,6])
EAL: lcore 1 is ready (tid=fff2de4cf0;cpuset=[1])
EAL: lcore 2 is ready (tid=fff25e0cf0;cpuset=[5,6,7])
EAL: lcore 5 is ready (tid=fff0dd4cf0;cpuset=[0,2])
EAL: lcore 4 is ready (tid=fff15d8cf0;cpuset=[0,2])
EAL: lcore 3 is ready (tid=fff1ddccf0;cpuset=[0,2])
EAL: lcore 7 is ready (tid=ffdb7f8cf0;cpuset=[7])
EAL: lcore 6 is ready (tid=ffdbffccf0;cpuset=[0,6])
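
For reference, the same picture can be reproduced from the public lcore API. Below is a minimal sketch of my own (not part of the CI scripts), assuming a Linux target where rte_cpuset_t maps to cpu_set_t:

#include <stdio.h>
#include <rte_eal.h>
#include <rte_lcore.h>

int
main(int argc, char **argv)
{
	unsigned int lcore, cpu;

	if (rte_eal_init(argc, argv) < 0)
		return -1;

	/* Dump the cpuset and the socket EAL resolved for each enabled lcore. */
	RTE_LCORE_FOREACH(lcore) {
		rte_cpuset_t cs = rte_lcore_cpuset(lcore);

		printf("lcore %u: socket %d, cpuset [", lcore,
			(int)rte_lcore_to_socket_id(lcore));
		for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
			if (CPU_ISSET(cpu, &cs))
				printf(" %u", cpu);
		printf(" ]\n");
	}

	rte_eal_cleanup();
	return 0;
}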

However, cores 0 and 6 belong to different NUMA nodes on the Loongson-3C5000LL CPU: core 0 belongs to NUMA node 0 and core 6 belongs to NUMA node 1, as shown by lscpu:
$ lscpu
Architecture:        loongarch64
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           8
NUMA node(s):        8
...
NUMA node0 CPU(s):   0-3
NUMA node1 CPU(s):   4-7
NUMA node2 CPU(s):   8-11
NUMA node3 CPU(s):   12-15
NUMA node4 CPU(s):   16-19
NUMA node5 CPU(s):   20-23
NUMA node6 CPU(s):   24-27
NUMA node7 CPU(s):   28-31
...

So the socket_id for lcores 0 and 6 is set to -1, as can be seen from thread_update_affinity(). I also printed the socket_id for lcores 0 to RTE_MAX_LCORE - 1:

lcore_config[*].socket_id: -1 0 1 0 0 0 -1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5 6 6 6 6 7 7 7 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
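
The collapse to -1 happens because the cpuset spans two NUMA nodes. As a sketch (not a verbatim copy) of the per-cpuset resolution EAL performs, cf. eal_cpuset_socket_id() in lib/eal/common/eal_common_thread.c:

/* Simplified paraphrase; eal_cpu_socket_id() is the EAL-internal
 * per-CPU NUMA lookup. If the CPUs in the set sit on different NUMA
 * nodes, the result collapses to SOCKET_ID_ANY (-1), as it does for
 * cpuset [0,6] here. */
static int
cpuset_socket_id_sketch(rte_cpuset_t *cpusetp)
{
	int socket_id = SOCKET_ID_ANY;
	unsigned int cpu;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		int sid;

		if (!CPU_ISSET(cpu, cpusetp))
			continue;
		sid = eal_cpu_socket_id(cpu);
		if (socket_id == SOCKET_ID_ANY)
			socket_id = sid;
		else if (socket_id != sid)
			return SOCKET_ID_ANY;
	}
	return socket_id;
}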

In this test case, the modified malloc_get_numa_socket() returns -1, which causes a memory allocation failure. Is it acceptable in DPDK for an lcore's socket_id to be -1? If it is, maybe we can check the socket_id of the main lcore before using it, such as:
diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index d7c410b786..3ee19aee15 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -717,6 +717,10 @@ malloc_get_numa_socket(void)
                        return socket_id;
        }

+       socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
+       if (socket_id != (unsigned int)SOCKET_ID_ANY)
+               return socket_id;
+
        return rte_socket_id_by_idx(0);
 }
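
For completeness, this is my reading of how the failure is hit, not something stated in the patch: a plain rte_malloc() on such an lcore passes SOCKET_ID_ANY down, and the heap then asks malloc_get_numa_socket() to pick a socket. A hypothetical caller-side illustration:

#include <rte_malloc.h>
#include <rte_lcore.h>

/* On an lcore whose cpuset spans NUMA nodes (e.g. lcore 0 above),
 * rte_socket_id() reports SOCKET_ID_ANY, so the allocator must fall
 * back as in the hunk above; with no valid fallback it returns NULL. */
static void *
alloc_local(size_t len)
{
	return rte_malloc_socket("demo", len, 0, SOCKET_ID_ANY);
}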
