On 2/1/2023 12:20 PM, Kaisen You wrote:
Trying to allocate memory on the first detected numa node has less
chance to find some memory actually available rather than on the main
lcore numa node (especially when the DPDK application is started only
on one numa node).

Fixes: 705356f0811f ("eal: simplify control thread creation")
Fixes: bb0bd346d5c1 ("eal: suggest using --lcores option")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand <david.march...@redhat.com>
Signed-off-by: Kaisen You <kaisenx....@intel.com>
---
Changes since v4:
- mod the patch title,

Changes since v3:
- add the assignment of socket_id in thread initialization,

Changes since v2:
- add uncommitted local change and fix compilation,

Changes since v1:
- accomodate for configurations with main lcore running on multiples
   physical cores belonging to different numa,
---
  lib/eal/common/eal_common_thread.c | 1 +
  lib/eal/common/malloc_heap.c       | 4 ++++
  2 files changed, 5 insertions(+)

diff --git a/lib/eal/common/eal_common_thread.c 
b/lib/eal/common/eal_common_thread.c
index 38d83a6885..21bff971f8 100644
--- a/lib/eal/common/eal_common_thread.c
+++ b/lib/eal/common/eal_common_thread.c
@@ -251,6 +251,7 @@ static void *ctrl_thread_init(void *arg)
        void *routine_arg = params->arg;
__rte_thread_init(rte_lcore_id(), cpuset);
+       RTE_PER_LCORE(_socket_id) = SOCKET_ID_ANY;
        params->ret = rte_thread_set_affinity_by_id(rte_thread_self(), cpuset);
        if (params->ret != 0) {
                __atomic_store_n(&params->ctrl_thread_status,
diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index d7c410b786..3ee19aee15 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -717,6 +717,10 @@ malloc_get_numa_socket(void)
                        return socket_id;
        }
+ socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
+       if (socket_id != (unsigned int)SOCKET_ID_ANY)
+               return socket_id;
+
        return rte_socket_id_by_idx(0);
  }

I may be lacking context, but I don't quite get the suggested change. From what I understand, the original has to do with assigning lcore cpusets in such a way that an lcore ends up having two socket ID's (because it's been assigned to CPU's on different sockets). Why is this allowed in the first place? It seems like a user error to me, as it breaks many of the fundamental assumptions DPDK makes.

I'm fine with using main lcore socket for control threads, I just don't think the `socket_id != SOCKET_ID_ANY` thing should be checked here, because it apparently tries to compensate for a problem with cpuset of the main thread, which shouldn't have happened to begin with.

--
Thanks,
Anatoly

Reply via email to