I'm not comfortable with this patch. First, there is no comment in the code which helps to understand the logic. Second, I'm afraid changing the value of the per-core variable _socket_id may have an impact on some applications.
16/02/2023 03:50, You, KaisenX: > From: Burakov, Anatoly <anatoly.bura...@intel.com> > > On 2/1/2023 12:20 PM, Kaisen You wrote: > > > Trying to allocate memory on the first detected numa node has less > > > chance to find some memory actually available rather than on the main > > > lcore numa node (especially when the DPDK application is started only > > > on one numa node). > > > > > > Fixes: 705356f0811f ("eal: simplify control thread creation") > > > Fixes: bb0bd346d5c1 ("eal: suggest using --lcores option") > > > Cc: sta...@dpdk.org > > > > > > Signed-off-by: David Marchand <david.march...@redhat.com> > > > Signed-off-by: Kaisen You <kaisenx....@intel.com> > > > --- > > > Changes since v4: > > > - mod the patch title, > > > > > > Changes since v3: > > > - add the assignment of socket_id in thread initialization, > > > > > > Changes since v2: > > > - add uncommitted local change and fix compilation, > > > > > > Changes since v1: > > > - accomodate for configurations with main lcore running on multiples > > > physical cores belonging to different numa, > > > --- > > > lib/eal/common/eal_common_thread.c | 1 + > > > lib/eal/common/malloc_heap.c | 4 ++++ > > > 2 files changed, 5 insertions(+) > > > > > > diff --git a/lib/eal/common/eal_common_thread.c > > > b/lib/eal/common/eal_common_thread.c > > > index 38d83a6885..21bff971f8 100644 > > > --- a/lib/eal/common/eal_common_thread.c > > > +++ b/lib/eal/common/eal_common_thread.c > > > @@ -251,6 +251,7 @@ static void *ctrl_thread_init(void *arg) > > > void *routine_arg = params->arg; > > > > > > __rte_thread_init(rte_lcore_id(), cpuset); > > > + RTE_PER_LCORE(_socket_id) = SOCKET_ID_ANY; > > > params->ret = rte_thread_set_affinity_by_id(rte_thread_self(), > > cpuset); > > > if (params->ret != 0) { > > > __atomic_store_n(¶ms->ctrl_thread_status, > > > diff --git a/lib/eal/common/malloc_heap.c > > > b/lib/eal/common/malloc_heap.c index d7c410b786..3ee19aee15 100644 > > > --- a/lib/eal/common/malloc_heap.c > > > +++ b/lib/eal/common/malloc_heap.c > > > @@ -717,6 +717,10 @@ malloc_get_numa_socket(void) > > > return socket_id; > > > } > > > > > > + socket_id = rte_lcore_to_socket_id(rte_get_main_lcore()); > > > + if (socket_id != (unsigned int)SOCKET_ID_ANY) > > > + return socket_id; > > > + > > > return rte_socket_id_by_idx(0); > > > } > > > > > > > I may be lacking context, but I don't quite get the suggested change. > > From what I understand, the original has to do with assigning lcore > > cpusets in > > such a way that an lcore ends up having two socket ID's (because it's been > > assigned to CPU's on different sockets). Why is this allowed in the first > > place? > > It seems like a user error to me, as it breaks many of the fundamental > > assumptions DPDK makes. > > > In a dual socket system, if all used cores are in socket 1 and the NIC is in > socket 1, > no memory is allocated for socket 0. This is to optimize memory consumption. > > I agree with you. If the startup parameters can ensure that both sockets > allocate memory, there will be no problem. > However, due to the different CPU topologies of different systems, > It is difficult for users to ensure that the startup parameter contains two > cpu nodes. > > > I'm fine with using main lcore socket for control threads, I just don't > > think the > > `socket_id != SOCKET_ID_ANY` thing should be checked here, because it > > apparently tries to compensate for a problem with cpuset of the main thread, > > which shouldn't have happened to begin with. > > > This issue has been explained in detail in the discussion of the patch v1 > version. > I will forward the previous email to you. The content of the email will also > better > let you know the purpose of submitting this patch.