> -----Original Message-----
> From: Ferruh Yigit <ferruh.yi...@amd.com>
> Sent: Thursday, December 22, 2022 8:07 PM
> To: You, KaisenX <kaisenx....@intel.com>; dev@dpdk.org; Burakov, Anatoly
> <anatoly.bura...@intel.com>; David Marchand
> <david.march...@redhat.com>
> Cc: sta...@dpdk.org; Yang, Qiming <qiming.y...@intel.com>; Zhou, YidingX
> <yidingx.z...@intel.com>; Wu, Jingjing <jingjing...@intel.com>; Xing, Beilei
> <beilei.x...@intel.com>; Zhang, Qi Z <qi.z.zh...@intel.com>; Luca Boccassi
> <bl...@debian.org>; Mcnamara, John <john.mcnam...@intel.com>; Kevin
> Traynor <ktray...@redhat.com>
> Subject: Re: [PATCH] net/iavf:fix slow memory allocation
> 
> On 12/22/2022 7:23 AM, You, KaisenX wrote:
> >
> >
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yi...@amd.com>
> >> Sent: 2022年12月21日 21:49
> >> To: You, KaisenX <kaisenx....@intel.com>; dev@dpdk.org; Burakov,
> >> Anatoly <anatoly.bura...@intel.com>; David Marchand
> >> <david.march...@redhat.com>
> >> Cc: sta...@dpdk.org; Yang, Qiming <qiming.y...@intel.com>; Zhou,
> >> YidingX <yidingx.z...@intel.com>; Wu, Jingjing
> >> <jingjing...@intel.com>; Xing, Beilei <beilei.x...@intel.com>; Zhang,
> >> Qi Z <qi.z.zh...@intel.com>; Luca Boccassi <bl...@debian.org>;
> >> Mcnamara, John <john.mcnam...@intel.com>; Kevin Traynor
> >> <ktray...@redhat.com>
> >> Subject: Re: [PATCH] net/iavf:fix slow memory allocation
> >>
> >> On 12/20/2022 6:52 AM, You, KaisenX wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Ferruh Yigit <ferruh.yi...@amd.com>
> >>>> Sent: 2022年12月13日 21:28
> >>>> To: You, KaisenX <kaisenx....@intel.com>; dev@dpdk.org; Burakov,
> >>>> Anatoly <anatoly.bura...@intel.com>; David Marchand
> >>>> <david.march...@redhat.com>
> >>>> Cc: sta...@dpdk.org; Yang, Qiming <qiming.y...@intel.com>; Zhou,
> >>>> YidingX <yidingx.z...@intel.com>; Wu, Jingjing
> >>>> <jingjing...@intel.com>; Xing, Beilei <beilei.x...@intel.com>;
> >>>> Zhang, Qi Z <qi.z.zh...@intel.com>; Luca Boccassi
> >>>> <bl...@debian.org>; Mcnamara, John <john.mcnam...@intel.com>;
> Kevin
> >>>> Traynor <ktray...@redhat.com>
> >>>> Subject: Re: [PATCH] net/iavf:fix slow memory allocation
> >>>>
> >>>> On 12/13/2022 9:35 AM, Ferruh Yigit wrote:
> >>>>> On 12/13/2022 7:52 AM, You, KaisenX wrote:
> >>>>>>
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Ferruh Yigit <ferruh.yi...@amd.com>
> >>>>>>> Sent: 2022年12月8日 23:04
> >>>>>>> To: You, KaisenX <kaisenx....@intel.com>; dev@dpdk.org; Burakov,
> >>>>>>> Anatoly <anatoly.bura...@intel.com>; David Marchand
> >>>>>>> <david.march...@redhat.com>
> >>>>>>> Cc: sta...@dpdk.org; Yang, Qiming <qiming.y...@intel.com>; Zhou,
> >>>>>>> YidingX <yidingx.z...@intel.com>; Wu, Jingjing
> >>>>>>> <jingjing...@intel.com>; Xing, Beilei <beilei.x...@intel.com>;
> >>>>>>> Zhang, Qi Z <qi.z.zh...@intel.com>; Luca Boccassi
> >>>>>>> <bl...@debian.org>; Mcnamara, John
> <john.mcnam...@intel.com>;
> >>>> Kevin
> >>>>>>> Traynor <ktray...@redhat.com>
> >>>>>>> Subject: Re: [PATCH] net/iavf:fix slow memory allocation
> >>>>>>>
> >>>>>>> On 11/17/2022 6:57 AM, Kaisen You wrote:
> >>>>>>>> In some cases, the DPDK does not allocate hugepage heap memory
> >> to
> >>>>>>> some
> >>>>>>>> sockets due to the user setting parameters (e.g. -l 40-79,
> >>>>>>>> SOCKET
> >>>>>>>> 0 has no memory).
> >>>>>>>> When the interrupt thread runs on the corresponding core of
> >>>>>>>> this socket, each allocation/release will execute a whole set
> >>>>>>>> of heap allocation/release operations,resulting in poor
> performance.
> >>>>>>>> Instead we call malloc() to get memory from the system's heap
> >>>>>>>> space to fix this problem.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Hi Kaisen,
> >>>>>>>
> >>>>>>> Using libc malloc can improve performance for this case, but I
> >>>>>>> would like to understand root cause of the problem.
> >>>>>>>
> >>>>>>>
> >>>>>>> As far as I can see, interrupt callbacks are run by interrupt
> >>>>>>> thread
> >>>>>>> ("eal-intr- thread"), and interrupt thread created by
> >>>> 'rte_ctrl_thread_create()' API.
> >>>>>>>
> >>>>>>> 'rte_ctrl_thread_create()' comment mentions that "CPU affinity
> >>>>>>> retrieved at the time 'rte_eal_init()' was called,"
> >>>>>>>
> >>>>>>> And 'rte_eal_init()' is run on main lcore, which is the first
> >>>>>>> lcore in the core list (unless otherwise defined with --main-lcore).
> >>>>>>>
> >>>>>>> So, the interrupts should be running on a core that has
> >>>>>>> hugepages allocated for it, am I missing something here?
> >>>>>>>
> >>>>>>>
> >>>>>> Thank for your comments.  Let me try to explain the root cause here:
> >>>>>> eal_intr_thread the CPU in the corresponding slot does not create
> >>>> memory pool.
> >>>>>> That results in frequent memory subsequently creating/destructing.
> >>>>>>
> >>>>>> When testpmd started, the parameter (e.g. -l 40-79) is set.
> >>>>>> Different OS has different topology. Some OS like SUSE only
> >>>>>> creates memory pool for one CPU slot, while other system creates for
> two.
> >>>>>> That is why the problem occurs when using memories in different OS.
> >>>>>
> >>>>>
> >>>>> It is testpmd application that decides from which socket to
> >>>>> allocate memory from, right. This is nothing specific to OS.
> >>>>>
> >>>>> As far as I remember, testpmd logic is too allocate from socket
> >>>>> that its cores are used (provided with -l parameter), and allocate
> >>>>> from socket that device is attached to.
> >>>>>
> >>>>> So, in a dual socket system, if all used cores are in socket 1 and
> >>>>> the NIC is in socket 1, no memory is allocated for socket 0. This
> >>>>> is to optimize memory consumption.
> >>>>>
> >>>>>
> >>>>> Can you please confirm that the problem you are observing is
> >>>>> because interrupt handler is running on a CPU, which doesn't have
> >>>>> memory allocated for its socket?
> >>>>>
> >>>>> In this case what I don't understand is why interrupts is not
> >>>>> running on main lcore, which should be first core in the list, for "-l 
> >>>>> 40-
> 79"
> >>>>> sample it should be lcore 40.
> >>>>> For your case, is interrupt handler run on core 0? Or any arbitrary 
> >>>>> core?
> >>>>> If so, can you please confirm when you provide core list as "-l 0,40-79"
> >>>>> fixes the issue?
> >>>>>
> >>> First of all, sorry to reply to you so late.
> >>> I can confirm that the problem I observed is because  interrupt
> >>> handler is running on a CPU, which doesn't have memory allocated for
> >>> its
> >> socket.
> >>>
> >>> In my case, interrupt handler is running on core 0.
> >>> I tried providing "-l 0,40-79" as a startup parameter, this issue
> >>> can be
> >> resolved.
> >>>
> >>> I corrected the previous statement that this problem does  only
> >>> occur on the SUSE system. In any OS, this problem occurs as long as
> >>> the range of startup parameters is only on node1.
> >>>
> >>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> And what about using 'rte_malloc_socket()' API (instead of
> >>>>>>> rte_malloc), which gets 'socket' as parameter, and provide the
> >>>>>>> socket that devices is on as parameter to this API? Is it
> >>>>>>> possible to test
> >>>> this?
> >>>>>>>
> >>>>>>>
> >>>>>> As to the reason for not using rte_malloc_socket. I thought
> >>>>>> rte_malloc_socket() could solve the problem too. And the
> >>>>>> appropriate parameter should be the socket_id that created the
> >>>>>> memory pool for DPDK initialization. Assuming that> the socket_id
> >>>>>> of the initially allocated memory = 1, first let the
> >>>>> eal_intr_thread
> >>>>>> determine if it is on the socket_id, then record this socket_id
> >>>>>> in the eal_intr_thread and pass it to the iavf_event_thread.  But
> >>>>>> there seems no way to link this parameter to the
> >>>>>> iavf_dev_event_post()
> >>>> function. That is why rte_malloc_socket is not used.
> >>>>>>
> >>>>>
> >>>>> I was thinking socket id of device can be used, but that won't
> >>>>> help if the core that interrupt handler runs is in different socket.
> >>>>> And I also don't know if there is a way to get socket that
> >>>>> interrupt thread is on. @David may help perhaps.
> >>>>>
> >>>>> So question is why interrupt thread is not running on main lcore.
> >>>>>
> >>>>
> >>>> OK after some talk with David, what I am missing is
> >> 'rte_ctrl_thread_create()'
> >>>> does NOT run on main lcore, it can run on any core except data
> >>>> plane
> >> cores.
> >>>>
> >>>> Driver "iavf-event-thread" thread (iavf_dev_event_handle()) and
> >>>> interrupt thread (so driver interrupt callback
> >>>> iavf_dev_event_post()) can run on any core, making it hard to manage.
> >>>> And it seems it is not possible to control where interrupt thread to run.
> >>>>
> >>>> One option can be allocating hugepages for all sockets, but this
> >>>> requires user involvement, and can't happen transparently.
> >>>>
> >>>> Other option can be to control where "iavf-event-thread" run, like
> >>>> using 'rte_thread_create()' to create thread and provide attribute
> >>>> to run it on main lcore (rte_lcore_cpuset(rte_get_main_lcore()))?
> >>>>
> >>>> Can you please test above option?
> >>>>
> >>>>
> >>> The first option can solve this issue. but to borrow from your
> >>> previous saying, "in a dual socket system, if all used cores are in
> >>> socket 1 and the NIC is in socket 1,  no memory is allocated for socket 0.
> >> This is to optimize memory consumption."
> >>> I think it's unreasonable to do so.
> >>>
> >>> About other option. In " rte_eal_intr_init" function, After the
> >>> thread is created, I set the thread affinity for eal-intr-thread,
> >>> but it does not solve
> >> this issue.
> >>
> >> Hi Kaisen,
> >>
> >> There are two threads involved,
> >>
> >> First one is interrupt thread, "eal-intr-thread", created by
> 'rte_eal_intr_init()'.
> >>
> >> Second one is iavf event handler, "iavf-event-thread", created by
> >> 'iavf_dev_event_handler_init()'.
> >>
> >> First one triggered by interrupt and puts a message to a list, second
> >> one consumes from the list and processes the message.
> >> So I assume two thread being in different sockets, or memory being
> >> allocated in a different socket than the cores running causes the
> >> performance issue.
> >>
> >> Did you test the second thread, "iavf-event-thread", affiliated to main
> core?
> >> (by creating thread using 'rte_thread_create()' API)
> >>
> >>
> > I tried to use ''rte_thread_create() 'API creates the second thread,
> > but this issue still exists.
> >
> > Because malloc is executed by "eal_intr_thread", it has nothing to do
> > with "iavf_event_thread".
> >
> 
> Since 'iavf_event_element' (pos which is allocated by malloc()  accessed and
> freed in "iavf-event-thread", it could be related. But if it doesn't fix that 
> is OK,
> thanks for testing.
> 
> > But I found a patch similar to my issue:
> > https://patchwork.dpdk.org/project/dpdk/patch/20221221104858.296530-
> 1-
> > david.march...@redhat.com/ According to the patch modification, this
> > issue can be solved.
> >
> 
> I guess that patch inspired from this discussion, and if it fixes the issue, I
> prefer that one as generic solution.

+1 

That looks like the feature that DPDK should have.

Reply via email to