[dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

Hanoch Haim (hhaim) Tue, 5 Jan 2016 11:11:24 +0000

Hi Oliver, 
Thank you for the fast response and it would be great to open a discussion on 
that.
In general our project can leverage your optimization and I think it is great 
(we should have thought about it) . We can use it using the workaround I 
described.
However, for me it  seems odd that  rte_pktmbuf_attach () that does not 
*change* anything in m_const, except of the *atomic* ref counter does not work 
in parallel.
The example I gave is a classic use case of rte_pktmbuf_attach  (multicast ) 
and I don't see why it wouldn't work after your optimization.


Do you have a pointer to the documentation that state that that you can't call 
the atomic ref counter from more than one thread?

Thanks,
Hanoh

-----Original Message-----
From: Olivier MATZ [mailto:[email protected]] 
Sent: Tuesday, January 05, 2016 12:58 PM
To: Hanoch Haim (hhaim); bruce.richardson at intel.com
Cc: dev at dpdk.org; Ido Barnea (ibarnea); Itay Marom (imarom)
Subject: Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

Hi Hanoch,

On 01/04/2016 03:43 PM, Hanoch Haim (hhaim) wrote:
> Hi Oliver,
>
> Let's take your drawing as a reference and add my question The use 
> case is sending a duplicate multicast packet by many threads.
> I can split it to x threads to do the job and with atomic-ref (my multicast 
> not mbuf) count it until it reaches zero.
>
> In my following example the two cores (0 and 1) sending the indirect 
> m1/m2 do alloc/attach/send
>
>      core0                                 |  core1
> ---------------------------------                         
> |---------------------------------------
> m_const=rte_pktmbuf_alloc(mp)             |
>                                                                    |
> while true:                                                 |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)             |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)         |    rte_pktmbuf_attach(m1, 
> m_const)
>    tx_burst(m1)                                           |    tx_burst(m2)
>
> Is this example is not valid?

For me, m_const is not expected to be used concurrently on several cores. By 
"used", I mean calling a function that modifies the mbuf, which is the case for 
rte_pktmbuf_attach().

> BTW this is our workaround
>
>
>    core0                                          |   core1
> ---------------------------------                  
> |---------------------------------------
> m_const=rte_pktmbuf_alloc(mp)      |
> rte_mbuf_refcnt_update(m_const,1)| <<-- workaround
>                                                             |
> while true:                                          |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)      |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)  |    rte_pktmbuf_attach(m1, m_const)
>    tx_burst(m1)                                     |    tx_burst(m2)

This workaround indeed solves the issue. Another solution would be to protect 
the call to attach() with a lock, or call all the
rte_pktmbuf_attach() on the same core.

I'm open to discuss this behavior for rte_pktmbuf_attach() function (should 
concurrent calls be allowed or not). In any case, we may want to better 
document it in the doxygen API comments.


Regards,
Olivier

[dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

Reply via email to