Hi Oliver, Thank you for the fast response and it would be great to open a discussion on that. In general our project can leverage your optimization and I think it is great (we should have thought about it) . We can use it using the workaround I described. However, for me it seems odd that rte_pktmbuf_attach () that does not *change* anything in m_const, except of the *atomic* ref counter does not work in parallel. The example I gave is a classic use case of rte_pktmbuf_attach (multicast ) and I don't see why it wouldn't work after your optimization.
Do you have a pointer to the documentation that state that that you can't call the atomic ref counter from more than one thread? Thanks, Hanoh -----Original Message----- From: Olivier MATZ [mailto:olivier.m...@6wind.com] Sent: Tuesday, January 05, 2016 12:58 PM To: Hanoch Haim (hhaim); bruce.richardson at intel.com Cc: dev at dpdk.org; Ido Barnea (ibarnea); Itay Marom (imarom) Subject: Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update Hi Hanoch, On 01/04/2016 03:43 PM, Hanoch Haim (hhaim) wrote: > Hi Oliver, > > Let's take your drawing as a reference and add my question The use > case is sending a duplicate multicast packet by many threads. > I can split it to x threads to do the job and with atomic-ref (my multicast > not mbuf) count it until it reaches zero. > > In my following example the two cores (0 and 1) sending the indirect > m1/m2 do alloc/attach/send > > core0 | core1 > --------------------------------- > |--------------------------------------- > m_const=rte_pktmbuf_alloc(mp) | > | > while true: | while True: > m1 =rte_pktmbuf_alloc(mp_64) | m2 =rte_pktmbuf_alloc(mp_64) > rte_pktmbuf_attach(m1, m_const) | rte_pktmbuf_attach(m1, > m_const) > tx_burst(m1) | tx_burst(m2) > > Is this example is not valid? For me, m_const is not expected to be used concurrently on several cores. By "used", I mean calling a function that modifies the mbuf, which is the case for rte_pktmbuf_attach(). > BTW this is our workaround > > > core0 | core1 > --------------------------------- > |--------------------------------------- > m_const=rte_pktmbuf_alloc(mp) | > rte_mbuf_refcnt_update(m_const,1)| <<-- workaround > | > while true: | while True: > m1 =rte_pktmbuf_alloc(mp_64) | m2 =rte_pktmbuf_alloc(mp_64) > rte_pktmbuf_attach(m1, m_const) | rte_pktmbuf_attach(m1, m_const) > tx_burst(m1) | tx_burst(m2) This workaround indeed solves the issue. Another solution would be to protect the call to attach() with a lock, or call all the rte_pktmbuf_attach() on the same core. I'm open to discuss this behavior for rte_pktmbuf_attach() function (should concurrent calls be allowed or not). In any case, we may want to better document it in the doxygen API comments. Regards, Olivier