On 06/15/2016 03:29 PM, Bruce Richardson wrote: > On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote: >> Hi Ivan, >> >>> -----Original Message----- >>> From: Ivan Boule [mailto:ivan.boule at 6wind.com] >>> Sent: Wednesday, June 15, 2016 1:15 PM >>> To: Thomas Monjalon; Ananyev, Konstantin >>> Cc: Pattan, Reshma; dev at dpdk.org >>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx >>> callback lists >>> >>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote: >>> >>>>> >>>>>> I think the read access would need locking but we do not want it >>>>>> in fast path. >>>>> >>>>> I don't think it would be needed. >>>>> As I said - read/write interaction didn't change from what we have right >>>>> now. >>>>> But if you have some particular scenario in mind that you believe would >>>>> cause >>>>> a race condition - please speak up. >>>> >>>> If we add/remove a callback during a burst? Is it possible that the next >>>> pointer would have a wrong value leading to a crash? >>>> Maybe we need a comment to state that we should not alter burst >>>> callbacks while running burst functions. >>>> >>> >>> Hi Reshma, >>> >>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the >>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list >>> of RX callbacks associated with the polled RX queue to safely remove RX >>> callback(s) in parallel. >>> The problem is not [only] with the setting and the loading of "cb->next" >>> that you assume to be atomic operations, which is certainly true on most >>> CPUs. >>> I see the 2 important following issues: >>> >>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed >>> RX callback could still be accessed in the callback parsing loop of the >>> function "rte_eth_rx_burst()" after having been freed in parallel. >>> >>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE >>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()" that >>> does not free the "rte_eth_rxtx_callback" data structure associated with >>> the removed callback ! >> >> Yes, though it is documented behaviour, someone can probably >> refer it as a feature, not a bug ;) >> > > +1 > This is definitely not a bug, this is absolutely by design. One may argue with > the design, but it was done for a definite reason, so as to avoid paying the > penalty of having locks. It pushes more responsibility onto the app, but it > does allow the app to choose the best solution for managing the freeing of > memory for its situation. The alternative is to force all apps to pay the cost > of having locks, even if better options for freeing the memory are available. > > /Bruce >
-1 (not to say 0xFFFFFFFF) This is definitely an API design bug ! I would say that if you don't know how to free a resource that you allocate, it is very likely that you are wrong allocating it. And this is exactly what happens here with RX/TX callback data structures. This problem can easily be addressed by just changing the API as follows: Change void * rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id, rte_rx_callback_fn fn, void *user_param) to int rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id, struct rte_eth_rxtx_callback *cb) In addition of solving the problem, this approach makes the API consistent and let the application allocate "rte_eth_rxtx_callback" data structures through any appropriate mean. Regards, Ivan -- Ivan Boule 6WIND Development Engineer