> -----Original Message----- > From: Richardson, Bruce > Sent: Wednesday, June 15, 2016 3:22 PM > To: Ananyev, Konstantin > Cc: Ivan Boule; Thomas Monjalon; Pattan, Reshma; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx > callback lists > > On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote: > > > > > > > -----Original Message----- > > > From: Ivan Boule [mailto:ivan.boule at 6wind.com] > > > Sent: Wednesday, June 15, 2016 3:07 PM > > > To: Richardson, Bruce; Ananyev, Konstantin > > > Cc: Thomas Monjalon; Pattan, Reshma; dev at dpdk.org > > > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx > > > callback lists > > > > > > On 06/15/2016 03:29 PM, Bruce Richardson wrote: > > > > On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote: > > > >> Hi Ivan, > > > >> > > > >>> -----Original Message----- > > > >>> From: Ivan Boule [mailto:ivan.boule at 6wind.com] > > > >>> Sent: Wednesday, June 15, 2016 1:15 PM > > > >>> To: Thomas Monjalon; Ananyev, Konstantin > > > >>> Cc: Pattan, Reshma; dev at dpdk.org > > > >>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect > > > >>> Rx/Tx callback lists > > > >>> > > > >>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote: > > > >>> > > > >>>>> > > > >>>>>> I think the read access would need locking but we do not want it > > > >>>>>> in fast path. > > > >>>>> > > > >>>>> I don't think it would be needed. > > > >>>>> As I said - read/write interaction didn't change from what we have > > > >>>>> right now. > > > >>>>> But if you have some particular scenario in mind that you believe > > > >>>>> would cause > > > >>>>> a race condition - please speak up. > > > >>>> > > > >>>> If we add/remove a callback during a burst? Is it possible that the > > > >>>> next > > > >>>> pointer would have a wrong value leading to a crash? > > > >>>> Maybe we need a comment to state that we should not alter burst > > > >>>> callbacks while running burst functions. > > > >>>> > > > >>> > > > >>> Hi Reshma, > > > >>> > > > >>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in > > > >>> the > > > >>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" > > > >>> list > > > >>> of RX callbacks associated with the polled RX queue to safely remove > > > >>> RX > > > >>> callback(s) in parallel. > > > >>> The problem is not [only] with the setting and the loading of > > > >>> "cb->next" > > > >>> that you assume to be atomic operations, which is certainly true on > > > >>> most > > > >>> CPUs. > > > >>> I see the 2 important following issues: > > > >>> > > > >>> 1) the "rte_eth_rxtx_callback" data structure associated with a > > > >>> removed > > > >>> RX callback could still be accessed in the callback parsing loop of > > > >>> the > > > >>> function "rte_eth_rx_burst()" after having been freed in parallel. > > > >>> > > > >>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE > > > >>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()" that > > > >>> does not free the "rte_eth_rxtx_callback" data structure associated > > > >>> with > > > >>> the removed callback ! > > > >> > > > >> Yes, though it is documented behaviour, someone can probably > > > >> refer it as a feature, not a bug ;) > > > >> > > > > > > > > +1 > > > > This is definitely not a bug, this is absolutely by design. One may > > > > argue with > > > > the design, but it was done for a definite reason, so as to avoid > > > > paying the > > > > penalty of having locks. It pushes more responsibility onto the app, > > > > but it > > > > does allow the app to choose the best solution for managing the freeing > > > > of > > > > memory for its situation. The alternative is to force all apps to pay > > > > the cost > > > > of having locks, even if better options for freeing the memory are > > > > available. > > > > > > > > /Bruce > > > > > > > > > > -1 (not to say 0xFFFFFFFF) > > > > > > This is definitely an API design bug ! > > > I would say that if you don't know how to free a resource that you > > > allocate, it is very likely that you are wrong allocating it. > > > And this is exactly what happens here with RX/TX callback data structures. > > > This problem can easily be addressed by just changing the API as follows: > > > > > > Change > > > void * > > > rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id, > > > rte_rx_callback_fn fn, void *user_param) > > > > > > to > > > int > > > rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id, > > > struct rte_eth_rxtx_callback *cb) > > > > > > In addition of solving the problem, this approach makes the API > > > consistent and let the application allocate "rte_eth_rxtx_callback" data > > > structures through any appropriate mean. > > > > That might make API a bit cleaner, but I don't see how it fixes the generic > > problem: > > there is still no way to know by the upper layer when it is safe to > > free/re-use > > removed callback, but to make sure that all IO on that queue is stopped > > (I.E. some external synchronisation around the queue). > > Actually, it allows other, more creative solutions, like an app having a > statically allocated set/pool of callback structures that it doesn't need to > allocate or free at all.
I understand that, and as I said I am not against that change. But it doesn't solve the problem in general. Ok, if you have a static object, you wouldn't need to call free() for it, but you still want to re-use it after remove_cb() has finished, right? So there is no actual difference - the main question is: at what point after remove_cb() it is safe to modify it's contents. Konstantin > > /Bruce