On Fri, Oct 15, 2021 at 01:07:42PM +0000, Dmitry Kozlyuk wrote:
[...]
> > > +static void
> > > +mempool_event_callback_invoke(enum rte_mempool_event event,
> > > +                           struct rte_mempool *mp)
> > > +{
> > > +     struct mempool_callback_list *list;
> > > +     struct rte_tailq_entry *te;
> > > +     void *tmp_te;
> > > +
> > > +     rte_mcfg_tailq_read_lock();
> > > +     list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> > > +     RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> > > +             struct mempool_callback *cb = te->data;
> > > +             rte_mcfg_tailq_read_unlock();
> > > +             cb->func(event, mp, cb->user_data);
> > > +             rte_mcfg_tailq_read_lock();
> > 
> > I think it is dangerous to unlock the list before invoking the callback.
> > During that time, another thread can remove the next mempool callback, and
> > the next iteration will access a freed element, causing undefined
> > behavior.
> > 
> > Is it a problem to keep the lock held during the callback invocation?
> > 
> > I see that you have a test for this, and that you wrote a comment in the
> > documentation:
> > 
> >  * rte_mempool_event_callback_register() may be called from within the
> > callback,
> >  * but the callbacks registered this way will not be invoked for the same
> > event.
> >  * rte_mempool_event_callback_unregister() may only be safely called
> >  * to remove the running callback.
> > 
> > But is there a use-case for this?
> > If not, I'd tend to say that we can document that it is not allowed to
> > create, free or list mempools, or to register callbacks, from within a callback.
> 
> There is no use-case, but I'd argue for releasing the lock.
> This lock is taken by the rte_xxx_create() functions of many libraries,
> so the restriction is much wider than mempool and, worse, its scope is not strictly bounded.

Yes... I honestly don't understand why every library uses the same
rte_mcfg_tailq lock if the only code that accesses the list is in the
library itself. Maybe I'm missing something.

I have the impression that having a single, mempool-specific lock for all
mempool usages would be simpler and more targeted. It would allow keeping the
lock held while invoking the callbacks without blocking accesses to the other
libraries, and would also solve the problem described below.

> > [...]
> > > +int
> > > +rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
> > > +                                   void *user_data)
> > > +{
> > > +     struct mempool_callback_list *list;
> > > +     struct rte_tailq_entry *te = NULL;
> > > +     struct mempool_callback *cb;
> > > +     int ret;
> > > +
> > > +     if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +             rte_errno = EPERM;
> > > +             return -1;
> > > +     }
> > 
> > The doc comment of the register function says
> >  * Callbacks will be invoked in the process that creates the mempool.
> 
> BTW, this is another bug, it should be "populates", not "creates".
> 
> > So registration is allowed from a primary or secondary process. Can't a
> > secondary process destroy the callback it has registered?
> > 
> > > +
> > > +     rte_mcfg_mempool_read_lock();
> > > +     rte_mcfg_tailq_write_lock();
> > 
> > I don't understand why there are 2 locks here.
> > 
> > After looking at the code, I think the locking model is already
> > incorrect in the current mempool code:
> > 
> >    rte_mcfg_tailq_write_lock() is used in create and free to protect
> >      access to the mempool tailq
> > 
> >    rte_mcfg_mempool_write_lock() is used in create(), to protect from
> >      concurrent creation (with the same name for instance), but I doubt it
> >      is absolutely needed, because memzone_reserve is already protected.
> > 
> >    rte_mcfg_mempool_read_lock() is used in the dump functions, but to me
> >      they should use rte_mcfg_tailq_read_lock() instead.
> >      Currently, doing a dump and a free concurrently can cause a crash
> >      because they are not using the same lock.
> > 
> > In your case, I suggest using only one lock to protect the callback
> > list. I think it can be rte_mcfg_tailq_*_lock().
> 
> Thanks, I will double-check the locking.

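For the unregister part, with a single lock it could look like this
(again an untested sketch, reusing the names from your patch and the
illustrative mempool_callback_lock above; whether to keep the
primary-only check depends on the secondary-process question raised
earlier):

int
rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
				      void *user_data)
{
	struct mempool_callback_list *list;
	struct rte_tailq_entry *te;
	struct mempool_callback *cb = NULL;

	rte_rwlock_write_lock(&mempool_callback_lock);
	list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
	TAILQ_FOREACH(te, list, next) {
		cb = te->data;
		if (cb->func == func && cb->user_data == user_data) {
			TAILQ_REMOVE(list, te, next);
			break;
		}
	}
	rte_rwlock_write_unlock(&mempool_callback_lock);

	if (te == NULL) {
		rte_errno = ENOENT;
		return -1;
	}
	/* Free outside the lock, once the entry is no longer reachable. */
	rte_free(cb);
	rte_free(te);
	return 0;
}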