On Tue, Jun 30, 2020 at 12:07 PM Olivier Matz <olivier.m...@6wind.com> wrote:
>
> On Fri, Jun 26, 2020 at 04:47:33PM +0200, David Marchand wrote:
> > DPDK allows calling some part of its API from a non-EAL thread but this
> > has some limitations.
> > OVS (and other applications) has its own thread management but still
> > wants to avoid such limitations by hacking RTE_PER_LCORE(_lcore_id) and
> > faking EAL threads, potentially unknown to some DPDK components.
> >
> > Introduce a new API to register non-EAL threads and associate them with
> > a free lcore with a new NON_EAL role.
> > This role denotes lcores that do not run the DPDK mainloop and as such
> > prevents use of rte_eal_wait_lcore() and similar functions.
> >
> > Multiprocess is not supported as the need for cohabitation with this new
> > feature is unclear at the moment.
> >
> > Signed-off-by: David Marchand <david.march...@redhat.com>
> > Acked-by: Andrew Rybchenko <arybche...@solarflare.com>
> > ---
> > Changes since v2:
> > - refused multiprocess init once rte_thread_register got called, and
> >   vice versa,
> > - added warning on multiprocess in rte_thread_register doxygen,
> >
> > Changes since v1:
> > - moved cleanup on lcore role code in patch 5,
> > - added unit test,
> > - updated documentation,
> > - changed naming from "external thread" to "registered non-EAL thread"
> >
> > ---
> >  MAINTAINERS                                   |   1 +
> >  app/test/Makefile                             |   1 +
> >  app/test/autotest_data.py                     |   6 +
> >  app/test/meson.build                          |   2 +
> >  app/test/test_lcores.c                        | 139 ++++++++++++++++++
> >  doc/guides/howto/debug_troubleshoot.rst       |   5 +-
> >  .../prog_guide/env_abstraction_layer.rst      |  22 +--
> >  doc/guides/prog_guide/mempool_lib.rst         |   2 +-
> >  lib/librte_eal/common/eal_common_lcore.c      |  50 ++++++-
> >  lib/librte_eal/common/eal_common_mcfg.c       |  36 +++++
> >  lib/librte_eal/common/eal_common_thread.c     |  33 +++++
> >  lib/librte_eal/common/eal_memcfg.h            |  10 ++
> >  lib/librte_eal/common/eal_private.h           |  18 +++
> >  lib/librte_eal/freebsd/eal.c                  |   4 +
> >  lib/librte_eal/include/rte_lcore.h            |  25 +++-
> >  lib/librte_eal/linux/eal.c                    |   4 +
> >  lib/librte_eal/rte_eal_version.map            |   2 +
> >  lib/librte_mempool/rte_mempool.h              |  11 +-
> >  18 files changed, 349 insertions(+), 22 deletions(-)
> >  create mode 100644 app/test/test_lcores.c
> >
>
> [...]
>
> > diff --git a/app/test/test_lcores.c b/app/test/test_lcores.c
> > new file mode 100644
> > index 0000000000..864bcbade7
> > --- /dev/null
> > +++ b/app/test/test_lcores.c
> > @@ -0,0 +1,139 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2020 Red Hat, Inc.
> > + */
> > +
> > +#include <pthread.h>
> > +#include <string.h>
> > +
> > +#include <rte_lcore.h>
> > +
> > +#include "test.h"
> > +
> > +struct thread_context {
> > +     enum { INIT, ERROR, DONE } state;
> > +     bool lcore_id_any;
> > +     pthread_t id;
> > +     unsigned int *registered_count;
> > +};
> > +static void *thread_loop(void *arg)
> > +{
>
> missing an empty line here
>
> > +     struct thread_context *t = arg;
> > +     unsigned int lcore_id;
> > +
> > +     lcore_id = rte_lcore_id();
> > +     if (lcore_id != LCORE_ID_ANY) {
> > +             printf("Incorrect lcore id for new thread %u\n", lcore_id);
> > +             t->state = ERROR;
> > +     }
> > +     rte_thread_register();
> > +     lcore_id = rte_lcore_id();
> > +     if ((t->lcore_id_any && lcore_id != LCORE_ID_ANY) ||
> > +                     (!t->lcore_id_any && lcore_id == LCORE_ID_ANY)) {
> > +             printf("Could not register new thread, got %u while %sexpecting %u\n",
> > +                     lcore_id, t->lcore_id_any ? "" : "not ", LCORE_ID_ANY);
> > +             t->state = ERROR;
> > +     }
>
> To check if rte_thread_register() succeeded, we need to look at
> lcore_id. I wonder if rte_thread_register() shouldn't return the lcore
> id on success, and -1 on error (rte_errno could be set to give some
> info on the error).

lcore ids are unsigned integers, with the special value LCORE_ID_ANY
mapped to UINT32_MAX (should it be UINT_MAX? anyway...).

rte_thread_register() could return an error code, as there are no ERROR
level logs explaining why an lcore allocation failed.
We could then distinguish a shortage of lcores (or an init callback
refusal) from an invalid call, made before rte_eal_init() or while
multiprocess is in use.

About returning the lcore_id as part of the return code, this would
map to -1 for LCORE_ID_ANY.
This is probably not a problem but still odd.
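
To make the two options above concrete, here is a minimal standalone sketch,
not DPDK code. sketch_register(), struct sketch_state, SKETCH_MAX_LCORE and the
particular errno values are all hypothetical and chosen only for illustration.
It returns a distinct negative code per failure cause, and it shows why using
the lcore id itself as the return value looks odd: on common platforms with a
32-bit two's complement int, converting LCORE_ID_ANY (UINT32_MAX) to int gives
-1.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_LCORE 4u             /* hypothetical, stands in for RTE_MAX_LCORE */
#define SKETCH_LCORE_ID_ANY UINT32_MAX  /* mirrors LCORE_ID_ANY as described above */

/* Hypothetical stand-in for the EAL state a real implementation would consult. */
struct sketch_state {
	bool init_complete;   /* rte_eal_init() has finished */
	bool multiprocess;    /* a secondary process is attached */
	unsigned int used;    /* number of lcores already allocated */
};

/*
 * Hypothetical variant of the registration call: returns 0 and stores the
 * allocated lcore id on success, or a negative errno-style code that names
 * the cause of the failure.
 */
static int
sketch_register(struct sketch_state *s, unsigned int *lcore_id)
{
	*lcore_id = SKETCH_LCORE_ID_ANY;
	if (!s->init_complete)
		return -EINVAL;   /* invalid call before init */
	if (s->multiprocess)
		return -ENOTSUP;  /* multiprocess is in use */
	if (s->used >= SKETCH_MAX_LCORE)
		return -ENOMEM;   /* shortage of lcores */
	*lcore_id = s->used++;
	return 0;
}

/* What a caller would see if the lcore id itself were the return code. */
static int
sketch_id_as_return_code(unsigned int lcore_id)
{
	/* Implementation-defined conversion; -1 on typical platforms. */
	return (int)lcore_id;
}
```

With a separate output parameter, the caller can tell the failure causes apart
without having to interpret LCORE_ID_ANY as an error value.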


>
> The same could be done for rte_thread_init()

?
Not sure where this one could fail.


>
> [...]
>
> > diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
> > index a7ae0691bf..1cbddc4b5b 100644
> > --- a/lib/librte_eal/common/eal_common_thread.c
> > +++ b/lib/librte_eal/common/eal_common_thread.c
> > @@ -236,3 +236,36 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
> >       pthread_join(*thread, NULL);
> >       return -ret;
> >  }
> > +
> > +void
> > +rte_thread_register(void)
> > +{
> > +     unsigned int lcore_id;
> > +     rte_cpuset_t cpuset;
> > +
> > +     /* EAL init flushes all lcores, we can't register before. */
> > +     assert(internal_config.init_complete == 1);
> > +     if (pthread_getaffinity_np(pthread_self(), sizeof(cpuset),
> > +                     &cpuset) != 0)
> > +             CPU_ZERO(&cpuset);
> > +     lcore_id = eal_lcore_non_eal_allocate();
> > +     if (lcore_id >= RTE_MAX_LCORE)
> > +             lcore_id = LCORE_ID_ANY;
> > +     rte_thread_init(lcore_id, &cpuset);
> > +     if (lcore_id != LCORE_ID_ANY)
> > +             RTE_LOG(DEBUG, EAL, "Registered non-EAL thread as lcore %u.\n",
> > +                     lcore_id);
> > +}
>
> So, in this case, the affinity of the pthread is kept and saved, in other
> words there is no link between the lcore id and the affinity. It means we
> are allowing an application to register lcores for dataplane with conflicting
> affinities.

This is not something new, applications using --lcores option already
live with this.
We have warnings in the documentation about non-EAL threads and about
the dangers of conflicting affinities.
Hopefully, the users of this API know what they are doing since they
chose not to use EAL threads.
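
As a rough illustration of what "conflicting affinities" means here, an
application that manages its own threads could compare the CPU sets of two
threads before registering them. The helper below is a hypothetical sketch
(sketch_cpusets_overlap() is not part of DPDK) and assumes Linux with glibc,
where the CPU_AND and CPU_COUNT macros are available under _GNU_SOURCE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for the CPU_* macros in <sched.h> */
#endif
#include <sched.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if the two affinity masks share at
 * least one CPU, i.e. the two threads may compete for the same core.
 */
static bool
sketch_cpusets_overlap(const cpu_set_t *a, const cpu_set_t *b)
{
	cpu_set_t both;

	CPU_AND(&both, a, b);          /* both = a & b */
	return CPU_COUNT(&both) > 0;   /* any CPU present in both sets? */
}
```

The masks could come from pthread_getaffinity_np(), as in the quoted patch;
whether an overlap is acceptable remains the application's decision.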


>
> I wonder if it could be useful to have an API that automatically sets
> the affinity according to the lcore_id. Or a function that creates a
> pthread using the specified lcore id, and setting the correct affinity.
> It could simplify the work for applications that want to create/destroy
> dataplane threads dynamically.

Do you mean dynamic creation/deletion of EAL threads?


>
> This could be done later however, just an idea.

For now, I don't see the need.


>
> [...]
> > diff --git a/lib/librte_eal/freebsd/eal.c b/lib/librte_eal/freebsd/eal.c
> > index 13e5de006f..32a3d999b8 100644
> > --- a/lib/librte_eal/freebsd/eal.c
> > +++ b/lib/librte_eal/freebsd/eal.c
> > @@ -424,6 +424,10 @@ rte_config_init(void)
> >               }
> >               if (rte_eal_config_reattach() < 0)
> >                       return -1;
> > +             if (!eal_mcfg_enable_multiprocess()) {
> > +                     RTE_LOG(ERR, EAL, "Primary process refused secondary attachment\n");
> > +                     return -1;
> > +             }
> >               eal_mcfg_update_internal();
> >               break;
> >       case RTE_PROC_AUTO:
> > diff --git a/lib/librte_eal/include/rte_lcore.h b/lib/librte_eal/include/rte_lcore.h
> > index 3968c40693..43747e88df 100644
> > --- a/lib/librte_eal/include/rte_lcore.h
> > +++ b/lib/librte_eal/include/rte_lcore.h
> > @@ -31,6 +31,7 @@ enum rte_lcore_role_t {
> >       ROLE_RTE,
> >       ROLE_OFF,
> >       ROLE_SERVICE,
> > +     ROLE_NON_EAL,
> >  };
>
> I find the name ROLE_NON_EAL a bit heavy (this was also my impression
> when reading the doc part).
>
> I understand that there are several types of threads:
>
> - eal (pthread created by eal): ROLE_RTE and ROLE_SERVICE
> - unregistered (pthread not created by eal, and not registered): ROLE_OFF
>   (note that ROLE_OFF also applies to nonexistent threads)
> - dynamic: pthread not created by eal, but registered

The last two cases are both non-EAL threads, as described in the doc so far.


>
> What about using ROLE_DYN ? I'm not sure about this name either, it's just
> to open the discussion :)
>

Well, at the moment, all those new lcores are mapped only to non-EAL threads.
A dynamic role feels like you want to take dynamic EAL threads into
account from the start.
I prefer to stick to non-EAL.
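
For reference, the classification discussed above can be written down as a
small standalone sketch. The enum is a local stand-in that mirrors the one in
the quoted diff, not the DPDK header, and both helper functions are
hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Local stand-in mirroring the enum in the quoted diff; not the DPDK header. */
enum sketch_lcore_role {
	SKETCH_ROLE_RTE,
	SKETCH_ROLE_OFF,
	SKETCH_ROLE_SERVICE,
	SKETCH_ROLE_NON_EAL,
};

/* True when the lcore is backed by a pthread that EAL itself created. */
static bool
sketch_role_is_eal_thread(enum sketch_lcore_role role)
{
	return role == SKETCH_ROLE_RTE || role == SKETCH_ROLE_SERVICE;
}

/* True when the lcore was allocated to a registered non-EAL thread. */
static bool
sketch_role_is_registered_non_eal(enum sketch_lcore_role role)
{
	return role == SKETCH_ROLE_NON_EAL;
}
```

An lcore with the OFF role falls in neither category: it has no associated
thread, or only an unregistered one.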


-- 
David Marchand
