On Tue, Jun 30, 2020 at 12:07 PM Olivier Matz <olivier.m...@6wind.com> wrote:
>
> On Fri, Jun 26, 2020 at 04:47:33PM +0200, David Marchand wrote:
> > DPDK allows calling some part of its API from a non-EAL thread but this
> > has some limitations.
> > OVS (and other applications) has its own thread management but still
> > wants to avoid such limitations by hacking RTE_PER_LCORE(_lcore_id) and
> > faking EAL threads potentially unknown to some DPDK components.
> >
> > Introduce a new API to register non-EAL threads and associate them with a
> > free lcore with a new NON_EAL role.
> > This role denotes lcores that do not run the DPDK mainloop and as such
> > prevents use of rte_eal_wait_lcore() and related functions.
> >
> > Multiprocess is not supported as the need for cohabitation with this new
> > feature is unclear at the moment.
> >
> > Signed-off-by: David Marchand <david.march...@redhat.com>
> > Acked-by: Andrew Rybchenko <arybche...@solarflare.com>
> > ---
> > Changes since v2:
> > - refused multiprocess init once rte_thread_register got called, and
> >   vice versa,
> > - added warning on multiprocess in rte_thread_register doxygen,
> >
> > Changes since v1:
> > - moved cleanup on lcore role code in patch 5,
> > - added unit test,
> > - updated documentation,
> > - changed naming from "external thread" to "registered non-EAL thread"
> >
> > ---
> >  MAINTAINERS                               |   1 +
> >  app/test/Makefile                         |   1 +
> >  app/test/autotest_data.py                 |   6 +
> >  app/test/meson.build                      |   2 +
> >  app/test/test_lcores.c                    | 139 ++++++++++++++++++
> >  doc/guides/howto/debug_troubleshoot.rst   |   5 +-
> >  .../prog_guide/env_abstraction_layer.rst  |  22 +--
> >  doc/guides/prog_guide/mempool_lib.rst     |   2 +-
> >  lib/librte_eal/common/eal_common_lcore.c  |  50 ++++++-
> >  lib/librte_eal/common/eal_common_mcfg.c   |  36 +++++
> >  lib/librte_eal/common/eal_common_thread.c |  33 +++++
> >  lib/librte_eal/common/eal_memcfg.h        |  10 ++
> >  lib/librte_eal/common/eal_private.h       |  18 +++
> >  lib/librte_eal/freebsd/eal.c              |   4 +
> >  lib/librte_eal/include/rte_lcore.h        |  25 +++-
> >  lib/librte_eal/linux/eal.c                |   4 +
> >  lib/librte_eal/rte_eal_version.map        |   2 +
> >  lib/librte_mempool/rte_mempool.h          |  11 +-
> >  18 files changed, 349 insertions(+), 22 deletions(-)
> >  create mode 100644 app/test/test_lcores.c
> >
>
> [...]
>
> > diff --git a/app/test/test_lcores.c b/app/test/test_lcores.c
> > new file mode 100644
> > index 0000000000..864bcbade7
> > --- /dev/null
> > +++ b/app/test/test_lcores.c
> > @@ -0,0 +1,139 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2020 Red Hat, Inc.
> > + */
> > +
> > +#include <pthread.h>
> > +#include <string.h>
> > +
> > +#include <rte_lcore.h>
> > +
> > +#include "test.h"
> > +
> > +struct thread_context {
> > +	enum { INIT, ERROR, DONE } state;
> > +	bool lcore_id_any;
> > +	pthread_t id;
> > +	unsigned int *registered_count;
> > +};
> > +static void *thread_loop(void *arg)
> > +{
>
> missing an empty line here
>
> > +	struct thread_context *t = arg;
> > +	unsigned int lcore_id;
> > +
> > +	lcore_id = rte_lcore_id();
> > +	if (lcore_id != LCORE_ID_ANY) {
> > +		printf("Incorrect lcore id for new thread %u\n", lcore_id);
> > +		t->state = ERROR;
> > +	}
> > +	rte_thread_register();
> > +	lcore_id = rte_lcore_id();
> > +	if ((t->lcore_id_any && lcore_id != LCORE_ID_ANY) ||
> > +			(!t->lcore_id_any && lcore_id == LCORE_ID_ANY)) {
> > +		printf("Could not register new thread, got %u while %sexpecting %u\n",
> > +			lcore_id, t->lcore_id_any ? "" : "not ", LCORE_ID_ANY);
> > +		t->state = ERROR;
> > +	}
>
> To check if rte_thread_register() succeeded, we need to look at
> lcore_id. I wonder if rte_thread_register() shouldn't return the lcore
> id on success, and -1 on error (rte_errno could be set to give some
> info on the error).

lcore ids are unsigned integers with the special value LCORE_ID_ANY
mapped to UINT32_MAX (should be UINT_MAX? anyway...).

rte_thread_register could return an error code as there are no ERROR
level logs about why an lcore allocation failed.
We could then distinguish a shortage of lcores (or an init callback
refusal) from an invalid call before rte_eal_init() or when multiprocess
is in use.

About returning the lcore_id as part of the return code, this would map
to -1 for LCORE_ID_ANY.
This is probably not a problem but still odd.

>
> The same could be done for rte_thread_init() ?

Not sure where this one could fail.

>
> [...]
>
> > diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
> > index a7ae0691bf..1cbddc4b5b 100644
> > --- a/lib/librte_eal/common/eal_common_thread.c
> > +++ b/lib/librte_eal/common/eal_common_thread.c
> > @@ -236,3 +236,36 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
> >  	pthread_join(*thread, NULL);
> >  	return -ret;
> >  }
> > +
> > +void
> > +rte_thread_register(void)
> > +{
> > +	unsigned int lcore_id;
> > +	rte_cpuset_t cpuset;
> > +
> > +	/* EAL init flushes all lcores, we can't register before. */
> > +	assert(internal_config.init_complete == 1);
> > +	if (pthread_getaffinity_np(pthread_self(), sizeof(cpuset),
> > +			&cpuset) != 0)
> > +		CPU_ZERO(&cpuset);
> > +	lcore_id = eal_lcore_non_eal_allocate();
> > +	if (lcore_id >= RTE_MAX_LCORE)
> > +		lcore_id = LCORE_ID_ANY;
> > +	rte_thread_init(lcore_id, &cpuset);
> > +	if (lcore_id != LCORE_ID_ANY)
> > +		RTE_LOG(DEBUG, EAL, "Registered non-EAL thread as lcore %u.\n",
> > +			lcore_id);
> > +}
>
> So, in this case, the affinity of the pthread is kept and saved, in other
> words there is no link between the lcore id and the affinity. It means we
> are allowing an application to register lcores for dataplane with conflicting
> affinities.

This is not something new, applications using the --lcores option
already live with this.
We have warnings in the documentation about non-EAL threads and about
the dangers of conflicting affinities.
Hopefully, the users of this API know what they are doing since they
chose not to use EAL threads.

>
> I wonder if it could be useful to have an API that automatically sets
> the affinity according to the lcore_id. Or a function that creates a
> pthread using the specified lcore id, and setting the correct affinity.
> It could simplify the work for applications that want to create/destroy
> dataplane threads dynamically.

Do you mean dynamic creation/deletion of EAL threads?

>
> This could be done later however, just an idea.

For now, I don't see the need.

>
> [...]
> > diff --git a/lib/librte_eal/freebsd/eal.c b/lib/librte_eal/freebsd/eal.c
> > index 13e5de006f..32a3d999b8 100644
> > --- a/lib/librte_eal/freebsd/eal.c
> > +++ b/lib/librte_eal/freebsd/eal.c
> > @@ -424,6 +424,10 @@ rte_config_init(void)
> >  		}
> >  		if (rte_eal_config_reattach() < 0)
> >  			return -1;
> > +		if (!eal_mcfg_enable_multiprocess()) {
> > +			RTE_LOG(ERR, EAL, "Primary process refused secondary attachment\n");
> > +			return -1;
> > +		}
> >  		eal_mcfg_update_internal();
> >  		break;
> >  	case RTE_PROC_AUTO:
> > diff --git a/lib/librte_eal/include/rte_lcore.h b/lib/librte_eal/include/rte_lcore.h
> > index 3968c40693..43747e88df 100644
> > --- a/lib/librte_eal/include/rte_lcore.h
> > +++ b/lib/librte_eal/include/rte_lcore.h
> > @@ -31,6 +31,7 @@ enum rte_lcore_role_t {
> >  	ROLE_RTE,
> >  	ROLE_OFF,
> >  	ROLE_SERVICE,
> > +	ROLE_NON_EAL,
> >  };
>
> I find the name ROLE_NON_EAL a bit heavy (this was also my impression
> when reading the doc part).
>
> I understand that there are several types of threads:
>
> - eal (pthread created by eal): ROLE_RTE and ROLE_SERVICE
> - unregistered (pthread not created by eal, and not registered): ROLE_OFF
>   (note that ROLE_OFF also applies for nonexistent threads)
> - dynamic: pthread not created by eal, but registered

The last two cases are both non-EAL threads as described in the doc so
far.

>
> What about using ROLE_DYN ? I'm not sure about this name either, it's just
> to open the discussion :)
>

Well, at the moment, all those new lcores are mapped only to non-EAL
threads.
A dynamic role feels like you want to take dynamic EAL threads into
account from the start.
I prefer to stick to non-EAL.


-- 
David Marchand
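
To make the return-code option for rte_thread_register() discussed above
concrete, here is a minimal standalone sketch. It does not use DPDK: every
sketch_* name, the SKETCH_* constants, the out-parameter and the choice of
errno values are assumptions made for illustration, not the patch's code or
the final API. It only shows how a caller could tell apart the three failure
cases mentioned in the reply (call before init, multiprocess in use, no free
lcore).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for RTE_MAX_LCORE and LCORE_ID_ANY; values are illustrative. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_LCORE_ID_ANY UINT_MAX

/* Hypothetical state, loosely mirroring what the EAL tracks internally. */
static bool sketch_init_complete;   /* set once "EAL init" has finished */
static bool sketch_multiprocess;    /* set when multiprocess is in use */
static bool sketch_lcore_used[SKETCH_MAX_LCORE];

/*
 * Hypothetical variant of the registration call: returns 0 on success and
 * a negative errno-style code on failure. The allocated id is written to
 * *lcore_id, which stays at SKETCH_LCORE_ID_ANY on failure. (In DPDK the
 * id would more likely be read back with rte_lcore_id().)
 */
int
sketch_thread_register(unsigned int *lcore_id)
{
	unsigned int i;

	assert(lcore_id != NULL);
	*lcore_id = SKETCH_LCORE_ID_ANY;

	if (!sketch_init_complete)
		return -EINVAL;  /* assumed code: called before init */
	if (sketch_multiprocess)
		return -ENOTSUP; /* assumed code: multiprocess in use */

	/* Pick the first free slot, as a simple allocation policy. */
	for (i = 0; i < SKETCH_MAX_LCORE; i++) {
		if (!sketch_lcore_used[i]) {
			sketch_lcore_used[i] = true;
			*lcore_id = i;
			return 0;
		}
	}
	return -ENOMEM;          /* assumed code: no free lcore left */
}
```

With such a signature the unit test would check the return value instead of
comparing rte_lcore_id() against LCORE_ID_ANY, and the odd mapping of
LCORE_ID_ANY to -1 is avoided because the id and the error code travel
separately.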