Harry van Haaren <harry.van.haa...@intel.com> writes:

> This commit fixes a sporadic failure of the service_autotest
> unit test, as seen in the DPDK CI. The failure occurs as the main test
> thread did not wait on the service-thread to return, and allowing it
> to read a flag before the service was able to write to it.
>
> The fix changes the wait API call to specific the service-core ID,
> and this waits for cores with both ROLE_RTE and ROLE_SERVICE.
>
> The rte_eal_mp_wait_lcore() call does not (and should not) wait
> for service cores, so must not be used to wait on service-cores.
>
> Fixes: f038a81e1c56 ("service: add unit tests")
>
> Reported-by: Aaron Conole <acon...@redhat.com>
> Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
>
> ---

It might also be good to document this behavior in the API area. It
isn't obvious that the lcore wait function which takes a core ID will
wait for a service core, while the broad wait will not.

> Given this is a fix in the unit test, and not a functional change
> I'm not sure its worth backporting to LTS / stable releases?
> I've not added stable on CC yet.

I think it's worth it if the LTS / stable branches use the unit tests
(otherwise, they will observe sporadic failures).

> ---
>  app/test/test_service_cores.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
> index 9fe38f5e0..a922c7ddc 100644
> --- a/app/test/test_service_cores.c
> +++ b/app/test/test_service_cores.c
> @@ -483,7 +483,7 @@ service_lcore_en_dis_able(void)
>  	int ret = rte_eal_remote_launch(service_remote_launch_func, NULL,
>  			slcore_id);
>  	TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed.");
> -	rte_eal_mp_wait_lcore();
> +	rte_eal_wait_lcore(slcore_id);
>  	TEST_ASSERT_EQUAL(1, service_remote_launch_flag,
>  			"Ex-service core function call had no effect.");

Should we also have some change like the following (just a guess):

diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index 9fe38f5e08..695c35ac6c 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -773,7 +773,7 @@ service_app_lcore_poll_impl(const int mt_safe)
 
 	/* flag done, then wait for the spawned 2nd core to return */
 	params[0] = 1;
-	rte_eal_mp_wait_lcore();
+	rte_eal_wait_lcore(app_core2);
 
 	/* core two gets launched first - and should hold the service lock */
 	TEST_ASSERT_EQUAL(0, app_core2_ret,
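
For anyone following the thread without the test source at hand, the race
described in the commit message can be illustrated outside DPDK. The sketch
below is only an analogy using POSIX threads, not the EAL API: worker_func,
launch_flag and launch_and_wait_for_worker are hypothetical names standing in
for service_remote_launch_func, service_remote_launch_flag and the test body,
and pthread_join() plays the role that rte_eal_wait_lcore(slcore_id) plays in
the patch.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for service_remote_launch_flag in the unit test. */
static atomic_int launch_flag;

/* Hypothetical stand-in for service_remote_launch_func(): the only effect
 * the caller can observe is that the flag becomes 1. */
static void *worker_func(void *arg)
{
	(void)arg;
	atomic_store(&launch_flag, 1);
	return NULL;
}

/* Launch one worker, wait for that specific worker, then read the flag.
 * Returns the flag value seen by the caller, or -1 on a thread error. */
int launch_and_wait_for_worker(void)
{
	pthread_t worker;

	atomic_store(&launch_flag, 0);
	if (pthread_create(&worker, NULL, worker_func, NULL) != 0)
		return -1;

	/* Block until this particular thread has returned. Without this
	 * join, the load below could run before the store in worker_func(),
	 * which is the kind of sporadic failure the patch addresses. */
	if (pthread_join(worker, NULL) != 0)
		return -1;

	return atomic_load(&launch_flag);
}
```

Calling launch_and_wait_for_worker() from a main() returns 1 because the
join orders the worker's store before the caller's load. A wait that skips
the worker in question, as the commit message says rte_eal_mp_wait_lcore()
does for service cores, gives no such ordering.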