David Marchand <david.march...@redhat.com> writes:

> On Wed, Nov 27, 2019 at 3:16 PM Van Haaren, Harry
> <harry.van.haa...@intel.com> wrote:
>>
>> > -----Original Message-----
>> > From: Aaron Conole <acon...@redhat.com>
>> > Sent: Wednesday, November 27, 2019 2:10 PM
>> > To: Van Haaren, Harry <harry.van.haa...@intel.com>
>> > Cc: dev@dpdk.org
>> > Subject: Re: [PATCH] test/service: fix wait for service core
>> >
>> > Harry van Haaren <harry.van.haa...@intel.com> writes:
>> >
>> > > This commit fixes a sporadic failure of the service_autotest
>> > > unit test, as seen in the DPDK CI. The failure occurs as the main test
>> > > thread did not wait on the service-thread to return, allowing it
>> > > to read a flag before the service was able to write to it.
>> > >
>> > > The fix changes the wait API call to specify the service-core ID,
>> > > and this waits for cores with both ROLE_RTE and ROLE_SERVICE.
>> > >
>> > > The rte_eal_mp_wait_lcore() call does not (and should not) wait
>> > > for service cores, so must not be used to wait on service-cores.
>> > >
>> > > Fixes: f038a81e1c56 ("service: add unit tests")
>> > >
>> > > Reported-by: Aaron Conole <acon...@redhat.com>
>> > > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
>> > >
>> > > ---
>> >
>> > It might also be good to document this behavior in the API area. It's
>> > unclear that the lcore wait function which takes a core id will work,
>> > but the broad wait will not.
>>
>> Yes agreed that docs can improve here - different patch.
>>
>>
>> > > Given this is a fix in the unit test, and not a functional change
>> > > I'm not sure it's worth backporting to LTS / stable releases?
>> > > I've not added stable on CC yet.
>> >
>> > I think it's worth it if the LTS / stable branches use the unit tests
>> > (otherwise, they will observe sporadic failures).
>>
>> Ok, I've added sta...@dpdk.org on CC now
>>
>>
>> > >  app/test/test_service_cores.c | 2 +-
>> > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > >
>> > > diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
>> > > index 9fe38f5e0..a922c7ddc 100644
>> > > --- a/app/test/test_service_cores.c
>> > > +++ b/app/test/test_service_cores.c
>> > > @@ -483,7 +483,7 @@ service_lcore_en_dis_able(void)
>> > >      int ret = rte_eal_remote_launch(service_remote_launch_func, NULL,
>> > >                      slcore_id);
>> > >      TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed.");
>> > > -    rte_eal_mp_wait_lcore();
>> > > +    rte_eal_wait_lcore(slcore_id);
>> > >      TEST_ASSERT_EQUAL(1, service_remote_launch_flag,
>> > >              "Ex-service core function call had no effect.");
>> >
>> > Should we also have some change like the following (just a guess):
>> >
>> > diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
>> > index 9fe38f5e08..695c35ac6c 100644
>> > --- a/app/test/test_service_cores.c
>> > +++ b/app/test/test_service_cores.c
>> > @@ -773,7 +773,7 @@ service_app_lcore_poll_impl(const int mt_safe)
>> >
>> >      /* flag done, then wait for the spawned 2nd core to return */
>> >      params[0] = 1;
>> > -    rte_eal_mp_wait_lcore();
>> > +    rte_eal_wait_lcore(app_core2);
>> >
>> >      /* core two gets launched first - and should hold the service lock */
>> >      TEST_ASSERT_EQUAL(0, app_core2_ret,
>>
>>
>> I reviewed this usage of the function, and I believe it waits on application
>> cores (aka, ROLE_RTE, not ROLE_SERVICE). Hence this usage is actually
>> correct.
>> Please review and double check my logic though - more eyes is good.
>>
>
> I will check it later tonight but I am for taking this in 19.11 if we
> can get more stable tests.
> Aaron, do you have an objection?

No objection
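
For anyone reading the thread later, the race described in the commit message can be sketched outside DPDK. The block below is only an analogy using POSIX threads, not the DPDK test code: launch_flag, remote_launch_func and launch_and_wait_for_worker are hypothetical names standing in for service_remote_launch_flag, service_remote_launch_func and the body of the test, and pthread_join() plays the role of rte_eal_wait_lcore(slcore_id).

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for service_remote_launch_flag in the test. */
static atomic_int launch_flag;

/* Hypothetical stand-in for the function launched on the ex-service core. */
static void *remote_launch_func(void *arg)
{
    (void)arg;
    atomic_store(&launch_flag, 1);
    return NULL;
}

/*
 * Launch one worker, wait for that specific worker, then read the flag.
 * Returns the flag value, or -1 if the thread could not be created or joined.
 */
int launch_and_wait_for_worker(void)
{
    pthread_t worker;

    atomic_store(&launch_flag, 0);
    if (pthread_create(&worker, NULL, remote_launch_func, NULL) != 0)
        return -1;

    /*
     * Joining this particular worker is the analogue of waiting on the
     * specific core id: the read below cannot run before the worker has
     * finished writing the flag.
     */
    if (pthread_join(worker, NULL) != 0)
        return -1;

    return atomic_load(&launch_flag);
}
```

With the join in place, launch_and_wait_for_worker() returns 1. If the wait instead skipped this worker, as the commit message says rte_eal_mp_wait_lcore() does for service cores, the main thread could read the flag before the worker wrote it, which matches the sporadic failure reported here.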