<snip> > > > > > > > Hi Feifei, > > > > > > > > > > > The variable "wrk_cmd" is a signal to control threads from running > > > > and stopping. When worker lcores load "wrk_cmd == > WRK_CMD_RUN", > > > > they > > > start > > > > running and when worker lcores load "wrk_cmd == WRK_CMD_STOP", > > > they > > > > stop. > > > > > > > > For the wmb in test_mt1, no storing operations must keep the order > > > > after storing "wrk_cmd". Thus the wmb is unnecessary. > > > > > > I think there is a bug in my original code, we should do smp_wmb() > > > *before* setting wrk_cmd, not after: > > > > > > /* launch on all workers */ > > > RTE_LCORE_FOREACH_WORKER(lc) { > > > arg[lc].rng = r; > > > arg[lc].stats = init_stat; > > > rte_eal_remote_launch(test, &arg[lc], lc); > > > } > > > > > > /* signal worker to start test */ > > > + rte_smp_wmb(); > > > wrk_cmd = WRK_CMD_RUN; > > > - rte_smp_wmb(); > > > > > > usleep(run_time * US_PER_S); > > > > > > > > > I still think we'd better have some synchronisation here. > > > Otherwise what would prevent compiler and/or cpu to update wrk_cmd > > > out of order (before _init_ phase is completed)? > > > We probably can safely assume no reordering from the compiler here, > > > as we have function calls straight before and after 'wrk_cmd = > WRK_CMD_RUN;' > > > But for consistency and easier maintenance, I still think it is > > > better to have something here, after all it is not performance critical > > > pass. > > Agree that this is not performance critical. > > > > This is more about correctness (as usually people refer to code to > > understand the concepts). You can refer to video [1]. Essentially, the > > pthread_create has 'happens-before' behavior. i.e. all the memory > > operations before the pthread_create are visible to the new thread. > > The > > rte_smp_rmb() barrier in the thread function is not required as it reads the > data that was set before the thread was launched. > > rte_eal_remote_launch() doesn't call pthread_create(). > All it does - updates global variable (lcore_config) and writes/reads to/from > the pipe. > Thanks for the reminder ☹ I think rte_eal_remote_launch and rte_eal_wait_lcore need to provide behavior similar to pthread_launch and pthread_join respectively.
There is use of rte_smp_*mb in those functions as well. Those need to be fixed first and then look at these. > > > > I do not know why rte_smp_wmb is required. The update to 'wrk_cmd' is > > seen by the thread eventually. rte_smp_wmb does not result in update > being seen sooner/immediately. > > We don't need it sooner. > We need to make sure it wouldn't be seen by any worker thread before all > workers are launched. > To make sure all workers start the test at approximately same moment. > That's why I think wmb() should be before 'wrk_cmd = WRK_CMD_RUN;' in > my original code. > > > > > [1] > > > https://www.youtube.com/watch?t=4170&v=KeLBd2EJLOU&feature=youtu. > be > > > > > > > For the rmb in test_worker, the parameters have been prepared when > > > > worker lcores call "test_worker". It is unnessary to wait wrk_cmd > > > > to be loaded, then the parameters can be loaded, So the rmb can be > > > removed. > > > > > > It is not only about parameters loading, it is to prevent worker > > > core to start too early. > > Because 'pthread_launch' provides the 'happens-before' behavior, the > > worker core will see the updates that happened before the worker was > launched. > > > > I suggest changing the commit log to provide the reasoning around > pthread_create. > > > > > > > > As I understand, your goal is to get rid of rte_smp_*() calls. > > > Might be better to replace such places here with _atomic_ semantics. > > > Then, as I can see, we also can get rid of 'volatile' fo wrk_cmd. > > > > > > > In the meanwhile, fix a typo. The note above storing "stop" into > > > > "wrk_cmd" should be "stop test" rather than "start test". > > > > > > > > Signed-off-by: Feifei Wang <feifei.wa...@arm.com> > > > > Reviewed-by: Honnappa Nagarahalli > <honnappa.nagaraha...@arm.com> > > > > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com> > > > > --- > > > > app/test/test_ring_stress_impl.h | 5 +---- > > > > 1 file changed, 1 insertion(+), 4 deletions(-) > > > > > > > > diff --git a/app/test/test_ring_stress_impl.h > > > > b/app/test/test_ring_stress_impl.h > > > > index f9ca63b90..384555ef9 100644 > > > > --- a/app/test/test_ring_stress_impl.h > > > > +++ b/app/test/test_ring_stress_impl.h > > > > @@ -198,7 +198,6 @@ test_worker(void *arg, const char *fname, > > > > int32_t > > > prcs) > > > > fill_ring_elm(&loc_elm, lc); > > > > > > > > while (wrk_cmd != WRK_CMD_RUN) { > > > > - rte_smp_rmb(); > > > > rte_pause(); > > > > } > > > > > > > > @@ -357,13 +356,11 @@ test_mt1(int (*test)(void *)) > > > > > > > > /* signal worker to start test */ > > > > wrk_cmd = WRK_CMD_RUN; > > > > - rte_smp_wmb(); > > > > > > > > usleep(run_time * US_PER_S); > > > > > > > > - /* signal worker to start test */ > > > > + /* signal worker to stop test */ > > > > wrk_cmd = WRK_CMD_STOP; > > > > - rte_smp_wmb(); > > > > > > > > /* wait for workers and collect stats. */ > > > > mc = rte_lcore_id(); > > > > -- > > > > 2.17.1