<snip>

> 
> > >
> > > Hi Feifei,
> > >
> > > >
> > > > The variable "wrk_cmd" is a signal to control threads from running
> > > > and stopping. When worker lcores load "wrk_cmd ==
> WRK_CMD_RUN",
> > > > they
> > > start
> > > > running and when worker lcores load "wrk_cmd == WRK_CMD_STOP",
> > > they
> > > > stop.
> > > >
> > > > For the wmb in test_mt1, no storing operations must keep the order
> > > > after storing "wrk_cmd". Thus the wmb is unnecessary.
> > >
> > > I think there is a bug in my original code, we should do smp_wmb()
> > > *before* setting wrk_cmd, not after:
> > >
> > >         /* launch on all workers */
> > >         RTE_LCORE_FOREACH_WORKER(lc) {
> > >                 arg[lc].rng = r;
> > >                 arg[lc].stats = init_stat;
> > >                 rte_eal_remote_launch(test, &arg[lc], lc);
> > >         }
> > >
> > >         /* signal worker to start test */
> > > +      rte_smp_wmb();
> > >         wrk_cmd = WRK_CMD_RUN;
> > > -       rte_smp_wmb();
> > >
> > >         usleep(run_time * US_PER_S);
> > >
> > >
> > > I still think we'd better have some synchronisation here.
> > > Otherwise what would prevent compiler and/or cpu to update wrk_cmd
> > > out of order (before _init_ phase is completed)?
> > > We probably can safely assume no reordering from the compiler here,
> > > as we have function calls straight before and after 'wrk_cmd =
> WRK_CMD_RUN;'
> > > But for consistency and easier maintenance, I still think it is
> > > better to have something here, after all it is not performance critical 
> > > pass.
> > Agree that this is not performance critical.
> >
> > This is more about correctness (as usually people refer to code to
> > understand the concepts). You can refer to video [1]. Essentially, the
> > pthread_create has 'happens-before' behavior. i.e. all the memory
> > operations before the pthread_create are visible to the new thread.
> > The
> > rte_smp_rmb() barrier in the thread function is not required as it reads the
> data that was set before the thread was launched.
> 
> rte_eal_remote_launch() doesn't call pthread_create().
> All it does -  updates global variable (lcore_config) and writes/reads to/from
> the pipe.
> 
Thanks for the reminder ☹
I think rte_eal_remote_launch and rte_eal_wait_lcore need to provide behavior 
similar to pthread_launch and pthread_join respectively.

There is use of rte_smp_*mb in those functions as well. Those need to be fixed 
first and then look at these.

> >
> > I do not know why rte_smp_wmb is required. The update to 'wrk_cmd' is
> > seen by the thread eventually. rte_smp_wmb does not result in update
> being seen sooner/immediately.
> 
> We don't need it sooner.
> We need to make sure it wouldn't be seen by any worker thread before all
> workers are launched.
> To make sure all workers start the test at approximately same moment.
> That's why I think wmb() should be before 'wrk_cmd = WRK_CMD_RUN;' in
> my original code.
> 
> >
> > [1]
> >
> https://www.youtube.com/watch?t=4170&v=KeLBd2EJLOU&feature=youtu.
> be
> > >
> > > > For the rmb in test_worker, the parameters have been prepared when
> > > > worker lcores call "test_worker". It is unnessary to wait wrk_cmd
> > > > to be loaded, then the parameters can be loaded, So the rmb can be
> > > removed.
> > >
> > > It is not only about parameters loading,  it is to prevent worker
> > > core to start too early.
> > Because 'pthread_launch' provides the 'happens-before' behavior, the
> > worker core will see the updates that happened before the worker was
> launched.
> >
> > I suggest changing the commit log to provide the reasoning around
> pthread_create.
> >
> > >
> > > As I understand, your goal is to get rid of rte_smp_*() calls.
> > > Might be better to replace such places here with _atomic_ semantics.
> > > Then, as I can see, we also can get rid of 'volatile' fo wrk_cmd.
> > >
> > > > In the meanwhile, fix a typo. The note above storing "stop" into
> > > > "wrk_cmd" should be "stop test" rather than "start test".
> > > >
> > > > Signed-off-by: Feifei Wang <feifei.wa...@arm.com>
> > > > Reviewed-by: Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>
> > > > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > > > ---
> > > >  app/test/test_ring_stress_impl.h | 5 +----
> > > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > > >
> > > > diff --git a/app/test/test_ring_stress_impl.h
> > > > b/app/test/test_ring_stress_impl.h
> > > > index f9ca63b90..384555ef9 100644
> > > > --- a/app/test/test_ring_stress_impl.h
> > > > +++ b/app/test/test_ring_stress_impl.h
> > > > @@ -198,7 +198,6 @@ test_worker(void *arg, const char *fname,
> > > > int32_t
> > > prcs)
> > > >         fill_ring_elm(&loc_elm, lc);
> > > >
> > > >         while (wrk_cmd != WRK_CMD_RUN) {
> > > > -               rte_smp_rmb();
> > > >                 rte_pause();
> > > >         }
> > > >
> > > > @@ -357,13 +356,11 @@ test_mt1(int (*test)(void *))
> > > >
> > > >         /* signal worker to start test */
> > > >         wrk_cmd = WRK_CMD_RUN;
> > > > -       rte_smp_wmb();
> > > >
> > > >         usleep(run_time * US_PER_S);
> > > >
> > > > -       /* signal worker to start test */
> > > > +       /* signal worker to stop test */
> > > >         wrk_cmd = WRK_CMD_STOP;
> > > > -       rte_smp_wmb();
> > > >
> > > >         /* wait for workers and collect stats. */
> > > >         mc = rte_lcore_id();
> > > > --
> > > > 2.17.1

Reply via email to