On Fri, 29 Jan 2021 03:17:50 +0000 Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> wrote:
> <snip> > > > > > > > > > > > Hi Feifei, > > > > > > > > > > > > > > The variable "wrk_cmd" is a signal to control threads from running > > > > > and stopping. When worker lcores load "wrk_cmd == > > WRK_CMD_RUN", > > > > > they > > > > start > > > > > running and when worker lcores load "wrk_cmd == WRK_CMD_STOP", > > > > they > > > > > stop. > > > > > > > > > > For the wmb in test_mt1, no storing operations must keep the order > > > > > after storing "wrk_cmd". Thus the wmb is unnecessary. > > > > > > > > I think there is a bug in my original code, we should do smp_wmb() > > > > *before* setting wrk_cmd, not after: > > > > > > > > /* launch on all workers */ > > > > RTE_LCORE_FOREACH_WORKER(lc) { > > > > arg[lc].rng = r; > > > > arg[lc].stats = init_stat; > > > > rte_eal_remote_launch(test, &arg[lc], lc); > > > > } > > > > > > > > /* signal worker to start test */ > > > > + rte_smp_wmb(); > > > > wrk_cmd = WRK_CMD_RUN; > > > > - rte_smp_wmb(); > > > > > > > > usleep(run_time * US_PER_S); > > > > > > > > > > > > I still think we'd better have some synchronisation here. > > > > Otherwise what would prevent compiler and/or cpu to update wrk_cmd > > > > out of order (before _init_ phase is completed)? > > > > We probably can safely assume no reordering from the compiler here, > > > > as we have function calls straight before and after 'wrk_cmd = > > WRK_CMD_RUN;' > > > > But for consistency and easier maintenance, I still think it is > > > > better to have something here, after all it is not performance critical > > > > pass. > > > Agree that this is not performance critical. > > > > > > This is more about correctness (as usually people refer to code to > > > understand the concepts). You can refer to video [1]. Essentially, the > > > pthread_create has 'happens-before' behavior. i.e. all the memory > > > operations before the pthread_create are visible to the new thread. > > > The > > > rte_smp_rmb() barrier in the thread function is not required as it reads > > > the > > data that was set before the thread was launched. > > > > rte_eal_remote_launch() doesn't call pthread_create(). > > All it does - updates global variable (lcore_config) and writes/reads > > to/from > > the pipe. > > > Thanks for the reminder ☹ > I think rte_eal_remote_launch and rte_eal_wait_lcore need to provide behavior > similar to pthread_launch and pthread_join respectively. > > There is use of rte_smp_*mb in those functions as well. Those need to be > fixed first and then look at these. Looks like you want __atomic_thread_fence() here.