>This patch fixes RTE SMP barrier bugs for the perf test of eventdev.
>
>For the "perf_process_last_stage" function, wmb after storing
>processed_pkts should be moved before it. This is because the worker
>lcore should ensure it has really finished data processing, e.g. event
>stored into buffers, before the shared variables "w->processed_pkts" are
>stored.
>
>For the "perf_process_last_stage_latency", on the one hand, the wmb
>should be moved before storing into "w->processed_pkts". The reason is
>the same as above. But on the other hand, for "w->latency", wmb is
>unnecessary due to data dependency.
>
>Fixes: 2369f73329f8 ("app/testeventdev: add perf queue worker functions")
>Cc: jer...@marvell.com
>Cc: sta...@dpdk.org
>
>Signed-off-by: Feifei Wang <feifei.wa...@arm.com>
>Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>

Acked-by: Pavan Nikhilesh <pbhagavat...@marvell.com>

>---
> app/test-eventdev/test_perf_common.h | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
>diff --git a/app/test-eventdev/test_perf_common.h b/app/test-eventdev/test_perf_common.h
>index ff9705df8..e7233e5a5 100644
>--- a/app/test-eventdev/test_perf_common.h
>+++ b/app/test-eventdev/test_perf_common.h
>@@ -97,8 +97,13 @@ perf_process_last_stage(struct rte_mempool *const pool,
> 		void *bufs[], int const buf_sz, uint8_t count)
> {
> 	bufs[count++] = ev->event_ptr;
>-	w->processed_pkts++;
>+
>+	/* wmb here ensures event_prt is stored before
>+	 * updating the number of processed packets
>+	 * for worker lcores
>+	 */
> 	rte_smp_wmb();
>+	w->processed_pkts++;
>
> 	if (unlikely(count == buf_sz)) {
> 		count = 0;
>@@ -116,6 +121,12 @@ perf_process_last_stage_latency(struct rte_mempool *const pool,
> 	struct perf_elt *const m = ev->event_ptr;
>
> 	bufs[count++] = ev->event_ptr;
>+
>+	/* wmb here ensures event_prt is stored before
>+	 * updating the number of processed packets
>+	 * for worker lcores
>+	 */
>+	rte_smp_wmb();
> 	w->processed_pkts++;
>
> 	if (unlikely(count == buf_sz)) {
>@@ -127,7 +138,6 @@ perf_process_last_stage_latency(struct rte_mempool *const pool,
> 	}
>
> 	w->latency += latency;
>-	rte_smp_wmb();
> 	return count;
> }
>
>--
>2.25.1
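
For anyone following along, the ordering argued for in the commit message (store the event pointer first, then make the counter update visible) can be sketched outside DPDK roughly as below. This is only an illustration under stated assumptions: `struct worker_sketch` and `last_stage_sketch()` are hypothetical names, not part of the patch; a C11 release fence stands in for rte_smp_wmb(); and the counter is declared atomic so the standalone sketch is well defined, whereas the patch uses a plain increment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-worker state; not the DPDK struct. */
struct worker_sketch {
	_Atomic uint64_t processed_pkts; /* polled by another lcore */
};

/*
 * Store the event pointer, then publish the counter update.
 * Returns the new fill level of bufs.
 */
static inline size_t
last_stage_sketch(struct worker_sketch *w, void *bufs[], size_t buf_sz,
		  size_t count, void *event_ptr)
{
	assert(count < buf_sz);

	/* Data store: must be visible before the counter changes. */
	bufs[count++] = event_ptr;

	/*
	 * Assumption: a release fence plays the role of rte_smp_wmb() here.
	 * It keeps the store to bufs[] ordered before the counter store.
	 */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&w->processed_pkts, 1, memory_order_relaxed);

	return count;
}
```

A reader on another core that observes the incremented counter and then issues a matching acquire fence would also observe the stored pointer; placing the fence after the increment, as in the old code, gives no such guarantee.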