Hi Stanislaw,

W dniu 26.04.2021 o 18:33, Stanislaw Kardach pisze:
> While working on RISC-V port I have encountered a situation where worker
> threads get stuck in the rte_distributor_return_pkt() function in the
> burst test.
> After investigation some of the threads enter this function with
> flag RTE_DISTRIB_GET_BUF set in the d->retptr64[0]. At the same time
> main thread has already passed rte_distributor_process() so nobody will
> clear this flag and hence workers can't return.
>
> What I've noticed is that adding a flush just after the last _process(),
> similarly to how quit_workers() function is written in the
> test_distributor.c fixes the issue.
> Additionally the issue disappears when I remove the rdtsc delay code
> inside the rte_distributor_request_pkt().
> However I can't get this to reproduce on x86 (even with SIMD forced
> off) and with artificial delays, which is why I wonder whether I'm not
> actually hiding some other issue.
I was able to reproduce the issue on x86 arch using VM with 32 emulated 
CPU cores.
I guess it would be enough to just have more than 8 worker threads, so 
not all of them would be awaken.
>
> Looking at the implementation of the distributor, it is based on
> __atomic_* builtins and the only platform related bit in the fast-path
> is the rte_rdtsc() and rte_pause(). There may be some issues in the
> toolchain (I've tried so far with the Ubuntu one - 10.2.0-8ubuntu1).
> I should add that all unit tests for distributor are passing so either
> there's some coverage corner case or the implementation works on RISC-V.
> As for RDTSC I'm using a sleep-stable time counter with 1MHz frequency
> and switching to high resolution cycle counter also removes the issue
> but that's the same as removing the rdtsc delay as mentioned above.
>
> I'd love to hear from You if this fix makes any sense.
Yes your patch fixes the issue and is perfectly fine.
>
> While modifying this test, I've also pulled in a fix from
> test_distributor.c which ensures that each thread gets his own wakeup
> packet as it's possible that when sending a burst of packets, they won't
> be spread over all the workers.
>
> Signed-off-by: Stanislaw Kardach <k...@semihalf.com>
> Fixes: 7c3287a10535 ("test/distributor: add performance test for burst mode")
> Cc: david.h...@intel.com
> Cc: l.wojciec...@partner.samsung.com
> Cc: David Marchand <david.march...@redhat.com>
> ---
>   app/test/test_distributor_perf.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/app/test/test_distributor_perf.c 
> b/app/test/test_distributor_perf.c
> index b25f79a34..fdbeae6d2 100644
> --- a/app/test/test_distributor_perf.c
> +++ b/app/test/test_distributor_perf.c
> @@ -188,13 +188,15 @@ quit_workers(struct rte_distributor *d, struct 
> rte_mempool *p)
>       rte_mempool_get_bulk(p, (void *)bufs, num_workers);
>   
>       quit = 1;
> -     for (i = 0; i < num_workers; i++)
> +     for (i = 0; i < num_workers; i++) {
>               bufs[i]->hash.usr = i << 1;
> -     rte_distributor_process(d, bufs, num_workers);
> +             rte_distributor_process(d, &bufs[i], 1);
> +     }
>   
>       rte_mempool_put_bulk(p, (void *)bufs, num_workers);
>   
>       rte_distributor_process(d, NULL, 0);
> +     rte_distributor_flush(d);
>       rte_eal_mp_wait_lcore();
>       quit = 0;
>       worker_idx = 0;
     Tested-by: Lukasz Wojciechowski <l.wojciec...@partner.samsung.com>
     Reviewed-by: Lukasz Wojciechowski <l.wojciec...@partner.samsung.com>

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciec...@partner.samsung.com

Reply via email to