On Tue, Mar 10, 2020 at 2:32 PM Harry van Haaren <harry.van.haa...@intel.com> wrote:
>
> This commit releases all service cores from their role,
> returning them to ROLE_RTE on rte_service_finalize().
>
> This may fix an issue relating to the service cores causing
> a race-condition on eal_cleanup(), where the service core
> could still be executing while the main thread has already
> free-d the service memory, leading to a segfault.

Adding rte_service_lcore_reset_all() just tells a (remaining) service
lcore to quit its loop, but does not close the race on lcore_states.
The backtrace shows the same.

(gdb) bt full
#0  rte_service_runner_func (arg=<optimized out>) at ../lib/librte_eal/common/rte_service.c:455
        service_mask = 1
        i = <optimized out>
        lcore = 1
        cs = 0x1003ea200
#1  0x00007ffff72030ef in eal_thread_loop (arg=<optimized out>) at ../lib/librte_eal/linux/eal/eal_thread.c:153
        fct_arg = <optimized out>
        c = 0 '\000'
        n = <optimized out>
        ret = <optimized out>
        lcore_id = <optimized out>
        thread_id = 140737203603200
        m2s = 14
        s2m = 22
        cpuset = "1", '\000' <repeats 175 times>, "\200\000\000\000\000\000\000\000\221\354e\360\377\177", '\000' <repeats 65 times>
        __func__ = "eal_thread_loop"
#2  0x00007ffff065ddd5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00007ffff038702d in clone () from /lib64/libc.so.6
No symbol table info available.

I added a rte_eal_mp_wait_lcore(), to ensure that each service lcore
_did_ quit its loop.

@@ -123,6 +123,7 @@ rte_service_finalize(void)
                return;

        rte_service_lcore_reset_all();
+       rte_eal_mp_wait_lcore();

        rte_free(rte_services);
        rte_free(lcore_states);

I can't reproduce with this.

--
David Marchand
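
For illustration only, the ordering that the extra wait enforces can be sketched outside DPDK with plain pthreads and C11 atomics. This is not the rte_service code: struct worker_state, worker_loop() and finalize_demo() are hypothetical names, and pthread_join() merely stands in for the role rte_eal_mp_wait_lcore() plays in the diff above. The point is that clearing the run flag only asks the worker to stop; the state it reads must not be freed until the worker has actually left its loop.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-lcore state; not the DPDK structure. */
struct worker_state {
	atomic_int run;           /* 1 while the worker should keep looping */
	atomic_ulong iterations;  /* touched on every pass through the loop */
};

/* Worker thread: keeps reading and writing *st until asked to stop. */
static void *worker_loop(void *arg)
{
	struct worker_state *st = arg;

	while (atomic_load(&st->run))
		atomic_fetch_add(&st->iterations, 1);
	return NULL;
}

/* Returns 0 on success, -1 on failure. */
int finalize_demo(void)
{
	struct worker_state *st = calloc(1, sizeof(*st));
	pthread_t tid;

	if (st == NULL)
		return -1;
	atomic_init(&st->run, 1);
	atomic_init(&st->iterations, 0);

	if (pthread_create(&tid, NULL, worker_loop, st) != 0) {
		free(st);
		return -1;
	}

	/* Step 1: ask the worker to quit. It may still be inside its loop. */
	atomic_store(&st->run, 0);

	/* Step 2: wait until it really has quit. Without this, the free()
	 * below could run while the worker still dereferences st. */
	if (pthread_join(tid, NULL) != 0)
		return -1; /* st is deliberately leaked: the worker may still use it */

	/* Step 3: only now is it safe to release the shared state. */
	free(st);
	return 0;
}
```

Calling finalize_demo() from a main() and building with -pthread should return 0. Dropping step 2 gives the same shape of use-after-free as in the backtrace, although it may take many runs before it shows up.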