Assume thread T2 is a service lcore in the middle of executing a
service function, and that thread T1 concurrently calls
rte_service_lcore_stop(), which sets the "service_active_on_lcore"
state to false.  If T1 then calls rte_service_may_be_active(), it can
return zero even though T2 is still running the service function.  If
T1 proceeds to free data that T2 is still using, a crash can ensue.
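
For illustration, a minimal sketch of the racy pattern from thread
T1's perspective ("sid", "slcore" and "data" are hypothetical names,
not part of this patch; return values are unchecked for brevity):

  #include <stdlib.h>
  #include <rte_service.h>

  static void
  stop_and_free(uint32_t sid, uint32_t slcore, void *data)
  {
          rte_service_lcore_stop(slcore); /* cleared the state early */

          /* can return 0 while T2 is still inside the service... */
          if (rte_service_may_be_active(sid) == 0)
                  free(data); /* ...so T2 may use freed memory: crash */
  }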

Move the logic that clears the "service_active_on_lcore" state from
rte_service_lcore_stop() to service_runner_func(), so that the state
is cleared only after the service loop has exited.  This ensures that
we:
- don't let the "service_active_on_lcore" state linger as 1
- don't clear the state early (see the sketch below)
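
With the state now cleared in service_runner_func() only after the
service loop exits, an application can poll for quiescence before
freeing shared data; a minimal sketch, using the same hypothetical
names as above:

  #include <stdlib.h>
  #include <rte_pause.h>
  #include <rte_service.h>

  static void
  stop_and_free_safe(uint32_t sid, uint32_t slcore, void *data)
  {
          rte_service_lcore_stop(slcore); /* error handling elided */

          /* 1 means the service may still be executing on an lcore */
          while (rte_service_may_be_active(sid) == 1)
                  rte_pause();

          free(data); /* the service lcore is done with "data" now */
  }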

Fixes: 6550113be62d ("service: fix lingering active status")
Cc: sta...@dpdk.org

Signed-off-by: Erik Gabriel Carrillo <erik.g.carri...@intel.com>
---
 lib/eal/common/rte_service.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 81c9514149..bcc2e19077 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -479,6 +479,7 @@ static int32_t
 service_runner_func(void *arg)
 {
        RTE_SET_USED(arg);
+       uint8_t i;
        const int lcore = rte_lcore_id();
        struct core_state *cs = &lcore_states[lcore];
 
@@ -494,7 +495,6 @@ service_runner_func(void *arg)
                const uint64_t service_mask = cs->service_mask;
                uint8_t start_id;
                uint8_t end_id;
-               uint8_t i;
 
                if (service_mask == 0)
                        continue;
@@ -510,6 +510,12 @@ service_runner_func(void *arg)
                __atomic_store_n(&cs->loops, cs->loops + 1, __ATOMIC_RELAXED);
        }
 
+       /* Switch off this core for all services, to ensure that future
+        * calls to may_be_active() know this core is switched off.
+        */
+       for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
+               cs->service_active_on_lcore[i] = 0;
+
        /* Use SEQ CST memory ordering to avoid any re-ordering around
         * this store, ensuring that once this store is visible, the service
         * lcore thread really is done in service cores code.
@@ -806,11 +812,6 @@ rte_service_lcore_stop(uint32_t lcore)
                        __atomic_load_n(&rte_services[i].num_mapped_cores,
                                __ATOMIC_RELAXED));
 
-               /* Switch off this core for all services, to ensure that future
-                * calls to may_be_active() know this core is switched off.
-                */
-               cs->service_active_on_lcore[i] = 0;
-
                /* if the core is mapped, and the service is running, and this
                 * is the only core that is mapped, the service would cease to
                 * run if this core stopped, so fail instead.
-- 
2.23.0
