On 2024-12-31 11:02, Piotr Krzewinski wrote:
Add option to register a callback running on service lcores
along regular services, which gets information about the service loop.
It enables doing maintenance work or power saving during periods when
all registered services are idling.

As an example application that is doing dequeue from multiple
event ports using single service lcore (e.g. using rte dispatcher library)
may want to wait for new events inside the maintenance callback when
there is no work available on ANY of the ports.
This is not possible using non zero dequeue timeout without increasing
latency of work that is scheduled to other event ports.


If the purpose of this mechanism is to allow user-defined power management, we should try to find a more specific name. In a UNIX kernel, this kind of thing happens in the "idle loop" (or "idle task"). The user would be responsible for implementing the "idle governor" (to use Linux terminology).

"idle hook", "idle callback", or "idle handler" maybe.

If the mechanism is supposed to be more generic, it could be just named "iteration hook/callback". In such a case, all result codes for all the services should be presented to the user (via the callback).

Including "iteration" in the name ("pass", or "round") would indicate when it's being called.

A bigger question than naming is that if the idle loop really belongs in the user app. Intuitively, it seems like this logic (user-defined or not) should be in the work scheduler (e.g., a DPDK eventdev). It may well be in a better position to make the decision on if/how long to put the lcore to sleep (especially if fed with app-specific latency configuration).

For an app using both eventdev+dispatcher lib and *other* non-trivial RTE services, the issue is really that the work scheduler (i.e., the event device) does not know about all work being performed.

That said, a solution to that larger issue likely involves some extensive rework of such an app, and potentially DPDK changes as well. The kind of callback suggested in this RFC may well serve as a stop gap solution which allows the implementation of some basic power management support.

In the light of we (or at least I) don't really know what we are doing here, maybe it's better to have this as a pure "iteration hook/callback", without any particular opionion on how it should be used.

Such a solution, with arrays of service call result codes and service ids, would come with a little bit more complexity/overhead.

Stephen and Jerin, your input would be greatly appreciated on this matter. Especially the "bigger picture" question.

Signed-off-by: Piotr Krzewinski <piotr.krzewin...@ericsson.com>
---
v2:
* Fixed unittest and doxygen issues
---
  app/test/test_service_cores.c | 54 +++++++++++++++++++++++++++++++++++
  lib/eal/common/rte_service.c  | 52 ++++++++++++++++++++++++++++-----
  lib/eal/include/rte_service.h | 34 ++++++++++++++++++++++
  lib/eal/version.map           |  4 +++
  4 files changed, 137 insertions(+), 7 deletions(-)


The existence of this new API should probably be touched upon in the user guide as well. And the release change log.

diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index 46ed9a1868..f4abbd2dfe 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -1011,6 +1011,59 @@ service_may_be_active(void)
        return unregister_all();
  }
+static uint64_t busy_loops;
+static uint64_t idle_loops;
+
+static void maint_callback(bool work_done)
+{
+       if (work_done)
+               busy_loops++;
+       else
+               idle_loops++;
+}
+
+/* test registering and unregistering of service maint callback*/
+static int
+service_maintenance(void)
+{
+       const uint32_t sid = 0;
+       busy_loops = 0;
+       idle_loops = 0;
+       /* expected failure cases */
+       TEST_ASSERT_EQUAL(-EINVAL, rte_service_may_be_active(10000),
+                       "Invalid service may be active check did not fail");
+
+       /* start the service */
+       TEST_ASSERT_EQUAL(0, rte_service_runstate_set(sid, 1),
+                       "Starting valid service failed");
+       TEST_ASSERT_EQUAL(0, rte_service_lcore_add(slcore_id),
+                       "Add service core failed when not in use before");
+       TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(sid, slcore_id, 1),
+                       "Enabling valid service on valid core failed");
+       TEST_ASSERT_EQUAL(0, 
rte_service_maint_callback_register(maint_callback, slcore_id),
+                       "Registering maintenance callback failed");
+       TEST_ASSERT_EQUAL(0, rte_service_lcore_start(slcore_id),
+                       "Service core start after add failed");
+
+       /* ensures core really is running the service function */
+       TEST_ASSERT_EQUAL(1, service_lcore_running_check(),
+                       "Service core expected to poll service but it didn't");
+
+       /* ensures service maintenance callback received information about work 
done */
+       TEST_ASSERT(busy_loops > 0, "No busy loops detected");
+       TEST_ASSERT(idle_loops > 0, "No idle loops detected");
+
+       /* stop the service, and wait for not-active with timeout */
+       TEST_ASSERT_EQUAL(0, rte_service_runstate_set(sid, 0),
+                       "Error: Service stop returned non-zero");
+       TEST_ASSERT_EQUAL(0, service_ensure_stopped_with_timeout(sid),
+                       "Error: Service not stopped after timeout period.");
+       TEST_ASSERT_EQUAL(0, rte_service_maint_callback_unregister(slcore_id),
+                       "Unregistering service maintenance callback failed");
+
+       return unregister_all();
+}
+
  /* check service may be active when service is running on a second lcore */
  static int
  service_active_two_cores(void)
@@ -1071,6 +1124,7 @@ static struct unit_test_suite service_tests  = {
                TEST_CASE_ST(dummy_register, NULL, service_mt_unsafe_poll),
                TEST_CASE_ST(dummy_register, NULL, service_mt_safe_poll),
                TEST_CASE_ST(dummy_register, NULL, service_may_be_active),
+               TEST_CASE_ST(dummy_register, NULL, service_maintenance),
                TEST_CASE_ST(dummy_register, NULL, service_active_two_cores),
                TEST_CASES_END() /**< NULL terminate unit test array */
        }
diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index dad3150df9..9c1fce8808 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -74,6 +74,7 @@ struct __rte_cache_aligned core_state {
        RTE_BITSET_DECLARE(service_active_on_lcore, RTE_SERVICE_NUM_MAX);
        RTE_ATOMIC(uint64_t) loops;
        RTE_ATOMIC(uint64_t) cycles;
+       rte_maint_func maint_callback;
        struct service_stats service_stats[RTE_SERVICE_NUM_MAX];
  };
@@ -317,6 +318,30 @@ rte_service_component_runstate_set(uint32_t id, uint32_t runstate)
        return 0;
  }
+int32_t
+rte_service_maint_callback_register(rte_maint_func callback, uint32_t lcore)
+{
+       struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states);
+       if (lcore >= RTE_MAX_LCORE || !cs->is_service_core
+                       || callback == NULL)
+               return -EINVAL;
+
+       cs->maint_callback = callback;
+       return 0;
+}
+
+int32_t
+rte_service_maint_callback_unregister(uint32_t lcore)
+{
+       struct core_state *cs = RTE_LCORE_VAR_LCORE(lcore, lcore_states);
+       if (lcore >= RTE_MAX_LCORE || !cs->is_service_core
+                       || !cs->maint_callback)
+               return -EINVAL;
+
+       cs->maint_callback = NULL;
+       return 0;
+}
+
  int32_t
  rte_service_runstate_set(uint32_t id, uint32_t runstate)
  {
@@ -378,16 +403,17 @@ service_counter_add(RTE_ATOMIC(uint64_t) *counter, 
uint64_t operand)
                                  rte_memory_order_relaxed);
  }
-static inline void
+static inline int32_t
  service_runner_do_callback(struct rte_service_spec_impl *s,
                           struct core_state *cs, uint32_t service_idx)
  {
        rte_eal_trace_service_run_begin(service_idx, rte_lcore_id());
        void *userdata = s->spec.callback_userdata;
+       int32_t rc;
if (service_stats_enabled(s)) {
                uint64_t start = rte_rdtsc();
-               int rc = s->spec.callback(userdata);
+               rc = s->spec.callback(userdata);
struct service_stats *service_stats =
                        &cs->service_stats[service_idx];
@@ -407,9 +433,10 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
                        service_counter_add(&service_stats->cycles, cycles);
                }
        } else {
-               s->spec.callback(userdata);
+               rc = s->spec.callback(userdata);
        }
        rte_eal_trace_service_run_end(service_idx, rte_lcore_id());
+       return rc;
  }
@@ -436,16 +463,17 @@ service_run(uint32_t i, struct core_state *cs, const uint64_t *mapped_services, rte_bitset_set(cs->service_active_on_lcore, i); + int32_t ret;
        if ((service_mt_safe(s) == 0) && (serialize_mt_unsafe == 1)) {
                if (!rte_spinlock_trylock(&s->execute_lock))
                        return -EBUSY;
- service_runner_do_callback(s, cs, i);
+               ret = service_runner_do_callback(s, cs, i);
                rte_spinlock_unlock(&s->execute_lock);
        } else
-               service_runner_do_callback(s, cs, i);
+               ret = service_runner_do_callback(s, cs, i);
- return 0;
+       return ret;
  }
int32_t
@@ -488,6 +516,10 @@ rte_service_run_iter_on_app_lcore(uint32_t id, uint32_t 
serialize_mt_unsafe)
rte_atomic_fetch_sub_explicit(&s->num_mapped_cores, 1, rte_memory_order_relaxed); + /* Ignore service specific error codes */
+       if ((ret != -EINVAL) && (ret != -EBUSY) && (ret != -ENOEXEC))
+               ret = 0;
+
        return ret;
  }
@@ -505,13 +537,19 @@ service_runner_func(void *arg)
         */
        while (rte_atomic_load_explicit(&cs->runstate, 
rte_memory_order_acquire) ==
                        RUNSTATE_RUNNING) {
+               bool work_done = false;
                ssize_t id;
RTE_BITSET_FOREACH_SET(id, cs->mapped_services, RTE_SERVICE_NUM_MAX) {
                        /* return value ignored as no change to code flow */
-                       service_run(id, cs, cs->mapped_services, 
service_get(id), 1);
+                       int32_t ret = service_run(id, cs, cs->mapped_services, 
service_get(id), 1);
+                       if ((ret != -EAGAIN) && (ret != -EINVAL) && (ret != 
-ENOEXEC))
+                               work_done = true;
                }
+ if (cs->maint_callback != NULL)
+                       cs->maint_callback(work_done);
+
                rte_atomic_store_explicit(&cs->loops, cs->loops + 1, 
rte_memory_order_relaxed);
        }
diff --git a/lib/eal/include/rte_service.h b/lib/eal/include/rte_service.h
index 1236fe2dc6..f8adff8dfe 100644
--- a/lib/eal/include/rte_service.h
+++ b/lib/eal/include/rte_service.h
@@ -336,6 +336,40 @@ int32_t rte_service_lcore_reset_all(void);
   */
  int32_t rte_service_set_stats_enable(uint32_t id, int32_t enable);
+/**
+ * Signature of callback function to run after all services are done
+ *
+ * If registered, service framework will execute the callback
+ * indicating if any of the services run on the core was performing work.
+ * This may be used e.g. for maintenance or power saving when no work done.
+ */
+typedef void (*rte_maint_func)(bool work_done);
+
+/**
+ * Register service maintenance callback.
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *

It should be made clear which thread (the service lcore's) runs this callback, and when (after each iteration).

It should be clear if multiple callbacks are allowed per lcore.

What happens if a callback is already registered?

+ * @param callback Function callback to register
+ * @param lcore Id of the service core.

It could be useful to have shorthand for "all current service cores". Either a separate function, or a special lcore id for the above function.

LCORE_ID_ANY could be used, but would make it look like you registered the hook on any *one* service lcore, which wouldn't be the case.

Maybe not worth the trouble.

+ * @retval 0 Successfully registered the callback.
+ *         -EINVAL Attempted to register an invalid callback or the

What is an "invalid callback"? NULL?

+ *          lcore is not registered as service lcore
+ */
+__rte_experimental
+int32_t rte_service_maint_callback_register(rte_maint_func callback, uint32_t 
lcore);
+
+/**
+ * Unregister service maintenance callback.
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param lcore Id of the service core.
+ * @retval 0 Successfully unregistered the callback.
+ *         -EINVAL Attempted to unregister a callback from a core which
+ *        is not a service lcore or for which no maint callback was registered
+ */
+__rte_experimental
+int32_t rte_service_maint_callback_unregister(uint32_t lcore);
+
  /**
   * Retrieve the list of currently enabled service cores.
   *
diff --git a/lib/eal/version.map b/lib/eal/version.map
index a20c713eb1..e5355e956e 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -398,6 +398,10 @@ EXPERIMENTAL {
        # added in 24.11
        rte_bitset_to_str;
        rte_lcore_var_alloc;
+
+       # added in 25.03
+       rte_service_maint_callback_register;
+       rte_service_maint_callback_unregister;
  };
INTERNAL {

Reply via email to