> From: Robin Jarry [mailto:rja...@redhat.com] > Sent: Monday, 28 November 2022 10.00 > > Allow applications to register a callback that will be invoked in > rte_lcore_dump() and when requesting lcore info in the telemetry API. > > The callback is expected to return the number of CPU cycles that have > passed since application start and the number of these cycles that were > spent doing busy work. > > Cc: Bruce Richardson <bruce.richard...@intel.com> > Cc: Jerin Jacob <jer...@marvell.com> > Cc: Kevin Laatz <kevin.la...@intel.com> > Cc: Konstantin Ananyev <konstantin.v.anan...@yandex.ru> > Cc: Mattias Rönnblom <mattias.ronnb...@ericsson.com> > Cc: Morten Brørup <m...@smartsharesystems.com> > Signed-off-by: Robin Jarry <rja...@redhat.com> > --- > v1 -> v2: > > Changed the approach based on Morten's review: the callback is now > expected to report the total number of cycles since application start > and the amount of these cycles that were spent doing busy work. This > will give more flexibility in external monitoring tools to decide the > sample period to compute busyness ratio. > > lib/eal/common/eal_common_lcore.c | 31 ++++++++++++++++++++++++++++--- > lib/eal/include/rte_lcore.h | 29 +++++++++++++++++++++++++++++ > lib/eal/version.map | 1 + > 3 files changed, 58 insertions(+), 3 deletions(-) > > diff --git a/lib/eal/common/eal_common_lcore.c > b/lib/eal/common/eal_common_lcore.c > index 8a6c12550238..51f53fc93ece 100644 > --- a/lib/eal/common/eal_common_lcore.c > +++ b/lib/eal/common/eal_common_lcore.c > @@ -2,6 +2,7 @@ > * Copyright(c) 2010-2014 Intel Corporation > */ > > +#include <inttypes.h> > #include <stdlib.h> > #include <string.h> > > @@ -420,11 +421,20 @@ rte_lcore_iterate(rte_lcore_iterate_cb cb, void > *arg) > return ret; > } > > +static rte_lcore_usage_cb lcore_usage_cb; > + > +void > +rte_lcore_register_usage_cb(rte_lcore_usage_cb cb) > +{ > + lcore_usage_cb = cb; > +} > + > static int > lcore_dump_cb(unsigned int lcore_id, void *arg) > { > struct rte_config *cfg = rte_eal_get_configuration(); > - char cpuset[RTE_CPU_AFFINITY_STR_LEN]; > + char cpuset[RTE_CPU_AFFINITY_STR_LEN], usage_str[256]; > + uint64_t busy_cycles, total_cycles; > const char *role; > FILE *f = arg; > int ret; > @@ -444,11 +454,19 @@ lcore_dump_cb(unsigned int lcore_id, void *arg) > break; > } > > + busy_cycles = 0; > + total_cycles = 0; > + usage_str[0] = '\0'; > + if (lcore_usage_cb && lcore_usage_cb(lcore_id, &busy_cycles, > &total_cycles) == 0) {
The DPDK coding convention is to explicitly compare to NULL, i.e.: if (lcore_usage_cb != NULL && lcore_usage_cb(... > + snprintf(usage_str, sizeof(usage_str), ", busy cycles > %"PRIu64"/%"PRIu64, > + busy_cycles, total_cycles); Consider adding the percentage here, for easy human consumption: ", busy cycles %"PRIu64"/%"PRIu64" (%.02f%%)", busy_cycles, total_cycles, busy_cycles ? (float)busy_cycles / (float)total_cycles * (float)100); On the other hand, it is the average over the total uptime, so the percentage might only be useful for very few cases. > + } > ret = eal_thread_dump_affinity(&lcore_config[lcore_id].cpuset, > cpuset, > sizeof(cpuset)); > - fprintf(f, "lcore %u, socket %u, role %s, cpuset %s%s\n", > lcore_id, > + fprintf(f, "lcore %u, socket %u, role %s, cpuset %s%s%s\n", > lcore_id, > rte_lcore_to_socket_id(lcore_id), role, cpuset, > - ret == 0 ? "" : "..."); > + ret == 0 ? "" : "...", usage_str); > + > return 0; > } > > @@ -486,6 +504,7 @@ lcore_telemetry_info_cb(unsigned int lcore_id, void > *arg) > { > struct lcore_telemetry_info *info = arg; > struct rte_config *cfg = rte_eal_get_configuration(); > + uint64_t busy_cycles, total_cycles; > struct rte_tel_data *cpuset; > const char *role; > unsigned int cpu; > @@ -519,6 +538,12 @@ lcore_telemetry_info_cb(unsigned int lcore_id, > void *arg) > if (CPU_ISSET(cpu, &lcore_config[lcore_id].cpuset)) > rte_tel_data_add_array_int(cpuset, cpu); > rte_tel_data_add_dict_container(info->d, "cpuset", cpuset, 0); > + busy_cycles = 0; > + total_cycles = 0; > + if (lcore_usage_cb && lcore_usage_cb(lcore_id, &busy_cycles, > &total_cycles) == 0) { Same comment about coding convention: if (lcore_usage_cb != NULL && lcore_usage_cb(... > + rte_tel_data_add_dict_u64(info->d, "busy_cycles", > busy_cycles); > + rte_tel_data_add_dict_u64(info->d, "total_cycles", > total_cycles); > + } > > return 0; > } > diff --git a/lib/eal/include/rte_lcore.h b/lib/eal/include/rte_lcore.h > index 6938c3fd7b81..dc352297bcbc 100644 > --- a/lib/eal/include/rte_lcore.h > +++ b/lib/eal/include/rte_lcore.h > @@ -328,6 +328,35 @@ typedef int (*rte_lcore_iterate_cb)(unsigned int > lcore_id, void *arg); > int > rte_lcore_iterate(rte_lcore_iterate_cb cb, void *arg); > > +/** > + * Callback to allow applications to report CPU usage. > + * > + * @param [in] lcore_id > + * The lcore to consider. > + * @param [out] busy > + * The number of busy CPU cycles since the application start. > + * @param [out] total > + * The total number of CPU cycles since the application start. > + * @return > + * - 0 if both busy and total were set correctly. > + * - a negative value if the information is not available or if any > error occurred. > + */ > +typedef int (*rte_lcore_usage_cb)( > + unsigned int lcore_id, uint64_t *busy_cycles, uint64_t > *total_cycles); > + > +/** > + * Register a callback from an application to be called in > rte_lcore_dump() > + * and the /eal/lcore/info telemetry endpoint handler. > + * > + * Applications are expected to report the amount of busy and total > CPU cycles > + * since their startup. > + * > + * @param cb > + * The callback function. > + */ > +__rte_experimental > +void rte_lcore_register_usage_cb(rte_lcore_usage_cb cb); > + > /** > * List all lcores. > * > diff --git a/lib/eal/version.map b/lib/eal/version.map > index 7ad12a7dc985..30fd216a12ea 100644 > --- a/lib/eal/version.map > +++ b/lib/eal/version.map > @@ -440,6 +440,7 @@ EXPERIMENTAL { > rte_thread_detach; > rte_thread_equal; > rte_thread_join; > + rte_lcore_register_usage_cb; > }; > > INTERNAL { > -- > 2.38.1 > Looks good to me. And we could probably discuss naming forever... "Usage" and "utilization" are synonyms, but usage is shorter, so let's stick with that. Acked-by: Morten Brørup <m...@smartsharesystems.com>