Re: [RESEND v7 0/3] add telemetry cmds for ring
Hi, Thomas, Kindly ping for review. Thanks, Jie Hai On 2023/11/9 18:20, Jie Hai wrote: This patch set supports telemetry cmd to list rings and dump information of a ring by its name. v1->v2: 1. Add space after "switch". 2. Fix wrong strlen parameter. v2->v3: 1. Remove prefix "rte_" for static function. 2. Add Acked-by Konstantin Ananyev for PATCH 1. 3. Introduce functions to return strings instead copy strings. 4. Check pointer to memzone of ring. 5. Remove redundant variable. 6. Hold lock when access ring data. v3->v4: 1. Update changelog according to reviews of Honnappa Nagarahalli. 2. Add Reviewed-by Honnappa Nagarahalli. 3. Correct grammar in help information. 4. Correct spell warning on "te" reported by checkpatch.pl. 5. Use ring_walk() to query ring info instead of rte_ring_lookup(). 6. Fix that type definition the flag field of rte_ring does not match the usage. 7. Use rte_tel_data_add_dict_uint_hex instead of rte_tel_data_add_dict_u64 for mask and flags. v4->v5: 1. Add Acked-by Konstantin Ananyev and Chengwen Feng. 2. Add ABI change explanation for commit message of patch 1/3. v5->v6: 1. Add Acked-by Morten Br?rup. 2. Fix incorrect reference of commit. v6->v7: 1. Remove prod/consumer head/tail info. Jie Hai (3): ring: fix unmatched type definition and usage ring: add telemetry cmd to list rings ring: add telemetry cmd for ring info lib/ring/meson.build | 1 + lib/ring/rte_ring.c | 135 +++ lib/ring/rte_ring_core.h | 2 +- 3 files changed, 137 insertions(+), 1 deletion(-)
Potential RTE bitset RFC
Hi. The new timer RFC ("htimer") I submitted last year also included a new bitset API. https://patchwork.dpdk.org/project/dpdk/patch/20230315170342.214127-2-mattias.ronnb...@ericsson.com/ My experience is that multi-word bitsets are often useful. Examples from DPDK are rte_service.c and DSW its "service ports" bitset (both have 64 as a hard upper limit). Small, but multi-word, bitsets are not particularly hard to open-code, but then you end up with a lot of duplication. I wanted to ask if there is an interest in seeing a bitset API (as per my patchset) in DPDK. Upstreaming htimer, including having it replace today's rte_timer is more work than I can commit to, so I think you won't get RTE bitset that way any time soon. Regards, Mattias
Re: rte_atomic_*_explicit
On 2024-01-26 11:52, Morten Brørup wrote: From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] Sent: Friday, 26 January 2024 09.07 On 2024-01-25 23:10, Morten Brørup wrote: From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] Sent: Thursday, 25 January 2024 19.54 Why do rte_stdatomic.h functions have the suffix "_explicit"? Especially since there aren't any wrappers for the implicit variants. More to type, more to read. They have the "_explicit" suffix to make their names similar to those in stdatomic.h. OK, so to avoid a situation where someone accidentally misinterpret rte_atomic_fetch_add(&counter, 1, rte_memory_order_relaxed); as what, exactly? If you have a wrapper, why not take the opportunity and reap some benefits from this and fix/extend the standard API, making it a better fit for your application. Like removing the redundant "_explicit", "fetch"-less add/sub, maybe and a function for single-writer atomic add (non-atomic read + non-atomic add + atomic store), etc. Prohibiting implicit operations was done already, so it's already now not a straight-off copy of the standard API. You might consider their existence somewhat temporary until C11 stdatomics can be fully phased in, so there's another argument for similar names. (This probably does not happen as long as compilers generate slower code for C11 stdatomics than with their atomic built- ins.) To me, it seems reasonable a wrapper API should stay as long as it provide a benefit over the standard API. One could imagine that being a long time. I imagine some DPDK developers being tired of migrating from one atomics API to another. -> GCC built-ins (-> attempted C11 stdatomics?) -> . Now you are adding a future "-> CXY atomics" move as well, and with that it also seems natural to add a change back to a wrapper or complementary API, when CXY didn't turned out good enough for some particular platform, or when some non- complaint compiler comes along. Yes, more migrations seem to be on the roadmap. We can take the opportunity to change direction now, and decide to keep the API long term. Then it would need more documentation (basically copying function descriptions from ), and the functions could have the "_explicit" suffix removed (macros with the suffix could be added for backwards compatibility), and more functions - like the ones you suggested above - could be added. What do people think? 1. Keep considering a temporary wrapper for until compilers reach some undefined level of maturity, or 2. Consider stable, clean it up (remove "_explicit" suffix), add documentation to the macros, and extend it. Living in a DPDK-only environment, I would prefer option 2; but if mixing DPDK code with non-DPDK code (that uses ) it might be weird. I think DPDK should have its own atomics API, with C11-style naming and semantics, where there is an overlap, and keep it until the heat death of the universe. Is there an example of a large-scale project of the OS kernel or network stack kind that *doesn't* have its own atomics API (which wraps/extends C11 atomics, and/or uses compiler built-ins or in-line assembler)? fd.io VPP, OpenDataPlane (ODP) and the Linux kernel all have custom APIs. I suggested fixing the original API, or at least have a wrapper API, already at the point DPDK moved to direct GCC built-in calls. Then we wouldn't have had this atomics API ping-pong. The decision back then might have been too hasty, and based on incomplete assumptions. Please shout louder next time you think a mistake is in the making. When was this API introduced? Shouldn't it say "experimental" somewhere? They were introduced as part of the migration to C11. I suppose they were not marked experimental because they replaced something we didn't want anymore (the compiler built-ins for atomics, e.g. __atomic_load_n()). I don't recall if we discussed experimental marking or not. In hindsight, they should probably have been marked "experimental".
Re: [PATCH] RFC: use C11 alignas instead of GCC attribute aligned
On 2024-01-26 11:18, Morten Brørup wrote: From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] Sent: Friday, 26 January 2024 11.05 On 2024-01-25 23:53, Morten Brørup wrote: From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com] Sent: Thursday, 25 January 2024 19.37 ping. Please review this thread if you have time, the main point of discussion I would like to receive consensus on the following questions. 1. Should we continue to expand common alignments behind an __rte_macro i.e. what do we prefer to appear in code alignas(RTE_CACHE_LINE_MIN_SIZE) -- or -- __rte_cache_aligned One of the benefits of dropping the macro is it provides a clear visual indicator that it is not placed in the same location or get applied to types as is done with __attribute__((__aligned__(n))). We don't want our own proprietary variant of something that already exists in the C standard. Now that we have moved to C11, the __rte alignment macros should be considered obsolete. Making so something cache-line aligned is not in C11. We are talking about the __rte_aligned() macro, not the cache alignment macro. OK, in that case, what is the relevance of question 1 above? __rte_cache_aligned is shorter, provides a tiny bit of abstraction, and is already an established DPDK standard. So just keep the macro. If it would change, I would argue for it to be changed to rte_cache_aligned (i.e., just moving it out of __ namespace, and maybe making it all-uppercase). Non-trivial C programs wrap things all the time, standard or not. It's not something to be overly concerned about, imo. Using the cache alignment macro was obviously a bad example for discussing the __rte_aligned() macro. FYI, Tyler later agreed to introducing the RTE_CACHE_ALIGNAS you had proposed in an earlier correspondence. Note: I don't mind convenience macros for common use cases, so we could also introduce the macro suggested by Mattias [1]: #define RTE_CACHE_ALIGNAS alignas(RTE_CACHE_LINE_SIZE) [1]: https://inbox.dpdk.org/dev/dc3f3131-38e6-4219-861e- b31ec10c0...@lysator.liu.se/ 2. where should we place alignas(n) or __rte_macro (if we use a macro) Should it be on the same line as the variable or field or on the preceeding line? /* same line example struct */ struct T { /* alignas(64) applies to field0 *not* struct T type declaration */ alignas(64) void *field0; void *field1; ... other fields ... alignas(64) uint64_t field5; uint32_t field6; ... more fields ... }; /* same line example array */ alignas(64) static const uint32_t array[4] = { ... }; -- or -- /* preceeding line example struct */ struct T { /* alignas(64) applies to field0 *not* struct T type declaration */ alignas(64) void *field0; void *field1; ... other fields ... alignas(64) uint64_t field5; uint32_t field6; ... more fields ... }; /* preceeding line example array */ alignas(64) static const uint32_t array[4] = { ... }; Searching the net for what other projects do, I came across this required placement [2]: uint64_t alignas(64) field5; [2]: https://lore.kernel.org/buildroot/2023073851.6faa3391@windsurf/T/ So let's follow the standard's intention and put them on the same line. On an case-by-case basis, we can wrap lines if it improves readability, like we do with function headers that have a lot of attributes. I'll submit patches for lib/* once the discussion is concluded. thanks folks
Re: [RFC] service: extend service function call statistics
On 2024-01-26 11:07, Morten Brørup wrote: From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] Sent: Friday, 26 January 2024 09.28 On 2024-01-26 00:19, Morten Brørup wrote: From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com] Sent: Thursday, 25 January 2024 20.15 Add two new per-service counters. RTE_SERVICE_ATTR_IDLE_CALL_COUNT tracks the number of service function invocations where no work was performed. RTE_SERVICE_ATTR_ERROR_CALL_COUNT tracks the number invocations resulting in an error. The semantics of RTE_SERVICE_ATTR_CALL_COUNT remains the same (i.e., counting all invocations, regardless of return value). The new statistics may be useful for both debugging and profiling (e.g., calculate the average per-call processing latency for non- idle service calls). Service core tests are extended to cover the new counters, and coverage for RTE_SERVICE_ATTR_CALL_COUNT is improved. OK to all of the above. Good stuff. The documentation for the CYCLES attributes are updated to reflect their actual semantics. If this is intended behavior, then updating the documentation seems appropriate - I would even go so far as considering it a bug fix. However, quite a few cycles may be consumed by a service before it can conclude that it had no work to do. Shouldn't that be considered time spent by the service? I.e. should the code be fixed instead of the documentation? Generally, polling for new work in the service is cheap, in my experience. But there's nothing in principle that prevents the situation your describe from occurring. You could add an "IDLE_CYCLES" counter in case that would ever be a real-world problem. That wouldn't be a fix, but rather just returning to the old, subtly broken, (pre-22.11?) semantics. Have a look at 809bd24 to see the rationale for the change. There's an example in 4689c57. The cause of this ambiguity is due to the fact that the platform/OS (i.e., DPDK) doesn't know which service to "wake up" (which in turn is the result of the platform not being able to tracking input sources, like NIC RX queues or timer wheels) and thus must ask every service to check if it has something to do. OK. Makes good sense. So definitely fix the documentation, not the code. :-) Alternatively, keep the behavior (for backwards compatibility) and fix the documentation, as this patch does, and add an IDLE_CYCLES counter for time spent in idle calls. PS: We're not using DPDK service cores in our applications, so I'm not familiar with the details. We are using something somewhat similar (but homegrown), also for profiling and power management purposes, and my feedback is based on my experience with our own variant of service cores. When are you making the switch to service cores? :) Our own "service cores" implementation has some slightly different properties, which we are still experimenting with. E.g. in addition to the special return value "idle (no work to do)", we also have a special return value for "incomplete (more work urgently pending)" when a service processed a full burst and still has more work pending its input queue. We are also considering returning a value to inform what time it needs to be called again. This concept is only an idea, and we haven't started experimenting with it yet. From a high level perspective, the service cores library is much like an operating system's CPU scheduler, although based on voluntary time sharing. Many algorithms and many parameters can be considered. It can also tie into power management and prioritization of different tasks. Service cores in their current form is more like a primitive kernel bottom half implementation. The primary work scheduler in DPDK is Eventdev, but even with Eventdev you want some way to fit non-event kind of processing as well, and here services cores serves a role. rte_service.c is a good place for power management. I agree some means for the services to convey what latency (e.g., sleep time) is acceptable is needed. Returning a time per service function invocation would be a flexible way to do so. Could also just be a per-lcore, or global configuration, set by the application. Either way: Acked-by: Morten Brørup Thanks for the review.
Re: rte_atomic_*_explicit
On 2024-01-26 22:35, Tyler Retzlaff wrote: On Fri, Jan 26, 2024 at 11:52:11AM +0100, Morten Brørup wrote: From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] Sent: Friday, 26 January 2024 09.07 On 2024-01-25 23:10, Morten Brørup wrote: From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] Sent: Thursday, 25 January 2024 19.54 Why do rte_stdatomic.h functions have the suffix "_explicit"? Especially since there aren't any wrappers for the implicit variants. More to type, more to read. They have the "_explicit" suffix to make their names similar to those in stdatomic.h. OK, so to avoid a situation where someone accidentally misinterpret rte_atomic_fetch_add(&counter, 1, rte_memory_order_relaxed); as what, exactly? If you have a wrapper, why not take the opportunity and reap some benefits from this and fix/extend the standard API, making it a better fit for your application. Like removing the redundant "_explicit", "fetch"-less add/sub, maybe and a function for single-writer atomic add (non-atomic read + non-atomic add + atomic store), etc. Prohibiting implicit operations was done already, so it's already now not a straight-off copy of the standard API. You might consider their existence somewhat temporary until C11 stdatomics can be fully phased in, so there's another argument for similar names. (This probably does not happen as long as compilers generate slower code for C11 stdatomics than with their atomic built- ins.) To me, it seems reasonable a wrapper API should stay as long as it provide a benefit over the standard API. One could imagine that being a long time. I imagine some DPDK developers being tired of migrating from one atomics API to another. -> GCC built-ins (-> attempted C11 stdatomics?) -> . Now you are adding a future "-> CXY atomics" move as well, and with that it also seems natural to add a change back to a wrapper or complementary API, when CXY didn't turned out good enough for some particular platform, or when some non- complaint compiler comes along. Yes, more migrations seem to be on the roadmap. We can take the opportunity to change direction now, and decide to keep the API long term. Then it would need more documentation (basically copying function descriptions from ), and the functions could have the "_explicit" suffix removed (macros with the suffix could be added for backwards compatibility), and more functions - like the ones you suggested above - could be added. What do people think? 1. Keep considering a temporary wrapper for until compilers reach some undefined level of maturity, or 2. Consider stable, clean it up (remove "_explicit" suffix), add documentation to the macros, and extend it. Living in a DPDK-only environment, I would prefer option 2; but if mixing DPDK code with non-DPDK code (that uses ) it might be weird. rte_stdatomic.h should be considered temporary, but how long temporary is depends on when we can deprecate support for distributions and the older toolchains they are tied to. the macros were introduced to allow a path to gradually moving to standard c11 atomics. gcc versions available on distributions we promise support for is currently the biggest barrier to direct adoption. * older versions of gcc generate sub-optimal code when using c11 atomic generics in some instances. * gcc c++23 support is incomplete, in particular at the time we evaluated using the c11 atomics directly but gcc c++23 held us back because the stdatomic.h it shipped didn't interoperate. So you expect at a particular point in time, both the following will be true: * All currently (at that point) supported and future compilers generate correct and efficient code for all currently and future ISA/CPUs. * There's nothing add, subtract or clean up in the standard API to improve it for DPDK-internal and DPDK application code use. Seeing things like rte_atomic_fetch_add_explicit(&s->num_mapped_cores, 1, rte_memory_order_relaxed); doesn't exactly make you long for "raw" C11 atomics. It's not the like "rte_" being deleted improve much on that mess. Compare it with the kernel's: atomic_inc(&s->num_mapped_cores); What could in DPDK be: rte_atomic_inc(&s->num_mapped_cores); or, at a bare minimum rte_atomic_inc(&->num_mapped_cores, rte_memory_order_relaxed); * the macros allow developers to easily evaluate/compare with and without standard C atomics with a single build-time option -Denable_stdatomic=true * eventually we will want to move directly to the names from the standard, arguably just search and replace of rte_atomic with atomic is the most mechanically trivial - hence there is some value in keeping _explicit suffix. is a public API, and thus will be picked up by applications wanting to leverage DPDK for such services. Search-and-replace will only affect DPDK itself. I suggested fixing the original API, or at least have a wrapper API, al