> > > The concept of ‘ring with stages’ is similar to DPDK OPDL eventdev PMD [1],
> > but the internals are different.
> > In particular, SORING maintains internal array of 'states' for each element
> > in the ring that is shared by all threads/processes that access the ring.
> > That allows 'release' to avoid excessive waits on the tail value and helps
> > to improve performance and scalability.
> > In terms of performance, with our measurements rte_soring and
> > conventional rte_ring provide nearly identical numbers.
> > As an example, on our SUT: Intel ICX CPU @ 2.00GHz,
> > l3fwd (--lookup=acl) in pipeline mode [2] both
> > rte_ring and rte_soring reach ~20Mpps for single I/O lcore and same
> > number of worker lcores.
> >
> > [1]
> > https://www.dpdk.org/wp-content/uploads/sites/35/2018/06/DPDK-China2017-Ma-OPDL.pdf
> > [2]
> > https://patchwork.dpdk.org/project/dpdk/patch/20240906131348.804-7-konstantin.v.anan...@yandex.ru/
>
> One future suggestion. What about having an example (l3fwd-soring?) so
> that performance can be compared.
>

At an early stage (RFC) I submitted a patch that allows l3fwd (ACL case)
to work in a sort of pipeline mode:
https://patchwork.dpdk.org/project/dpdk/patch/20240906131348.804-7-konstantin.v.anan...@yandex.ru/

So the user can run it in one of these modes:
run-to-completion/eventdev/rte_ring/rte_soring
and measure the performance differences.
If there is interest from the community, then yes, we can try to make it
a proper patch series for future releases.
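
For anyone following the thread who has not looked at the internals, here is
a rough toy model of the per-element 'states' idea from the cover letter
quoted above. It is only a sketch under loud assumptions: it is
single-threaded, uses no atomics, and every name in it (toy_stage,
toy_acquire, toy_release, toy_finalize, toy_demo) is hypothetical. It is not
the rte_soring code or API. It only shows why a release can complete without
waiting for the tail to reach its own batch.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* toy capacity, must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

/* Hypothetical one-stage model; not the real rte_soring layout. */
struct toy_stage {
    uint32_t head;                   /* next position handed out by acquire */
    uint32_t tail;                   /* everything before it is finished, in order */
    uint32_t done[RING_SIZE];        /* done[i] = n: batch of n starting at i is released */
};

/* Hand out n elements; *token is the start position of the batch. */
static int toy_acquire(struct toy_stage *s, uint32_t n, uint32_t *token)
{
    if (n == 0 || s->head - s->tail + n > RING_SIZE)
        return -1;                   /* nothing requested, or no room in the toy ring */
    *token = s->head;
    s->head += n;
    return 0;
}

/* Record completion in the state array. Note: no wait for tail == token. */
static void toy_release(struct toy_stage *s, uint32_t token, uint32_t n)
{
    s->done[token & RING_MASK] = n;
}

/* Advance tail over contiguous finished batches; returns the new tail. */
static uint32_t toy_finalize(struct toy_stage *s)
{
    uint32_t n;

    while (s->tail != s->head &&
           (n = s->done[s->tail & RING_MASK]) != 0) {
        s->done[s->tail & RING_MASK] = 0;   /* slot can be reused */
        s->tail += n;
    }
    return s->tail;
}

/* Two batches released out of order; returns the final tail. */
static uint32_t toy_demo(void)
{
    struct toy_stage s = {0};
    uint32_t a, b;

    assert(toy_acquire(&s, 2, &a) == 0);    /* batch A: positions 0..1 */
    assert(toy_acquire(&s, 3, &b) == 0);    /* batch B: positions 2..4 */

    toy_release(&s, b, 3);                  /* B finishes first, returns at once */
    assert(toy_finalize(&s) == 0);          /* tail is held back by unfinished A */

    toy_release(&s, a, 2);                  /* A finishes */
    return toy_finalize(&s);                /* tail moves past both A and B */
}
```

With a plain tail-only scheme, the thread releasing batch B would have to
wait until batch A was released before it could move the tail. Here it just
marks its slot and returns, and the tail is caught up later by whoever calls
the finalize step.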