> On Mon, Apr 20, 2020 at 2:12 PM Konstantin Ananyev
> <konstantin.anan...@intel.com> wrote:
> >
> > Changed the rte_ring chapter in programmer's guide to reflect
> > the addition of new sync modes and peek style API.
>
> I'd like to split this as follows, see below.
> I have a couple of typos too.
>
> If you are fine with it, I'll proceed and squash when merging.

Yes, I am.
Thanks
Konstantin

> >
> > Signed-off-by: Konstantin Ananyev <konstantin.anan...@intel.com>
> > ---
> >  doc/guides/prog_guide/ring_lib.rst | 95 ++++++++++++++++++++++++++++++
> >  1 file changed, 95 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
> > index 8cb2b2dd4..668e67ecb 100644
> > --- a/doc/guides/prog_guide/ring_lib.rst
> > +++ b/doc/guides/prog_guide/ring_lib.rst
> > @@ -349,6 +349,101 @@ even if only the first term of subtraction has overflowed:
> >     uint32_t entries = (prod_tail - cons_head);
> >     uint32_t free_entries = (mask + cons_tail -prod_head);
> >
>
> From here, this first part would go to patch2 "ring: prepare ring to
> allow new sync schemes".
>
> > +Producer/consumer synchronization modes
> > +---------------------------------------
> > +
> > +rte_ring supports different synchronization modes for porducer and consumers.
>
> producers*
>
> > +These modes can be specified at ring creation/init time via ``flags`` parameter.
> > +That should help user to configure ring in way most suitable for his
>
> double space to remove.
> users?
>
> > +specific usage scenarios.
> > +Currently supported modes:
> > +
> > +MP/MC (default one)
> > +~~~~~~~~~~~~~~~~~~~
> > +
> > +Multi-producer (/multi-consumer) mode. This is a default enqueue (/dequeue)
> > +mode for the ring. In this mode multiple threads can enqueue (/dequeue)
> > +objects to (/from) the ring. For 'classic' DPDK deployments (with one thread
> > +per core) this is usually most suitable and fastest synchronization mode.
>
> the most*
>
> > +As a well known limitaion - it can perform quite pure on some overcommitted
>
> limitation*
>
> > +scenarios.
> > +
> > +SP/SC
> > +~~~~~
> > +Single-producer (/single-consumer) mode. In this mode only one thread at a time
> > +is allowed to enqueue (/dequeue) objects to (/from) the ring.
>
> End of first part.
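
As a side note on the SP/SC mode covered in the first part: a minimal
standalone sketch of why a single producer and a single consumer need no
CAS at all, and how the unsigned index arithmetic from the hunk's context
lines keeps working after the 32-bit indexes wrap. Everything named toy_*
or TOY_* below is hypothetical and made up for illustration; this is not
the rte_ring implementation, and unlike rte_ring the toy ring uses all of
its slots.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u                    /* assumption: power of two */
#define TOY_RING_MASK (TOY_RING_SIZE - 1u)

static_assert((TOY_RING_SIZE & TOY_RING_MASK) == 0,
              "TOY_RING_SIZE must be a power of two");

/* Hypothetical toy ring, not struct rte_ring. */
struct toy_spsc_ring {
    _Atomic uint32_t prod_tail;   /* written only by the single producer */
    _Atomic uint32_t cons_tail;   /* written only by the single consumer */
    void *slots[TOY_RING_SIZE];
};

/* Producer side: plain load/store, no compare-and-swap needed. */
static bool toy_spsc_enqueue(struct toy_spsc_ring *r, void *obj)
{
    uint32_t prod = atomic_load_explicit(&r->prod_tail, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&r->cons_tail, memory_order_acquire);

    /* Unsigned subtraction is modulo 2^32, so it survives index wrap. */
    if (prod - cons == TOY_RING_SIZE)
        return false;                       /* ring is full */

    r->slots[prod & TOY_RING_MASK] = obj;
    /* Publish the slot contents before the new tail becomes visible. */
    atomic_store_explicit(&r->prod_tail, prod + 1, memory_order_release);
    return true;
}

/* Consumer side: mirror image of the producer. */
static bool toy_spsc_dequeue(struct toy_spsc_ring *r, void **obj)
{
    uint32_t cons = atomic_load_explicit(&r->cons_tail, memory_order_relaxed);
    uint32_t prod = atomic_load_explicit(&r->prod_tail, memory_order_acquire);

    if (prod == cons)
        return false;                       /* ring is empty */

    *obj = r->slots[cons & TOY_RING_MASK];
    /* Release the slot back to the producer. */
    atomic_store_explicit(&r->cons_tail, cons + 1, memory_order_release);
    return true;
}
```

Starting both indexes near UINT32_MAX is an easy way to exercise the
wraparound path. The sketch is only safe with exactly one enqueuing and
one dequeuing thread, which is the restriction the SP/SC text describes.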
>
> Then the second part that would go to patch3 "ring: introduce RTS ring mode".
>
> > +
> > +MP_RTS/MC_RTS
> > +~~~~~~~~~~~~~
> > +
> > +Multi-producer (/multi-consumer) with Relaxed Tail Sync (RTS) mode.
> > +The main difference from original MP/MC algorithm is that
>
> from the original*
>
> > +tail value is increased not by every thread that finished enqueue/dequeue,
> > +but only by the last one.
> > +That allows threads to avoid spinning on ring tail value,
> > +leaving actual tail value change to the last thread at a given instance.
> > +That technique helps to avoid Lock-Waiter-Preemtion (LWP) problem on tail
>
> the Lock-Waiter-Preemption*
>
> > +update and improves average enqueue/dequeue times on overcommitted systems.
> > +To achieve that RTS requires 2 64-bit CAS for each enqueue(/dequeue) operation:
> > +one for head update, second for tail update.
> > +In comparison original MP/MC algorithm requires one 32-bit CAS
>
> the original*
>
> > +for head update and waiting/spinning on tail value.
> > +
>
> End of second part.
>
> Third part that would go to patch 5 "ring: introduce HTS ring mode".
>
> > +MP_HTS/MC_HTS
> > +~~~~~~~~~~~~~
> > +
> > +Multi-producer (/multi-consumer) with Head/Tail Sync (HTS) mode.
> > +In that mode enqueue/dequeue operation is fully serialized:
> > +at any given moment only one enqueue/dequeue operation can proceed.
> > +This is achieved by allowing a thread to proceed with changing ``head.value``
> > +only when ``head.value == tail.value``.
> > +Both head and tail values are updated atomically (as one 64-bit value).
> > +To achieve that 64-bit CAS is used by head update routine.
> > +That technique also avoids Lock-Waiter-Preemtion (LWP) problem on tail
>
> the Lock-Waiter-Preemption*
>
> > +update and helps to improve ring enqueue/dequeue behavior in overcommitted
> > +scenarios. Another advantage of fully serialized producer/consumer -
> > +it provides ability to implement MT safe peek API for rte_ring.
>
> it provides the ability*
>
> End of 3rd part.
>
> Last part would go to patch 7 "ring: introduce peek style API".
>
> > +
> > +
> > +Ring Peek API
> > +-------------
> > +
> > +For ring with serialized producer/consumer (HTS sync mode) it is possible
>
> double space.
>
> > +to split public enqueue/dequeue API into two phases:
> > +
> > +* enqueue/dequeue start
> > +
> > +* enqueue/dequeue finish
> > +
> > +That allows user to inspect objects in the ring without removing them
> > +from it (aka MT safe peek) and reserve space for the objects in the ring
> > +before actual enqueue.
> > +Note that this API is available only for two sync modes:
> > +
> > +* Single Producer/Single Consumer (SP/SC)
> > +
> > +* Multi-producer/Multi-consumer with Head/Tail Sync (HTS)
> > +
> > +It is a user responsibility to create/init ring with appropriate sync modes
> > +selected. As an example of usage:
> > +
> > +.. code-block:: c
> > +
> > +    /* read 1 elem from the ring: */
> > +    uint32_t n = rte_ring_dequeue_bulk_start(ring, &obj, 1, NULL);
> > +    if (n != 0) {
> > +        /* examine object */
> > +        if (object_examine(obj) == KEEP)
> > +            /* decided to keep it in the ring. */
> > +            rte_ring_dequeue_finish(ring, 0);
> > +        else
> > +            /* decided to remove it from the ring. */
> > +            rte_ring_dequeue_finish(ring, n);
> > +    }
> > +
> > +Note that between ``_start_`` and ``_finish_`` none other thread can proceed
> > +with enqueue(/dequeue) operation till ``_finish_`` completes.
> > +
>
> --
> David Marchand
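
For anyone reading along, the HTS rule quoted above (a thread may move
head only while head == tail, with both updated as one 64-bit value) and
the start/finish split of the peek API can be modelled with a small
standalone sketch using C11 atomics. All toy_* names are hypothetical,
and the layout (head in the upper 32 bits, tail in the lower 32 bits) is
an assumption made for the illustration, not the layout used by the
actual patches. A real implementation would also check ring occupancy
and would typically wait and retry rather than fail immediately.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model: head and tail packed into one 64-bit word. */
typedef struct {
    _Atomic uint64_t raw;
} toy_hts_t;

static inline uint64_t toy_pack(uint32_t head, uint32_t tail)
{
    return ((uint64_t)head << 32) | tail;
}

static inline uint32_t toy_head(uint64_t raw) { return (uint32_t)(raw >> 32); }
static inline uint32_t toy_tail(uint64_t raw) { return (uint32_t)raw; }

/*
 * "start" phase: reserve n positions. Succeeds only when no other
 * operation is in flight, i.e. when head == tail. On success *old_head
 * receives the first reserved position.
 */
static bool toy_hts_try_start(toy_hts_t *ht, uint32_t n, uint32_t *old_head)
{
    uint64_t cur = atomic_load_explicit(&ht->raw, memory_order_acquire);

    if (toy_head(cur) != toy_tail(cur))
        return false;   /* someone is between start and finish */

    /* One 64-bit CAS moves head while checking that tail is unchanged. */
    uint64_t next = toy_pack(toy_head(cur) + n, toy_tail(cur));
    if (!atomic_compare_exchange_strong_explicit(&ht->raw, &cur, next,
            memory_order_acquire, memory_order_relaxed))
        return false;   /* lost the race to another thread */

    *old_head = toy_head(cur);
    return true;
}

/*
 * "finish" phase: commit n of the reserved positions (n == 0 keeps
 * everything in place, which is the peek case) and set head == tail
 * again so that the next operation may start.
 */
static void toy_hts_finish(toy_hts_t *ht, uint32_t n)
{
    uint64_t cur = atomic_load_explicit(&ht->raw, memory_order_relaxed);

    /* Calling finish without a matching start is a caller bug. */
    assert(toy_head(cur) != toy_tail(cur));

    uint32_t pos = toy_tail(cur) + n;
    atomic_store_explicit(&ht->raw, toy_pack(pos, pos), memory_order_release);
}
```

In this model a second toy_hts_try_start() issued between a successful
start and its finish always fails, which mirrors the note at the end of
the patch that no other thread can proceed until ``_finish_`` completes.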