<snip> > > > > > Hi Honnappa, > > > > From a quick walkthrough, I have some questions/comments, please see > > below. > Hi Olivier, appreciate your input. > > > > > On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote: > > > Add scatter gather APIs to avoid intermediate memcpy. Use cases that > > > involve copying large amount of data to/from the ring can benefit > > > from these APIs. > > > > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > > --- > > > lib/librte_ring/meson.build | 3 +- > > > lib/librte_ring/rte_ring_elem.h | 1 + > > > lib/librte_ring/rte_ring_peek_sg.h | 552 > > > +++++++++++++++++++++++++++++ > > > 3 files changed, 555 insertions(+), 1 deletion(-) create mode > > > 100644 lib/librte_ring/rte_ring_peek_sg.h > > > > > > diff --git a/lib/librte_ring/meson.build > > > b/lib/librte_ring/meson.build index 31c0b4649..377694713 100644 > > > --- a/lib/librte_ring/meson.build > > > +++ b/lib/librte_ring/meson.build > > > @@ -12,4 +12,5 @@ headers = files('rte_ring.h', > > > 'rte_ring_peek.h', > > > 'rte_ring_peek_c11_mem.h', > > > 'rte_ring_rts.h', > > > - 'rte_ring_rts_c11_mem.h') > > > + 'rte_ring_rts_c11_mem.h', > > > + 'rte_ring_peek_sg.h') > > > diff --git a/lib/librte_ring/rte_ring_elem.h > > > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15 100644 > > > --- a/lib/librte_ring/rte_ring_elem.h > > > +++ b/lib/librte_ring/rte_ring_elem.h > > > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring > > > *r, void *obj_table, > > > > > > #ifdef ALLOW_EXPERIMENTAL_API > > > #include <rte_ring_peek.h> > > > +#include <rte_ring_peek_sg.h> > > > #endif > > > > > > #include <rte_ring.h> > > > diff --git a/lib/librte_ring/rte_ring_peek_sg.h > > > b/lib/librte_ring/rte_ring_peek_sg.h > > > new file mode 100644 > > > index 000000000..97d5764a6 > > > --- /dev/null > > > +++ b/lib/librte_ring/rte_ring_peek_sg.h > > > @@ -0,0 +1,552 @@ > > > +/* SPDX-License-Identifier: BSD-3-Clause > > > + * > > > + * Copyright (c) 2020 Arm > > > + * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org > > > + * All rights reserved. > > > + * Derived from FreeBSD's bufring.h > > > + * Used as BSD-3 Licensed with permission from Kip Macy. > > > + */ > > > + > > > +#ifndef _RTE_RING_PEEK_SG_H_ > > > +#define _RTE_RING_PEEK_SG_H_ > > > + > > > +/** > > > + * @file > > > + * @b EXPERIMENTAL: this API may change without prior notice > > > + * It is not recommended to include this file directly. > > > + * Please include <rte_ring_elem.h> instead. > > > + * > > > + * Ring Peek Scatter Gather APIs > > > > I am not fully convinced by the API name. To me, "scatter/gather" is > > associated to iovecs, like for instance in [1]. The wikipedia > > definition [2] may be closer though. > > > > [1] > > > https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGathe > > r.html > > [2] https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing) > The way I understand scatter-gather is, the data to be sent to something (like > a device) is coming from multiple sources. It would require copying to put the > data together to be contiguous. If the device supports scatter-gather, such > copying is not required. The device can collect data from multiple locations > and make it contiguous. > > In the case I was looking at, one part of the data was coming from the user of > the API and another was generated by the API itself. If these two pieces of > information have to be enqueued as a single object on the ring, I had to > create an intermediate copy. But by exposing the ring memory to the user, > the intermediate copy is avoided. Hence I called it scatter-gather. > > > > > What about "zero-copy"? > I think no-copy (nc for short) or user-copy (uc for short) would convey the > meaning better. These would indicate that the rte_ring APIs are not copying > the objects and it is left to the user to do the actual copy. > > > > > Also, the "peek" term looks also a bit confusing to me, but it existed > > before your patch. I understand it for dequeue, but not for enqueue. > I kept 'peek' there because the API still offers the 'peek' API capabilities. > I am > also not sure on what 'peek' means for enqueue API. The enqueue 'peek' > API was provided to be symmetric with dequeue peek API. > > > > > Or, what about replacing the existing experimental peek API by this one? > > They look quite similar to me. > I agree, scatter gather APIs provide the peek capability and the no-copy > benefits. > Konstantin, any opinions here? > > > > > > + * Introduction of rte_ring with scatter gather serialized > > > + producer/consumer > > > + * (HTS sync mode) makes it possible to split public > > > + enqueue/dequeue API > > > + * into 3 phases: > > > + * - enqueue/dequeue start > > > + * - copy data to/from the ring > > > + * - enqueue/dequeue finish > > > + * Along with the advantages of the peek APIs, these APIs provide > > > + the ability > > > + * to avoid copying of the data to temporary area. > > > + * > > > + * Note that right now this new API is available only for two sync modes: > > > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST) > > > + * 2) Serialized Producer/Serialized Consumer > (RTE_RING_SYNC_MT_HTS). > > > + * It is a user responsibility to create/init ring with appropriate > > > + sync > > > + * modes selected. > > > + * > > > + * Example usage: > > > + * // read 1 elem from the ring: > > > > Comment should be "prepare enqueuing 32 objects" > > > > > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL); > > > + * if (n != 0) { > > > + * //Copy objects in the ring > > > + * memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t)); > > > + * if (n != sgd->n1) > > > + * //Second memcpy because of wrapround > > > + * n2 = n - sgd->n1; > > > + * memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t)); > > > > Missing { } > > > > > + * rte_ring_dequeue_sg_finish(ring, n); > > > > Should be enqueue > > > Thanks, will correct them. > > > > + * } > > > + * > > > + * Note that between _start_ and _finish_ none other thread can > > > + proceed > > > + * with enqueue(/dequeue) operation till _finish_ completes. > > > + */ > > > + > > > +#ifdef __cplusplus > > > +extern "C" { > > > +#endif > > > + > > > +#include <rte_ring_peek_c11_mem.h> > > > + > > > +/* Rock that needs to be passed between reserve and commit APIs */ > > > +struct rte_ring_sg_data { > > > + /* Pointer to the first space in the ring */ > > > + void **ptr1; > > > + /* Pointer to the second space in the ring if there is wrap-around */ > > > + void **ptr2; > > > + /* Number of elements in the first pointer. If this is equal to > > > + * the number of elements requested, then ptr2 is NULL. > > > + * Otherwise, subtracting n1 from number of elements requested > > > + * will give the number of elements available at ptr2. > > > + */ > > > + unsigned int n1; > > > +}; > > > > Would it be possible to simply return the offset instead of this structure? > > The wrap could be managed by a __rte_ring_enqueue_elems() function. > Trying to use __rte_ring_enqueue_elems() will force temporary copy. See > below. > > > > > I mean something like this: > > > > uint32_t start; > > n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL); > > if (n != 0) { > > /* Copy objects in the ring. */ > > __rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t), > n); > For ex: 'obj' here is temporary copy. The example I provided in the comment at the top of the file is not good. I will replace the 'memcpy' with calling another API that copies the data to the ring directly. That should show the clear benefit.
> > > rte_ring_enqueue_sg_finish(ring, n); > > } > > > > It would require to slightly change __rte_ring_enqueue_elems() to > > support to be called with prod_head >= size, and wrap in that case. > > > The alternate solution I can think of requires 3 things 1) the base address of > the ring 2) Index to where to copy 3) the mask. With these 3 things one could > write the code like below: > for (i = 0; i < n; i++) { > ring_addr[(index + i) & mask] = obj[i]; // ANDing with mask will take > care of wrap-around. > } > > However, I think this does not allow for passing the address in the ring to > another function/API to copy the data (It is possible, but the user has to > calculate the actual address, worry about the wrap-around, second pointer > etc). > > The current approach hides some details and provides flexibility to the > application to use the pointer the way it wants. > > > > > Regards, > > Olivier