Hi Honnappa,

>From a quick walkthrough, I have some questions/comments, please
see below.

On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> Add scatter gather APIs to avoid intermediate memcpy. Use cases
> that involve copying large amount of data to/from the ring
> can benefit from these APIs.
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> ---
>  lib/librte_ring/meson.build        |   3 +-
>  lib/librte_ring/rte_ring_elem.h    |   1 +
>  lib/librte_ring/rte_ring_peek_sg.h | 552 +++++++++++++++++++++++++++++
>  3 files changed, 555 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_ring/rte_ring_peek_sg.h
> 
> diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> index 31c0b4649..377694713 100644
> --- a/lib/librte_ring/meson.build
> +++ b/lib/librte_ring/meson.build
> @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
>               'rte_ring_peek.h',
>               'rte_ring_peek_c11_mem.h',
>               'rte_ring_rts.h',
> -             'rte_ring_rts_c11_mem.h')
> +             'rte_ring_rts_c11_mem.h',
> +             'rte_ring_peek_sg.h')
> diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> index 938b398fc..7d3933f15 100644
> --- a/lib/librte_ring/rte_ring_elem.h
> +++ b/lib/librte_ring/rte_ring_elem.h
> @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void 
> *obj_table,
>  
>  #ifdef ALLOW_EXPERIMENTAL_API
>  #include <rte_ring_peek.h>
> +#include <rte_ring_peek_sg.h>
>  #endif
>  
>  #include <rte_ring.h>
> diff --git a/lib/librte_ring/rte_ring_peek_sg.h 
> b/lib/librte_ring/rte_ring_peek_sg.h
> new file mode 100644
> index 000000000..97d5764a6
> --- /dev/null
> +++ b/lib/librte_ring/rte_ring_peek_sg.h
> @@ -0,0 +1,552 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + *
> + * Copyright (c) 2020 Arm
> + * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org
> + * All rights reserved.
> + * Derived from FreeBSD's bufring.h
> + * Used as BSD-3 Licensed with permission from Kip Macy.
> + */
> +
> +#ifndef _RTE_RING_PEEK_SG_H_
> +#define _RTE_RING_PEEK_SG_H_
> +
> +/**
> + * @file
> + * @b EXPERIMENTAL: this API may change without prior notice
> + * It is not recommended to include this file directly.
> + * Please include <rte_ring_elem.h> instead.
> + *
> + * Ring Peek Scatter Gather APIs

I am not fully convinced by the API name. To me, "scatter/gather" is
associated to iovecs, like for instance in [1]. The wikipedia definition
[2] may be closer though.

[1] https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGather.html
[2] https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)

What about "zero-copy"?

Also, the "peek" term looks also a bit confusing to me, but it existed
before your patch. I understand it for dequeue, but not for enqueue.

Or, what about replacing the existing experimental peek API by this one?
They look quite similar to me.

> + * Introduction of rte_ring with scatter gather serialized producer/consumer
> + * (HTS sync mode) makes it possible to split public enqueue/dequeue API
> + * into 3 phases:
> + * - enqueue/dequeue start
> + * - copy data to/from the ring
> + * - enqueue/dequeue finish
> + * Along with the advantages of the peek APIs, these APIs provide the ability
> + * to avoid copying of the data to temporary area.
> + *
> + * Note that right now this new API is available only for two sync modes:
> + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> + * It is a user responsibility to create/init ring with appropriate sync
> + * modes selected.
> + *
> + * Example usage:
> + * // read 1 elem from the ring:

Comment should be "prepare enqueuing 32 objects"

> + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> + * if (n != 0) {
> + *   //Copy objects in the ring
> + *   memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> + *   if (n != sgd->n1)
> + *           //Second memcpy because of wrapround
> + *           n2 = n - sgd->n1;
> + *           memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));

Missing { }

> + *   rte_ring_dequeue_sg_finish(ring, n);

Should be enqueue

> + * }
> + *
> + * Note that between _start_ and _finish_ none other thread can proceed
> + * with enqueue(/dequeue) operation till _finish_ completes.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_ring_peek_c11_mem.h>
> +
> +/* Rock that needs to be passed between reserve and commit APIs */
> +struct rte_ring_sg_data {
> +     /* Pointer to the first space in the ring */
> +     void **ptr1;
> +     /* Pointer to the second space in the ring if there is wrap-around */
> +     void **ptr2;
> +     /* Number of elements in the first pointer. If this is equal to
> +      * the number of elements requested, then ptr2 is NULL.
> +      * Otherwise, subtracting n1 from number of elements requested
> +      * will give the number of elements available at ptr2.
> +      */
> +     unsigned int n1;
> +};

Would it be possible to simply return the offset instead of this structure?
The wrap could be managed by a __rte_ring_enqueue_elems() function.

I mean something like this:

        uint32_t start;
        n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL);
        if (n != 0) {
                /* Copy objects in the ring. */
                __rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t), 
n);
                rte_ring_enqueue_sg_finish(ring, n);
        }

It would require to slightly change __rte_ring_enqueue_elems() to support
to be called with prod_head >= size, and wrap in that case.


Regards,
Olivier

Reply via email to