<snip>

> Subject: RE: [PATCH v3 3/9] ring: introduce RTS ring mode
> 
> > > Introduce relaxed tail sync (RTS) mode for MT ring synchronization.
> > > Aim to reduce stall times in case when ring is used on overcommited
> > > cpus (multiple active threads on the same cpu).
> > > The main difference from original MP/MC algorithm is that tail value
> > > is increased not by every thread that finished enqueue/dequeue, but
> > > only by the last one.
> > > That allows threads to avoid spinning on ring tail value, leaving
> > > actual tail value change to the last thread in the update queue.
> > >
> > > check-abi.sh reports what I believe is a false-positive about ring
> > > cons/prod changes. As a workaround, devtools/libabigail.abignore is
> > > updated to suppress *struct ring* related errors.
> > This can be removed from the commit message.
> >
> > >
> > > Signed-off-by: Konstantin Ananyev <konstantin.anan...@intel.com>
> > > ---
> > >  devtools/libabigail.abignore           |   7 +
> > >  lib/librte_ring/Makefile               |   5 +-
> > >  lib/librte_ring/meson.build            |   5 +-
> > >  lib/librte_ring/rte_ring.c             | 100 +++++++-
> > >  lib/librte_ring/rte_ring.h             | 110 ++++++++-
> > >  lib/librte_ring/rte_ring_elem.h        |  86 ++++++-
> > >  lib/librte_ring/rte_ring_rts.h         | 316 +++++++++++++++++++++++++
> > >  lib/librte_ring/rte_ring_rts_elem.h    | 205 ++++++++++++++++
> > >  lib/librte_ring/rte_ring_rts_generic.h | 210 ++++++++++++++++
> > >  9 files changed, 1015 insertions(+), 29 deletions(-)  create mode
> > > 100644 lib/librte_ring/rte_ring_rts.h  create mode 100644
> > > lib/librte_ring/rte_ring_rts_elem.h
> > >  create mode 100644 lib/librte_ring/rte_ring_rts_generic.h
> > >
> > > diff --git a/devtools/libabigail.abignore
> > > b/devtools/libabigail.abignore index a59df8f13..cd86d89ca 100644
> > > --- a/devtools/libabigail.abignore
> > > +++ b/devtools/libabigail.abignore
> > > @@ -11,3 +11,10 @@
> > >          type_kind = enum
> > >          name = rte_crypto_asym_xform_type
> > >          changed_enumerators =
> RTE_CRYPTO_ASYM_XFORM_TYPE_LIST_END
> > > +; Ignore updates of ring prod/cons
> > > +[suppress_type]
> > > +        type_kind = struct
> > > +        name = rte_ring
> > > +[suppress_type]
> > > +        type_kind = struct
> > > +        name = rte_event_ring
> > Does this block the reporting of these structures forever?
> 
> Till we'll have a fix in libabigail, then we can remove these lines.
> I don't know any better alternative.
David, does this block all issues in the future for rte_ring library?

> 
> >
> > > diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile
> > > index 917c560ad..8f5c284cc 100644
> > > --- a/lib/librte_ring/Makefile
> > > +++ b/lib/librte_ring/Makefile
> > > @@ -18,6 +18,9 @@ SRCS-$(CONFIG_RTE_LIBRTE_RING) := rte_ring.c
> > > SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h \
> > >                                   rte_ring_elem.h \
> > >                                   rte_ring_generic.h \
> > > -                                 rte_ring_c11_mem.h
> > > +                                 rte_ring_c11_mem.h \
> > > +                                 rte_ring_rts.h \
> > > +                                 rte_ring_rts_elem.h \
> > > +                                 rte_ring_rts_generic.h
> > >
> > >  include $(RTE_SDK)/mk/rte.lib.mk
> > > diff --git a/lib/librte_ring/meson.build
> > > b/lib/librte_ring/meson.build index f2f3ccc88..612936afb 100644
> > > --- a/lib/librte_ring/meson.build
> > > +++ b/lib/librte_ring/meson.build
> > > @@ -5,7 +5,10 @@ sources = files('rte_ring.c')  headers = 
> > > files('rte_ring.h',
> > >           'rte_ring_elem.h',
> > >           'rte_ring_c11_mem.h',
> > > -         'rte_ring_generic.h')
> > > +         'rte_ring_generic.h',
> > > +         'rte_ring_rts.h',
> > > +         'rte_ring_rts_elem.h',
> > > +         'rte_ring_rts_generic.h')
> > >
> > >  # rte_ring_create_elem and rte_ring_get_memsize_elem are
> > > experimental allow_experimental_apis = true diff --git
> > > a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c index
> > > fa5733907..222eec0fb 100644
> > > --- a/lib/librte_ring/rte_ring.c
> > > +++ b/lib/librte_ring/rte_ring.c
> > > @@ -45,6 +45,9 @@ EAL_REGISTER_TAILQ(rte_ring_tailq)
> > >  /* true if x is a power of 2 */
> > >  #define POWEROF2(x) ((((x)-1) & (x)) == 0)
> > >
> > > +/* by default set head/tail distance as 1/8 of ring capacity */
> > > +#define HTD_MAX_DEF      8
> > > +
> > >  /* return the size of memory occupied by a ring */  ssize_t
> > > rte_ring_get_memsize_elem(unsigned int esize, unsigned int count) @@
> > > -
> > > 79,11 +82,84 @@ rte_ring_get_memsize(unsigned int count)
> > >   return rte_ring_get_memsize_elem(sizeof(void *), count);  }
> > >
> > > +/*
> > > + * internal helper function to reset prod/cons head-tail values.
> > > + */
> > > +static void
> > > +reset_headtail(void *p)
> > > +{
> > > + struct rte_ring_headtail *ht;
> > > + struct rte_ring_rts_headtail *ht_rts;
> > > +
> > > + ht = p;
> > > + ht_rts = p;
> > > +
> > > + switch (ht->sync_type) {
> > > + case RTE_RING_SYNC_MT:
> > > + case RTE_RING_SYNC_ST:
> > > +         ht->head = 0;
> > > +         ht->tail = 0;
> > > +         break;
> > > + case RTE_RING_SYNC_MT_RTS:
> > > +         ht_rts->head.raw = 0;
> > > +         ht_rts->tail.raw = 0;
> > > +         break;
> > > + default:
> > > +         /* unknown sync mode */
> > > +         RTE_ASSERT(0);
> > > + }
> > > +}
> > > +
> > >  void
> > >  rte_ring_reset(struct rte_ring *r)
> > >  {
> > > - r->prod.head = r->cons.head = 0;
> > > - r->prod.tail = r->cons.tail = 0;
> > > + reset_headtail(&r->prod);
> > > + reset_headtail(&r->cons);
> > > +}
> > > +
> > > +/*
> > > + * helper function, calculates sync_type values for prod and cons
> > > + * based on input flags. Returns zero at success or negative
> > > + * errno value otherwise.
> > > + */
> > > +static int
> > > +get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
> > > + enum rte_ring_sync_type *cons_st)
> > > +{
> > > + static const uint32_t prod_st_flags =
> > > +         (RING_F_SP_ENQ | RING_F_MP_RTS_ENQ);
> > > + static const uint32_t cons_st_flags =
> > > +         (RING_F_SC_DEQ | RING_F_MC_RTS_DEQ);
> > > +
> > > + switch (flags & prod_st_flags) {
> > > + case 0:
> > > +         *prod_st = RTE_RING_SYNC_MT;
> > > +         break;
> > > + case RING_F_SP_ENQ:
> > > +         *prod_st = RTE_RING_SYNC_ST;
> > > +         break;
> > > + case RING_F_MP_RTS_ENQ:
> > > +         *prod_st = RTE_RING_SYNC_MT_RTS;
> > > +         break;
> > > + default:
> > > +         return -EINVAL;
> > > + }
> > > +
> > > + switch (flags & cons_st_flags) {
> > > + case 0:
> > > +         *cons_st = RTE_RING_SYNC_MT;
> > > +         break;
> > > + case RING_F_SC_DEQ:
> > > +         *cons_st = RTE_RING_SYNC_ST;
> > > +         break;
> > > + case RING_F_MC_RTS_DEQ:
> > > +         *cons_st = RTE_RING_SYNC_MT_RTS;
> > > +         break;
> > > + default:
> > > +         return -EINVAL;
> > > + }
> > > +
> > > + return 0;
> > >  }
> > >
> > >  int
> > > @@ -100,16 +176,20 @@ rte_ring_init(struct rte_ring *r, const char
> > > *name, unsigned count,
> > >   RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
> > >                     RTE_CACHE_LINE_MASK) != 0);
> > >
> > > + RTE_BUILD_BUG_ON(offsetof(struct rte_ring_headtail, sync_type) !=
> > > +         offsetof(struct rte_ring_rts_headtail, sync_type));
> > > + RTE_BUILD_BUG_ON(offsetof(struct rte_ring_headtail, tail) !=
> > > +         offsetof(struct rte_ring_rts_headtail, tail.val.pos));
> > > +
> > >   /* init the ring structure */
> > >   memset(r, 0, sizeof(*r));
> > >   ret = strlcpy(r->name, name, sizeof(r->name));
> > >   if (ret < 0 || ret >= (int)sizeof(r->name))
> > >           return -ENAMETOOLONG;
> > >   r->flags = flags;
> > > - r->prod.sync_type = (flags & RING_F_SP_ENQ) ?
> > > -         RTE_RING_SYNC_ST : RTE_RING_SYNC_MT;
> > > - r->cons.sync_type = (flags & RING_F_SC_DEQ) ?
> > > -         RTE_RING_SYNC_ST : RTE_RING_SYNC_MT;
> > > + ret = get_sync_type(flags, &r->prod.sync_type, &r->cons.sync_type);
> > > + if (ret != 0)
> > > +         return ret;
> > >
> > >   if (flags & RING_F_EXACT_SZ) {
> > >           r->size = rte_align32pow2(count + 1); @@ -126,8 +206,12
> @@
> > > rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
> > >           r->mask = count - 1;
> > >           r->capacity = r->mask;
> > >   }
> > > - r->prod.head = r->cons.head = 0;
> > > - r->prod.tail = r->cons.tail = 0;
> > > +
> > > + /* set default values for head-tail distance */
> > > + if (flags & RING_F_MP_RTS_ENQ)
> > > +         rte_ring_set_prod_htd_max(r, r->capacity / HTD_MAX_DEF);
> > > + if (flags & RING_F_MC_RTS_DEQ)
> > > +         rte_ring_set_cons_htd_max(r, r->capacity / HTD_MAX_DEF);
> > >
> > >   return 0;
> > >  }
> > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > > index
> > > d4775a063..f6f084d79 100644
> > > --- a/lib/librte_ring/rte_ring.h
> > > +++ b/lib/librte_ring/rte_ring.h
> > > @@ -48,6 +48,7 @@ extern "C" {
> > >  #include <rte_branch_prediction.h>
> > >  #include <rte_memzone.h>
> > >  #include <rte_pause.h>
> > > +#include <rte_debug.h>
> > >
> > >  #define RTE_TAILQ_RING_NAME "RTE_RING"
> > >
> > > @@ -65,10 +66,13 @@ enum rte_ring_queue_behavior {  enum
> > > rte_ring_sync_type {
> > >   RTE_RING_SYNC_MT,     /**< multi-thread safe (default mode) */
> > >   RTE_RING_SYNC_ST,     /**< single thread only */
> > > +#ifdef ALLOW_EXPERIMENTAL_API
> > > + RTE_RING_SYNC_MT_RTS, /**< multi-thread relaxed tail sync */
> > > #endif
> > >  };
> > >
> > >  /**
> > > - * structure to hold a pair of head/tail values and other metadata.
> > > + * structures to hold a pair of head/tail values and other metadata.
> > >   * Depending on sync_type format of that structure might be different,
> > >   * but offset for *sync_type* and *tail* values should remain the same.
> > >   */
> > > @@ -84,6 +88,21 @@ struct rte_ring_headtail {
> > >   };
> > >  };
> > >
> > > +union rte_ring_ht_poscnt {
> > nit, this is specific to RTS, may be change this to rte_ring_rts_ht_poscnt?
> 
> Ok.
> 
> >
> > > + uint64_t raw;
> > > + struct {
> > > +         uint32_t cnt; /**< head/tail reference counter */
> > > +         uint32_t pos; /**< head/tail position */
> > > + } val;
> > > +};
> > > +
> > > +struct rte_ring_rts_headtail {
> > > + volatile union rte_ring_ht_poscnt tail;
> > > + enum rte_ring_sync_type sync_type;  /**< sync type of prod/cons */
> > > + uint32_t htd_max;   /**< max allowed distance between head/tail */
> > > + volatile union rte_ring_ht_poscnt head; };
> > > +
> > >  /**
> > >   * An RTE ring structure.
> > >   *
> > > @@ -111,11 +130,21 @@ struct rte_ring {
> > >   char pad0 __rte_cache_aligned; /**< empty cache line */
> > >
> > >   /** Ring producer status. */
> > > - struct rte_ring_headtail prod __rte_cache_aligned;
> > > + RTE_STD_C11
> > > + union {
> > > +         struct rte_ring_headtail prod;
> > > +         struct rte_ring_rts_headtail rts_prod;
> > > + }  __rte_cache_aligned;
> > > +
> > >   char pad1 __rte_cache_aligned; /**< empty cache line */
> > >
> > >   /** Ring consumer status. */
> > > - struct rte_ring_headtail cons __rte_cache_aligned;
> > > + RTE_STD_C11
> > > + union {
> > > +         struct rte_ring_headtail cons;
> > > +         struct rte_ring_rts_headtail rts_cons;
> > > + }  __rte_cache_aligned;
> > > +
> > >   char pad2 __rte_cache_aligned; /**< empty cache line */  };
> > >
> > > @@ -132,6 +161,9 @@ struct rte_ring {  #define RING_F_EXACT_SZ
> > > 0x0004  #define RTE_RING_SZ_MASK  (0x7fffffffU) /**< Ring size mask
> > > */
> > >
> > > +#define RING_F_MP_RTS_ENQ 0x0008 /**< The default enqueue is "MP
> RTS".
> > > +*/ #define RING_F_MC_RTS_DEQ 0x0010 /**< The default dequeue is
> "MC
> > > +RTS". */
> > > +
> > >  #define __IS_SP RTE_RING_SYNC_ST
> > >  #define __IS_MP RTE_RING_SYNC_MT
> > >  #define __IS_SC RTE_RING_SYNC_ST
> > > @@ -461,6 +493,10 @@ rte_ring_sp_enqueue_bulk(struct rte_ring *r,
> > > void * const *obj_table,
> > >                   RTE_RING_SYNC_ST, free_space);
> > >  }
> > >
> > > +#ifdef ALLOW_EXPERIMENTAL_API
> > > +#include <rte_ring_rts.h>
> > > +#endif
> > > +
> > >  /**
> > >   * Enqueue several objects on a ring.
> > >   *
> > > @@ -484,8 +520,21 @@ static __rte_always_inline unsigned int
> > > rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
> > >                 unsigned int n, unsigned int *free_space)  {
> > > - return __rte_ring_do_enqueue(r, obj_table, n,
> > > RTE_RING_QUEUE_FIXED,
> > > -                 r->prod.sync_type, free_space);
> > > + switch (r->prod.sync_type) {
> > > + case RTE_RING_SYNC_MT:
> > > +         return rte_ring_mp_enqueue_bulk(r, obj_table, n, free_space);
> > > + case RTE_RING_SYNC_ST:
> > > +         return rte_ring_sp_enqueue_bulk(r, obj_table, n, free_space);
> > Have you validated if these affect the performance for the existing APIs?
> 
> I run ring_pmd_perf_autotest
> (AFAIK, that's the only one of our perf tests that calls
> rte_ring_enqueue/dequeue), and didn't see any real difference in perf
> numbers.
> 
> > I am also wondering why should we support these new modes in the legacy
> APIs?
> 
> Majority of DPDK users still do use legacy API, and I am not sure all of them
> will be happy to switch to _elem_ one manually.
> Plus I can't see how we can justify that after let say:
> rte_ring_init(ring, ..., RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ); returns
> with success valid call to rte_ring_enqueue(ring,...) should fail.
Agree, I think the only way right now is through documentation.

> 
> > I think users should move to use rte_ring_xxx_elem APIs. If users want to
> use RTS/HTS it will be a good time for them to move to new APIs.
> 
> If they use rte_ring_enqueue/dequeue all they have to do - just change flags
> in ring_create/ring_init call.
> With what you suggest - they have to change every
> rte_ring_enqueue/dequeue to rte_ring_elem_enqueue/dequeue.
> That's much bigger code churn.
But these are just 1 to 1 mapped.  I would think, there are not a whole lot of 
them in the application code, may be ~10 lines?
I think the bigger factor for the user here is the algorithm changes in 
rte_ring library. Bigger effort for the users would be testing rather than code 
changes in the applications. 

> 
> > They anyway have to test their code for RTS/HTS, might as well make the
> change to new APIs and test both.
> > It will be less code to maintain for the community as well.
> 
> That's true, right now there is a lot of duplication between _elem_ and legacy
> code.
>  Actually the only real diff between them - actual copying of the objects.
>  But I thought we are going to deal with that, just by changing one day all
> legacy API to wrappers around _elem_ calls, i.e something like:
> 
> static __rte_always_inline unsigned int
> rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
>                       unsigned int n, unsigned int *free_space) {
>       return  rte_ring_enqueue_elem_bulk(r, obj_table, sizeof(uintptr_t), n,
> free_space); }
> 
> That way users will switch to new API automatically, without any extra effort
> for them, and we will be able to remove legacy code.
> Do you have some other thoughts here how to deal with this legacy/elem
> conversion?
Yes, that is what I was thinking, but had not considered any addition of new 
APIs.
But, I am wondering if we should look at deprecation? If we decide to 
deprecate, it would be good to avoid making the users of RTS/HTS do the work 
twice (once to make the switch to RTS/HTS and then another to _elem_ APIs).

One thing we can do is to implement the wrappers you mentioned above for 
RTS/HTS now. I also it is worth considering to switch to these wrappers 20.05 
so that come 20.11, we have a code base that has gone through couple of 
releases' testing.
 
> 
> >
> > > #ifdef
> > > +ALLOW_EXPERIMENTAL_API
> > > + case RTE_RING_SYNC_MT_RTS:
> > > +         return rte_ring_mp_rts_enqueue_bulk(r, obj_table, n,
> > > +                 free_space);
> > > +#endif
> > > + }
> > > +
> > > + /* valid ring should never reach this point */
> > > + RTE_ASSERT(0);
> > > + return 0;
> > >  }
> > >

<snip>

> > >
> > >  #ifdef __cplusplus
> > > diff --git a/lib/librte_ring/rte_ring_elem.h
> > > b/lib/librte_ring/rte_ring_elem.h index 28f9836e6..5de0850dc 100644
> > > --- a/lib/librte_ring/rte_ring_elem.h
> > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > @@ -542,6 +542,8 @@ rte_ring_sp_enqueue_bulk_elem(struct rte_ring
> > > *r, const void *obj_table,
> > >                   RTE_RING_QUEUE_FIXED, __IS_SP, free_space);  }
> > >
> > > +#include <rte_ring_rts_elem.h>

<snip>

> > >
> > >  #ifdef __cplusplus
> > > diff --git a/lib/librte_ring/rte_ring_rts.h
> > > b/lib/librte_ring/rte_ring_rts.h new file mode 100644 index
> > > 000000000..18404fe48
> > > --- /dev/null
> > > +++ b/lib/librte_ring/rte_ring_rts.h
> > IMO, we should not provide these APIs.
> 
> You mean only _elem_ ones, as discussed above?
Yes

> 
> >
> > > @@ -0,0 +1,316 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + *
> > > + * Copyright (c) 2010-2017 Intel Corporation
> > nit, the year should change to 2020? Look at others too.
> 
> ack, will do.
> 
> >
> > > + * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org
> > > + * All rights reserved.
> > > + * Derived from FreeBSD's bufring.h
> > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > + */
> > > +
> > > +#ifndef _RTE_RING_RTS_H_
> > > +#define _RTE_RING_RTS_H_
> > > +
> > > +/**
> > > + * @file rte_ring_rts.h
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + * It is not recommended to include this file directly.
> > > + * Please include <rte_ring.h> instead.
> > > + *
> > > + * Contains functions for Relaxed Tail Sync (RTS) ring mode.
> > > + * The main idea remains the same as for our original MP/MC
> >
> > ^^^ the
> > > +synchronization
> > > + * mechanism.
> > > + * The main difference is that tail value is increased not
> > > + * by every thread that finished enqueue/dequeue,
> > > + * but only by the last one doing enqueue/dequeue.
> > should we say 'current last' or 'last thread at a given instance'?
> >
> > > + * That allows threads to skip spinning on tail value,
> > > + * leaving actual tail value change to last thread in the update queue.
> > nit, I understand what you mean by 'update queue' here. IMO, we should
> remove it as it might confuse some.
> >
> > > + * RTS requires 2 64-bit CAS for each enqueue(/dequeue) operation:
> > > + * one for head update, second for tail update.
> > > + * As a gain it allows thread to avoid spinning/waiting on tail value.
> > > + * In comparision original MP/MC algorithm requires one 32-bit CAS
> > > + * for head update and waiting/spinning on tail value.
> > > + *
> > > + * Brief outline:
> > > + *  - introduce refcnt for both head and tail.
> > Suggesting using the same names as used in the structures.
> >
> > > + *  - increment head.refcnt for each head.value update
> > > + *  - write head:value and head:refcnt atomically (64-bit CAS)
> > > + *  - move tail.value ahead only when tail.refcnt + 1 ==
> > > + head.refcnt
> > May be add '(indicating that this is the last thread updating the tail)'
> >
> > > + *  - increment tail.refcnt when each enqueue/dequeue op finishes
> > May be add 'otherwise' at the beginning.
> >
> > > + *    (no matter is tail:value going to change or not)
> > nit                            ^^ if
> > > + *  - write tail.value and tail.recnt atomically (64-bit CAS)
> > > + *
> > > + * To avoid producer/consumer starvation:
> > > + *  - limit max allowed distance between head and tail value (HTD_MAX).
> > > + *    I.E. thread is allowed to proceed with changing head.value,
> > > + *    only when:  head.value - tail.value <= HTD_MAX
> > > + * HTD_MAX is an optional parameter.
> > > + * With HTD_MAX == 0 we'll have fully serialized ring -
> > > + * i.e. only one thread at a time will be able to enqueue/dequeue
> > > + * to/from the ring.
> > > + * With HTD_MAX >= ring.capacity - no limitation.
> > > + * By default HTD_MAX == ring.capacity / 8.
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +#include <rte_ring_rts_generic.h>
> > > +

<snip>

> > > +
> > > +#endif /* _RTE_RING_RTS_H_ */
> > > diff --git a/lib/librte_ring/rte_ring_rts_elem.h
> > > b/lib/librte_ring/rte_ring_rts_elem.h
> > > new file mode 100644
> > > index 000000000..71a331b23
> > > --- /dev/null
> > > +++ b/lib/librte_ring/rte_ring_rts_elem.h
> > > @@ -0,0 +1,205 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + *
> > > + * Copyright (c) 2010-2017 Intel Corporation
> > > + * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org
> > > + * All rights reserved.
> > > + * Derived from FreeBSD's bufring.h
> > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > + */
> > > +
> > > +#ifndef _RTE_RING_RTS_ELEM_H_
> > > +#define _RTE_RING_RTS_ELEM_H_
> > > +
> > > +/**
> > > + * @file rte_ring_rts_elem.h
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + *
> > > + * It is not recommended to include this file directly.
> > > + * Please include <rte_ring_elem.h> instead.
> > > + * Contains *ring_elem* functions for Relaxed Tail Sync (RTS) ring mode.
> > > + * for more details please refer to <rte_ring_rts.h>.
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +#include <rte_ring_rts_generic.h>
> > > +
> > > +/**
> > > + * @internal Enqueue several objects on the RTS ring.
> > > + *
> > > + * @param r
> > > + *   A pointer to the ring structure.
> > > + * @param obj_table
> > > + *   A pointer to a table of void * pointers (objects).
> > > + * @param n
> > > + *   The number of objects to add in the ring from the obj_table.
> > > + * @param behavior
> > > + *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a
> ring
> > > + *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible
> from
> > > ring
> > > + * @param free_space
> > > + *   returns the amount of space after the enqueue operation has finished
> > > + * @return
> > > + *   Actual number of objects enqueued.
> > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > + */
> > > +static __rte_always_inline unsigned int
> > > +__rte_ring_do_rts_enqueue_elem(struct rte_ring *r, void * const
> > > +*obj_table,
> > obj_table should be of type 'const void * obj_table' (looks like copy paste
> error). Please check the other APIs below too.
> >
> > > + uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
> > 'esize' is not documented in the comments above the function. You can
> > copy the header from rte_ring_elem.h file. Please check other APIs as well.
> 
> Ack to both, will fix.
> 
> >
> > > + uint32_t *free_space)
> > > +{
> > > + uint32_t free, head;
> > > +
> > > + n =  __rte_ring_rts_move_prod_head(r, n, behavior, &head, &free);
> > > +
> > > + if (n != 0) {
> > > +         __rte_ring_enqueue_elems(r, head, obj_table, esize, n);
> > > +         __rte_ring_rts_update_tail(&r->rts_prod);
> > > + }
> > > +
> > > + if (free_space != NULL)
> > > +         *free_space = free - n;
> > > + return n;
> > > +}
> > > +

<snip>

> > > +
> > > +#endif /* _RTE_RING_RTS_ELEM_H_ */
> > > diff --git a/lib/librte_ring/rte_ring_rts_generic.h
> > > b/lib/librte_ring/rte_ring_rts_generic.h
> > > new file mode 100644
> > > index 000000000..f88460d47
> > > --- /dev/null
> > > +++ b/lib/librte_ring/rte_ring_rts_generic.h
> > I do not know the benefit to providing the generic version. Do you know
> why this was done in the legacy APIs?
> 
> I think at first we had generic API only, then later C11 was added.
> As I remember, C11 one on IA was measured as a bit slower then generic,
> so it was decided to keep both.
> 
> > If there is no performance difference between generic and C11 versions,
> should we just skip the generic version?
> > The oldest compiler in CI are GCC 4.8.5 and Clang 3.4.2 and C11 built-ins
> are supported earlier than these compiler versions.
> > I feel the code is growing exponentially in rte_ring library and we should 
> > try
> to cut non-value-ass code/APIs aggressively.
> 
> I'll check is there perf difference for RTS and HTS between generic and C11
> versions on IA.
> Meanwhile please have a proper look at C11 implementation, I am not that
> familiar with C11 atomics yet.
ok

> If there would be no problems with it and no noticeable diff in performance -
> I am ok to have for RTS/HTS modes C11 version only.
> 
> >
> > > @@ -0,0 +1,210 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + *
> > > + * Copyright (c) 2010-2017 Intel Corporation
> > > + * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org
> > > + * All rights reserved.
> > > + * Derived from FreeBSD's bufring.h
> > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > + */
> > > +
> > > +#ifndef _RTE_RING_RTS_GENERIC_H_
> > > +#define _RTE_RING_RTS_GENERIC_H_
> > > +
> > > +/**
> > > + * @file rte_ring_rts_generic.h
> > > + * It is not recommended to include this file directly,
> > > + * include <rte_ring.h> instead.
> > > + * Contains internal helper functions for Relaxed Tail Sync (RTS) ring
> mode.
> > > + * For more information please refer to <rte_ring_rts.h>.
> > > + */

<snip>

Reply via email to