<snip> > Subject: RE: [PATCH v3 3/9] ring: introduce RTS ring mode > > > > > > > > > > > +#ifdef ALLOW_EXPERIMENTAL_API > > > > > +#include <rte_ring_rts.h> > > > > > +#endif > > > > > + > > > > > /** > > > > > * Enqueue several objects on a ring. > > > > > * > > > > > @@ -484,8 +520,21 @@ static __rte_always_inline unsigned int > > > > > rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table, > > > > > unsigned int n, unsigned int *free_space) { > > > > > - return __rte_ring_do_enqueue(r, obj_table, n, > > > > > RTE_RING_QUEUE_FIXED, > > > > > - r->prod.sync_type, free_space); > > > > > + switch (r->prod.sync_type) { > > > > > + case RTE_RING_SYNC_MT: > > > > > + return rte_ring_mp_enqueue_bulk(r, obj_table, n, > free_space); > > > > > + case RTE_RING_SYNC_ST: > > > > > + return rte_ring_sp_enqueue_bulk(r, obj_table, n, > free_space); > > > > Have you validated if these affect the performance for the existing > > > > APIs? > > > > > > I run ring_pmd_perf_autotest > > > (AFAIK, that's the only one of our perf tests that calls > > > rte_ring_enqueue/dequeue), and didn't see any real difference in > > > perf numbers. > > > > > > > I am also wondering why should we support these new modes in the > > > > legacy > > > APIs? > > > > > > Majority of DPDK users still do use legacy API, and I am not sure > > > all of them will be happy to switch to _elem_ one manually. > > > Plus I can't see how we can justify that after let say: > > > rte_ring_init(ring, ..., RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ); > > > returns with success valid call to rte_ring_enqueue(ring,...) should fail. > > Agree, I think the only way right now is through documentation. > > > > > > > > > I think users should move to use rte_ring_xxx_elem APIs. If users > > > > want to > > > use RTS/HTS it will be a good time for them to move to new APIs. > > > > > > If they use rte_ring_enqueue/dequeue all they have to do - just > > > change flags in ring_create/ring_init call. > > > With what you suggest - they have to change every > > > rte_ring_enqueue/dequeue to rte_ring_elem_enqueue/dequeue. > > > That's much bigger code churn. > > But these are just 1 to 1 mapped. I would think, there are not a whole lot > > of > them in the application code, may be ~10 lines? > > I suppose it depends a lot on particular user app. > My preference not to force users to do extra changes in their code. > If we can add new functionality while keeping existing API, why not to do it? > Less disturbance for users seems a good thing to me. > > > I think the bigger factor for the user here is the algorithm changes > > in rte_ring library. Bigger effort for the users would be testing rather > > than > code changes in the applications. > > > > > > > They anyway have to test their code for RTS/HTS, might as well > > > > make the > > > change to new APIs and test both. > > > > It will be less code to maintain for the community as well. > > > > > > That's true, right now there is a lot of duplication between _elem_ > > > and legacy code. > > > Actually the only real diff between them - actual copying of the objects. > > > But I thought we are going to deal with that, just by changing one > > > day all legacy API to wrappers around _elem_ calls, i.e something like: > > > > > > static __rte_always_inline unsigned int rte_ring_enqueue_bulk(struct > > > rte_ring *r, void * const *obj_table, > > > unsigned int n, unsigned int *free_space) { > > > return rte_ring_enqueue_elem_bulk(r, obj_table, sizeof(uintptr_t), > > > n, free_space); } > > > > > > That way users will switch to new API automatically, without any > > > extra effort for them, and we will be able to remove legacy code. > > > Do you have some other thoughts here how to deal with this > > > legacy/elem conversion? > > Yes, that is what I was thinking, but had not considered any addition of new > APIs. > > But, I am wondering if we should look at deprecation? > > You mean to deprecate existing legacy API? > rte_ring_enqueue/dequeue_bulk, etc? > I don't think we need to deprecate it at all. > As long as we'll have _elem_ functions called underneath there would be one > implementation anyway, and we can leave them forever, so users wouldn't > need to change their existing code at all. Ack (assuming that the legacy APIs will be wrappers)
> > > If we decide to deprecate, it would be good to avoid making the users > > of RTS/HTS do the work twice (once to make the switch to RTS/HTS and > then another to _elem_ APIs). > > > > One thing we can do is to implement the wrappers you mentioned above > for RTS/HTS now. > > That's a very good point. > It will require some re-org to allow rte_ring.h to include rte_ring_elem.h, > but > I think it is doable, will try to make these changes in v4. > > > I also it is worth considering to switch to these wrappers 20.05 so > > that come 20.11, we have a code base that has gone through couple of > releases' testing. > > You mean wrappers for existing legacy API (MP/MC, SP/SC modes)? > It is probably too late to make such changes in 20.05, probably early 20.08 is > a good time for that. Yes, will target for 20.08 > > > > > <snip> > > > > > > > + > > > > > +#endif /* _RTE_RING_RTS_ELEM_H_ */ > > > > > diff --git a/lib/librte_ring/rte_ring_rts_generic.h > > > > > b/lib/librte_ring/rte_ring_rts_generic.h > > > > > new file mode 100644 > > > > > index 000000000..f88460d47 > > > > > --- /dev/null > > > > > +++ b/lib/librte_ring/rte_ring_rts_generic.h > > > > I do not know the benefit to providing the generic version. Do you > > > > know > > > why this was done in the legacy APIs? > > > > > > I think at first we had generic API only, then later C11 was added. > > > As I remember, C11 one on IA was measured as a bit slower then > > > generic, so it was decided to keep both. > > > > > > > If there is no performance difference between generic and C11 > > > > versions, > > > should we just skip the generic version? > > > > The oldest compiler in CI are GCC 4.8.5 and Clang 3.4.2 and C11 > > > > built-ins > > > are supported earlier than these compiler versions. > > > > I feel the code is growing exponentially in rte_ring library and > > > > we should try > > > to cut non-value-ass code/APIs aggressively. > > > > > > I'll check is there perf difference for RTS and HTS between generic > > > and C11 versions on IA. > > > Meanwhile please have a proper look at C11 implementation, I am not > > > that familiar with C11 atomics yet. > > ok > > > > > If there would be no problems with it and no noticeable diff in > > > performance - I am ok to have for RTS/HTS modes C11 version only. > > From what I see on my box, there is no much difference in terms of > performance between *generic* and *c11_mem* for RTS/HTS. > ring_stress_autotest for majority of cases shows ~1% diff, in some cases c11 > numbers are even a bit better. > So will keep c11 version only in v4. Thanks. That will remove good amount of code.