Hi Honnappa,

On Sun, Oct 20, 2019 at 07:22:54PM -0500, Honnappa Nagarahalli wrote:
> The current rte_ring hard-codes the type of the ring element to 'void *',
> hence the size of the element is hard-coded to 32b/64b. Since the ring
> element type is not an input to rte_ring APIs, it results in couple
> of issues:
> 
> 1) If an application requires to store an element which is not 64b, it
>    needs to write its own ring APIs similar to rte_event_ring APIs. This
>    creates additional burden on the programmers, who end up making
>    work-arounds and often waste memory.
> 2) If there are multiple libraries that store elements of the same
>    type, currently they would have to write their own rte_ring APIs. This
>    results in code duplication.
> 
> This patch adds new APIs to support configurable ring element size.
> The APIs support custom element sizes by allowing to define the ring
> element to be a multiple of 32b.
> 
> The aim is to achieve same performance as the existing ring
> implementation. The patch adds same performance tests that are run
> for existing APIs. This allows for performance comparison.
> 
> I also tested with memcpy. x86 shows significant improvements on bulk
> and burst tests. On the Arm platform, I used, there is a drop of
> 4% to 6% in few tests. May be this is something that we can explore
> later.
> 
> Note that this version skips changes to other libraries as I would
> like to get an agreement on the implementation from the community.
> They will be added once there is agreement on the rte_ring changes.
> 
> v6
>  - Labelled as RFC to indicate the better status
>  - Added unit tests to test the rte_ring_xxx_elem APIs
>  - Corrected 'macro based partial memcpy' (5/6) patch
>  - Added Konstantin's method after correction (6/6)
>  - Check Patch shows significant warnings and errors mainly due
>    copying code from existing test cases. None of them are harmful.
>    I will fix them once we have an agreement.
> 
> v5
>  - Use memcpy for chunks of 32B (Konstantin).
>  - Both 'ring_perf_autotest' and 'ring_perf_elem_autotest' are available
>    to compare the results easily.
>  - Copying without memcpy is also available in 1/3, if anyone wants to
>    experiment on their platform.
>  - Added other platform owners to test on their respective platforms.
> 
> v4
>  - Few fixes after more performance testing
> 
> v3
>  - Removed macro-fest and used inline functions
>    (Stephen, Bruce)
> 
> v2
>  - Change Event Ring implementation to use ring templates
>    (Jerin, Pavan)
> 
> Honnappa Nagarahalli (6):
>   test/ring: use division for cycle count calculation
>   lib/ring: apis to support configurable element size
>   test/ring: add functional tests for configurable element size ring
>   test/ring: add perf tests for configurable element size ring
>   lib/ring: copy ring elements using memcpy partially
>   lib/ring: improved copy function to copy ring elements
> 
>  app/test/Makefile                    |   2 +
>  app/test/meson.build                 |   2 +
>  app/test/test_ring_elem.c            | 859 +++++++++++++++++++++++++++
>  app/test/test_ring_perf.c            |  22 +-
>  app/test/test_ring_perf_elem.c       | 419 +++++++++++++
>  lib/librte_ring/Makefile             |   3 +-
>  lib/librte_ring/meson.build          |   4 +
>  lib/librte_ring/rte_ring.c           |  34 +-
>  lib/librte_ring/rte_ring.h           |   1 +
>  lib/librte_ring/rte_ring_elem.h      | 818 +++++++++++++++++++++++++
>  lib/librte_ring/rte_ring_version.map |   2 +
>  11 files changed, 2147 insertions(+), 19 deletions(-)
>  create mode 100644 app/test/test_ring_elem.c
>  create mode 100644 app/test/test_ring_perf_elem.c
>  create mode 100644 lib/librte_ring/rte_ring_elem.h

Sorry, I come a day after the fair.

I have only few comments on the shape (I'll reply to individual
patches). On the substance, it looks good to me. I also feel this
version is much better than the template-based versions.

Thanks
Olivier

Reply via email to