> From: Paul Szczepanek [mailto:paul.szczepa...@arm.com]
> Sent: Tuesday, 31 October 2023 19.11

[...]

> Test is added that shows potential performance gain from compression.
> In this test an array of pointers is passed through a ring between two
> cores. It shows the gain which is dependent on the bulk operation size.
> In this synthetic test run on ampere altra a substantial (up to 25%)
> performance gain is seen if done in bulk size larger than 32. At 32 it
> breaks even and lower sizes create a small (less than 5%) slowdown due
> to overhead.
>
> In a more realistic mock application running the l3 forwarding dpdk
> example that works in pipeline mode this translated into a ~5%
> throughput increase on an ampere altra.

What was the bulk size in this test?

And were the pipeline stages running on the same lcore or individual
lcores per pipeline stage?