On 4/14/2021 12:36 AM, Jerin Jacob wrote:
On Thu, Mar 18, 2021 at 3:56 PM Ruifeng Wang <ruifeng.w...@arm.com> wrote:
There are some holes in the lcore_conf data structure. The holes are
caused by alignment requirements.
For struct lcore_rx_queue, there is no need to make every element
of this type cache line aligned, because the data is not
shared between cores.
The len member of struct mbuf_table can be moved out, so that the data
is packed and there is no need to load an extra cache line when
the mbuf table is empty.
The change showed a slight performance improvement on the N1SDP platform.
Suggested-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.w...@arm.com>
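The layout change described in the commit message above can be sketched roughly as follows. This is a simplified illustration, not the definitions from the patch: the struct names (rxq_old, rxq_new, mbuf_table_old, mbuf_table_new, lcore_conf_old, lcore_conf_new), the tx_mbuf_len field, the helper functions, and all constants are assumptions chosen for the example. It compares how many cache lines hold the per-port lengths before and after moving len out of the table.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants, not taken from the patch. */
#define CACHE_LINE 64
#define MAX_RXQ    16
#define MAX_PORTS  32
#define MAX_BURST  32

/* Before: each rx queue entry is padded out to a full cache line. */
struct rxq_old {
    alignas(CACHE_LINE) uint16_t port_id;
    uint8_t queue_id;
};

/* After: entries are packed, since only one core reads them. */
struct rxq_new {
    uint16_t port_id;
    uint8_t queue_id;
};

/* Before: len sits inside each table, next to the pointer array. */
struct mbuf_table_old {
    uint16_t len;
    void *m_table[MAX_BURST];
};

/* After: the table holds only pointers. */
struct mbuf_table_new {
    void *m_table[MAX_BURST];
};

struct lcore_conf_old {
    alignas(CACHE_LINE) uint16_t n_rx_queue;
    struct rxq_old rx_queue_list[MAX_RXQ];
    struct mbuf_table_old tx_mbufs[MAX_PORTS];
};

struct lcore_conf_new {
    alignas(CACHE_LINE) uint16_t n_rx_queue;
    struct rxq_new rx_queue_list[MAX_RXQ];
    uint16_t tx_mbuf_len[MAX_PORTS];   /* hypothetical packed length array */
    struct mbuf_table_new tx_mbufs[MAX_PORTS];
};

/* Count distinct cache lines holding n values placed at base + i * stride. */
static size_t lines_touched(size_t base, size_t stride, size_t n)
{
    size_t count = 0, last = (size_t)-1;
    for (size_t i = 0; i < n; i++) {
        size_t line = (base + i * stride) / CACHE_LINE;
        if (line != last) {
            count++;
            last = line;
        }
    }
    return count;
}

/* Cache lines that hold all per-port lengths in the old layout. */
static size_t len_lines_old(void)
{
    return lines_touched(offsetof(struct lcore_conf_old, tx_mbufs) +
                         offsetof(struct mbuf_table_old, len),
                         sizeof(struct mbuf_table_old), MAX_PORTS);
}

/* Cache lines that hold all per-port lengths in the new layout. */
static size_t len_lines_new(void)
{
    return lines_touched(offsetof(struct lcore_conf_new, tx_mbuf_len),
                         sizeof(uint16_t), MAX_PORTS);
}
```

In this sketch every length in the old layout lands on its own cache line, while the packed array keeps all of them within a couple of lines, so checking whether a table is empty does not pull in the line that holds its pointers. Whether that helps on a given platform depends on cache line size and prefetch behaviour.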
This change alone is OK on the octeontx2 platform (no difference in performance).
Combining it with 3/4 shows some regression, probably due to prefetch
or 128B cache line tuning specifics.
We checked it on the Layerscape LS2088A platform. There is no difference in the
1-2 core case. However, we observe a ~2% regression with 4-8 cores.
Regards,
Hemant