l3fwd: add neon support for l3fwd

Sekhar, Ashwin Tue, 02 May 2017 04:47:39 -0700

Hi Jianbo,

I tested your neon changes on thunderx. I am seeing a performance
regression of ~10% for LPM case and ~20% for EM case with your changes.
Did you see improvement on any arm64 platform with these changes. If
yes, how much was the improvement?


FYI, I had also tried vectorizing the l3fwd app with neon. Few of the
optimizations that I can suggest that helped in my case.

* Packet data prefetch is missing in the x86 sse version compared to
the scalar version (l3fwd_lpm_send_packets vs
l3fwd_lpm_no_opt_send_packets) . I couldn't understand why this was not
done in x86. But adding the prefetch was improving performance for
thunderx.

* Offsets to some packet elements like eth_hdr, ip header, packet type
etc. are recalculated in different functions. Calculating them once,
caching them and passing them directly to different functions was
improving performance.

* There are 3 different loops in l3fwd_lpm_send_packets where we
iterate over the packets. One each for processx4_step1 and
processx4_step2 and one in send_packets_multi. Unifying these loops
were also helping.

Thanks and Regards
Ashwin

Re: [dpdk-dev] [PATCH 5/5] examples/l3fwd: add neon support for l3fwd

Reply via email to