> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhihong Wang
> Sent: Thursday, January 29, 2015 10:39 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 0/4] DPDK memcpy optimization
>
> This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
> It also extends memcpy test coverage with unaligned cases and more test
> points.
>
> Optimization techniques are summarized below:
>
> 1. Utilize full cache bandwidth
>
> 2. Enforce aligned stores
>
> 3. Apply load address alignment based on architecture features
>
> 4. Make load/store address available as early as possible
>
> 5. General optimization techniques like inlining, branch reducing, prefetch
>    pattern access
>
> --------------
> Changes in v2:
>
> 1. Reduced constant test cases in app/test/test_memcpy_perf.c for fast build
>
> 2. Modified macro definition for better code readability & safety
>
> Zhihong Wang (4):
>   app/test: Disabled VTA for memcpy test in app/test/Makefile
>   app/test: Removed unnecessary test cases in app/test/test_memcpy.c
>   app/test: Extended test coverage in app/test/test_memcpy_perf.c
>   lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE
>     and AVX platforms
>
>  app/test/Makefile                         |   6 +
>  app/test/test_memcpy.c                    |  52 +-
>  app/test/test_memcpy_perf.c               | 220 ++++---
>  .../common/include/arch/x86/rte_memcpy.h  | 680 +++++++++++++++------
>  4 files changed, 654 insertions(+), 304 deletions(-)
>
> --
> 1.9.3

Acked-by: Cunming Liang <cunming.liang at intel.com>