> > This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
> > It also extends memcpy test coverage with unaligned cases and more test
> > points.
> >
> > Optimization techniques are summarized below:
> >
> > 1. Utilize full cache bandwidth
> >
> > 2. Enforce aligned stores
> >
> > 3. Apply load address alignment based on architecture features
> >
> > 4. Make load/store address available as early as possible
> >
> > 5. General optimization techniques like inlining, branch reducing, prefetch
> > pattern access
> >
> > --------------
> > Changes in v2:
> >
> > 1. Reduced constant test cases in app/test/test_memcpy_perf.c for fast
> > build
> >
> > 2. Modified macro definition for better code readability & safety
> >
> > Zhihong Wang (4):
> >   app/test: Disabled VTA for memcpy test in app/test/Makefile
> >   app/test: Removed unnecessary test cases in app/test/test_memcpy.c
> >   app/test: Extended test coverage in app/test/test_memcpy_perf.c
> >   lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE
> >     and AVX platforms
> > Acked-by: Pablo de Lara <pablo.de.lara.guarch at intel.com>

Applied, thanks for the great work!

Note: we are still looking for a maintainer of x86 EAL.