Basic Information

Patch name: DPDK memcpy optimization v2
Brief description about test purpose: Verify memory copy and memory copy performance cases on a variety of OSes
Test Flag: Tested-by
Tester name: jingguox.fu at intel.com
Test Tool Chain information: N/A
Commit ID: 88fa98a60b34812bfed92e5b2706fcf7e1cbcbc8
Test Result Summary: Total 6 cases, 6 passed, 0 failed

Test environment

- Environment 1:
  OS: Ubuntu12.04 3.2.0-23-generic X86_64
  GCC: gcc version 4.6.3
  CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
  NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ [8086:10fb] (rev 01)

- Environment 2:
  OS: Ubuntu14.04 3.13.0-24-generic
  GCC: gcc version 4.8.2
  CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
  NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ [8086:10fb] (rev 01)

- Environment 3:
  OS: Fedora18 3.6.10-4.fc18.x86_64
  GCC: gcc version 4.7.2 20121109
  CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
  NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ [8086:10fb] (rev 01)

Detailed Testing information

Test Case - name: test_memcpy
Test Case - Description: Create two buffers and initialise one with random values. These values are copied to the second buffer, and the two are then compared to see if the copy was successful. The bytes outside the copied area are also checked to make sure they were not changed.
Test Case - test sample/application: test application in app/test
Test Case - command / instruction:
  # ./app/test/test -n 1 -c ffff
  #RTE>> memcpy_autotest
Test Case - expected:
  #RTE>> Test OK
Test Result: PASSED

Test Case - name: test_memcpy_perf
Test Case - Description: Measure memory copy performance over a number of different sizes and cached/uncached permutations.
Test Case - test sample/application: test application in app/test
Test Case - command / instruction:
  # ./app/test/test -n 1 -c ffff
  #RTE>> memcpy_perf_autotest
Test Case - expected:
  #RTE>> Test OK
Test Result: PASSED

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Zhihong Wang
Sent: Thursday, January 29, 2015 10:39
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH v2 0/4] DPDK memcpy optimization

This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
It also extends memcpy test coverage with unaligned cases and more test points.

Optimization techniques are summarized below:

1. Utilize full cache bandwidth

2. Enforce aligned stores

3. Apply load address alignment based on architecture features

4. Make load/store address available as early as possible

5. General optimization techniques like inlining, branch reducing, prefetch pattern access

--------------
Changes in v2:

1. Reduced constant test cases in app/test/test_memcpy_perf.c for fast build

2. Modified macro definition for better code readability & safety

Zhihong Wang (4):
  app/test: Disabled VTA for memcpy test in app/test/Makefile
  app/test: Removed unnecessary test cases in app/test/test_memcpy.c
  app/test: Extended test coverage in app/test/test_memcpy_perf.c
  lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE
    and AVX platforms

 app/test/Makefile                                  |   6 +
 app/test/test_memcpy.c                             |  52 +-
 app/test/test_memcpy_perf.c                        | 220 ++++---
 .../common/include/arch/x86/rte_memcpy.h           | 680 +++++++++++++++------
 4 files changed, 654 insertions(+), 304 deletions(-)

-- 
1.9.3
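
For readers who want to see what the test_memcpy check described in the report amounts to, here is a minimal sketch in plain C. It is not the code in app/test/test_memcpy.c. The names check_copy and run_copy_checks, the buffer size and the guard pattern are all assumptions made up for illustration, and the standard memcpy stands in for rte_memcpy so that the snippet builds without DPDK headers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 1024 /* hypothetical buffer size, not taken from the patch */
#define FILL     0xA5 /* hypothetical guard pattern for untouched bytes */

/*
 * Copy len bytes from src + off_src to dst + off_dst, then verify that
 * the copied bytes match and that every byte outside the copied area
 * still holds the guard pattern. Returns 0 on success, -1 on failure.
 */
int check_copy(size_t off_dst, size_t off_src, size_t len)
{
    static uint8_t src[BUF_SIZE];
    static uint8_t dst[BUF_SIZE];
    size_t i;

    /* Reject requests that would run past either buffer. */
    if (off_dst > BUF_SIZE || len > BUF_SIZE - off_dst)
        return -1;
    if (off_src > BUF_SIZE || len > BUF_SIZE - off_src)
        return -1;

    /* Source gets random values, destination gets the guard pattern. */
    for (i = 0; i < BUF_SIZE; i++) {
        src[i] = (uint8_t)rand();
        dst[i] = FILL;
    }

    /* Stand-in for rte_memcpy (assumption: same argument order). */
    memcpy(dst + off_dst, src + off_src, len);

    /* Bytes before the copied area must be unchanged. */
    for (i = 0; i < off_dst; i++)
        if (dst[i] != FILL)
            return -1;

    /* The copied area must match the source. */
    if (memcmp(dst + off_dst, src + off_src, len) != 0)
        return -1;

    /* Bytes after the copied area must be unchanged. */
    for (i = off_dst + len; i < BUF_SIZE; i++)
        if (dst[i] != FILL)
            return -1;

    return 0;
}

/* Example driver: one aligned and one unaligned offset pair. */
void run_copy_checks(void)
{
    assert(check_copy(0, 0, 64) == 0);
    assert(check_copy(3, 5, 64) == 0);
}
```

Calling run_copy_checks() from a test harness exercises both cases. Varying the two offsets independently is what covers the unaligned source and destination combinations that the cover letter mentions.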