Fixed strict-aliasing rule violation errors reported by some GCC versions.
Signed-off-by: Zhihong Wang
---
.../common/include/arch/x86/rte_memcpy.h | 44 --
1 file changed, 24 insertions(+), 20 deletions(-)
diff --git a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
b
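For context, a strict-aliasing error of this kind usually comes from accessing a byte buffer through a pointer of an unrelated type. The sketch below is not taken from the patch; the helper name copy2_no_alias is hypothetical and only illustrates one common way to avoid the problem, by going through a fixed-size memcpy that compilers normally lower to a single load and store.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from rte_memcpy.h): copy exactly 2 bytes.
 *
 * Writing *(uint16_t *)dst = *(const uint16_t *)src; would access the
 * buffers through an incompatible type, which GCC may flag with
 * -Wstrict-aliasing. A fixed-size memcpy through a local variable has
 * well-defined behavior regardless of the buffers' declared type.
 */
static inline void copy2_no_alias(uint8_t *dst, const uint8_t *src)
{
	uint16_t tmp;

	memcpy(&tmp, src, sizeof(tmp));  /* load 2 bytes */
	memcpy(dst, &tmp, sizeof(tmp));  /* store 2 bytes */
}
```

Calling copy2_no_alias(out, in) on two byte arrays copies in[0] and in[1] and leaves the remaining bytes of out unchanged.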
Main code changes:
1. Differentiate architectural features based on CPU flags
a. Implement separate move functions for SSE/AVX/AVX2 to make full
use of cache bandwidth
b. Implement a separate copy flow specifically optimized for the target
architecture
2. Rewrite the memcpy functi
Main code changes:
1. Added more typical data points for a thorough performance test
2. Added unaligned test cases since unaligned copies are common in DPDK usage
Signed-off-by: Zhihong Wang
---
app/test/test_memcpy_perf.c | 238 +---
1 file changed, 156 insertions(+),
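As an illustration of what an unaligned test case can look like, the sketch below checks a copy routine for every combination of source and destination offset within a 16-byte window. It is not the code from test_memcpy_perf.c; the names check_unaligned, copy_fn_t, BUF_SIZE and MAX_OFFSET are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE   256  /* largest copy length exercised (assumed) */
#define MAX_OFFSET 16   /* offsets 0..15 cover one 16-byte window */
#define FILL_BYTE  0xA5 /* pattern used to detect stray writes */

typedef void *(*copy_fn_t)(void *dst, const void *src, size_t len);

/*
 * Return 0 if copy_fn copies len bytes correctly for every source and
 * destination offset in [0, MAX_OFFSET) without touching bytes outside
 * the destination range; return -1 otherwise.
 */
static int check_unaligned(copy_fn_t copy_fn, size_t len)
{
	static uint8_t src[BUF_SIZE + MAX_OFFSET];
	static uint8_t dst[BUF_SIZE + MAX_OFFSET];
	size_t s, d, i;

	if (len > BUF_SIZE)
		return -1;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (uint8_t)(i * 7 + 1);

	for (s = 0; s < MAX_OFFSET; s++) {
		for (d = 0; d < MAX_OFFSET; d++) {
			memset(dst, FILL_BYTE, sizeof(dst));
			copy_fn(dst + d, src + s, len);

			/* The copied range must match the source. */
			if (memcmp(dst + d, src + s, len) != 0)
				return -1;
			/* Bytes before and after the range must be intact. */
			for (i = 0; i < d; i++)
				if (dst[i] != FILL_BYTE)
					return -1;
			for (i = d + len; i < sizeof(dst); i++)
				if (dst[i] != FILL_BYTE)
					return -1;
		}
	}
	return 0;
}
```

For example, check_unaligned(memcpy, 64) exercises the libc memcpy with 256 offset combinations; any routine with the same signature can be passed in its place.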
Removed unnecessary test cases for base move functions since the function
"func_test" covers them all.
Signed-off-by: Zhihong Wang
---
app/test/test_memcpy.c | 52 +-
1 file changed, 1 insertion(+), 51 deletions(-)
diff --git a/app/test/test_memc
VTA is for debugging only; it increases compile time and binary size,
especially when there are a lot of inlines.
So disable it, since the memcpy test contains a lot of inline calls.
Signed-off-by: Zhihong Wang
---
app/test/Makefile | 6 ++
1 file changed, 6 insertions(+)
diff --git a/app/test/M
This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
It also extends memcpy test coverage with unaligned cases and more test points.
Optimization techniques are summarized below:
1. Utilize full cache bandwidth
2. Enforce aligned stores
3. Apply load address alignment based
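To make the "enforce aligned stores" idea concrete, here is a minimal sketch, assuming SSE2 and non-overlapping buffers. It is not the implementation from the patch set; the function name copy_aligned_stores and the 32-byte threshold are assumptions for illustration. One unaligned 16-byte store covers the head, after which the destination pointer is advanced to a 16-byte boundary so that every store in the main loop is aligned while loads stay unaligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper; falls back to memcpy when SSE2 is unavailable. */
static void *copy_aligned_stores(void *dst, const void *src, size_t n)
{
#if defined(__SSE2__)
	uint8_t *d = (uint8_t *)dst;
	const uint8_t *s = (const uint8_t *)src;
	size_t head;

	/* Small copies: not worth the alignment work in this sketch. */
	if (n < 32)
		return memcpy(dst, src, n);

	/* Head: one unaligned store covers everything up to the next
	 * 16-byte boundary of the destination. */
	_mm_storeu_si128((__m128i *)d,
			 _mm_loadu_si128((const __m128i *)s));
	head = 16 - ((uintptr_t)d & 15); /* 1..16 bytes */
	d += head;
	s += head;
	n -= head;

	/* Body: d is now 16-byte aligned, so stores can be aligned. */
	while (n >= 16) {
		_mm_store_si128((__m128i *)d,
				_mm_loadu_si128((const __m128i *)s));
		d += 16;
		s += 16;
		n -= 16;
	}

	/* Tail: fewer than 16 bytes remain. */
	memcpy(d, s, n);
	return dst;
#else
	return memcpy(dst, src, n);
#endif
}
```

The function has the same signature as memcpy, so it can be called as copy_aligned_stores(dst, src, len). The head store may overlap the first aligned store by up to 15 bytes; rewriting those bytes with the same data is harmless and avoids a branch on the destination offset.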