This version contains only the changes that remove unnecessary typecasting. The remaining changes, i.e. loop unrolling, are backed out. Although loop unrolling makes sense as a trade of more space for less time, the code generated by GCC 4.8.2 with "gcc -O3 -mavx -s" and "gcc -O3 -m64 -s" for loops of 2, 4 and 8 iterations is the same, and "memcpy perf" from "make test" shows similar results with and without the loop. I will investigate this later.
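For reference, the kind of comparison described above can be sketched as follows. This is only an illustration, not code from rte_memcpy.h: the names copy16, copy64_loop, copy64_unrolled and copy64_forms_agree are hypothetical, and plain memcpy() stands in for the vector load/store helpers. It shows a fixed 4-iteration loop next to its hand-unrolled equivalent, plus a small check that both produce the same result.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy one 16-byte block.
 * memcpy() is a stand-in for a vector load/store pair. */
static inline void copy16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}

/* Loop form: four 16-byte blocks with a compile-time trip count. */
static void copy64_loop(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < 4; i++)
		copy16(dst + i * 16, src + i * 16);
}

/* Hand-unrolled form of the same copy. */
static void copy64_unrolled(uint8_t *dst, const uint8_t *src)
{
	copy16(dst + 0 * 16, src + 0 * 16);
	copy16(dst + 1 * 16, src + 1 * 16);
	copy16(dst + 2 * 16, src + 2 * 16);
	copy16(dst + 3 * 16, src + 3 * 16);
}

/* Returns 1 if both forms copy the source buffer identically. */
static int copy64_forms_agree(void)
{
	uint8_t src[64], a[64], b[64];

	for (size_t i = 0; i < 64; i++)
		src[i] = (uint8_t)i;
	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));

	copy64_loop(a, src);
	copy64_unrolled(b, src);

	return memcmp(a, b, sizeof(a)) == 0 && memcmp(a, src, sizeof(a)) == 0;
}
```

Compiling such a file with "gcc -O3 -mavx -S" and diffing the assembly of the two functions is one way to check whether the compiler already unrolls the short loop on a given toolchain.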
Ravi Kerur (1):
  Clean up rte_memcpy.h file

 .../common/include/arch/x86/rte_memcpy.h | 340 +++++++++++----------
 1 file changed, 175 insertions(+), 165 deletions(-)

--
1.9.1