Calls to rte_memcpy_aligned could result in unaligned loads/stores for
1 < n < 16. This is undefined behavior according to the C standard, and
it is flagged by Clang's UndefinedBehaviorSanitizer (UBSan).

rte_memcpy_aligned is called with aligned src and dst addresses. When n
is odd, the code would copy a single byte first, increment src/dst,
then, depending on the value of n, would cast src/dst to a qword, dword
or word pointer. This results in an unaligned load/store. Reversing the
order of the casts and copies (i.e. copying a qword first, dword second,
etc.) fixes the issue.

Fixes: d35cc1fe6a7a ("eal/x86: revert select optimized memcpy at run-time")
Cc: Xiaoyun Li <xiaoyun...@intel.com>
Cc: sta...@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.w...@gmail.com>
---
 lib/eal/x86/include/rte_memcpy.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h
index 1b6c6e585f..a4eb1316b6 100644
--- a/lib/eal/x86/include/rte_memcpy.h
+++ b/lib/eal/x86/include/rte_memcpy.h
@@ -818,25 +818,25 @@ rte_memcpy_aligned(void *dst, const void *src, size_t n)
 {
 	void *ret = dst;
 
-	/* Copy size <= 16 bytes */
+	/* Copy size < 16 bytes */
 	if (n < 16) {
-		if (n & 0x01) {
-			*(uint8_t *)dst = *(const uint8_t *)src;
-			src = (const uint8_t *)src + 1;
-			dst = (uint8_t *)dst + 1;
-		}
-		if (n & 0x02) {
-			*(uint16_t *)dst = *(const uint16_t *)src;
-			src = (const uint16_t *)src + 1;
-			dst = (uint16_t *)dst + 1;
+		if (n & 0x08) {
+			*(uint64_t *)dst = *(const uint64_t *)src;
+			src = (const uint64_t *)src + 1;
+			dst = (uint64_t *)dst + 1;
 		}
 		if (n & 0x04) {
 			*(uint32_t *)dst = *(const uint32_t *)src;
 			src = (const uint32_t *)src + 1;
 			dst = (uint32_t *)dst + 1;
 		}
-		if (n & 0x08)
-			*(uint64_t *)dst = *(const uint64_t *)src;
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			src = (const uint16_t *)src + 1;
+			dst = (uint16_t *)dst + 1;
+		}
+		if (n & 0x01)
+			*(uint8_t *)dst = *(const uint8_t *)src;
 		return ret;
 	}
 
-- 
2.25.1
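
For illustration only, the effect of the reordering can be checked by
tracking the byte offset at which each access width is applied. The
sketch below is not part of the patch: count_misaligned() is a
hypothetical helper, and it assumes the base src/dst pointers are at
least 8-byte aligned, so that an access is aligned exactly when its
offset is a multiple of its width.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only (not part of DPDK).
 *
 * Models the n < 16 branch of rte_memcpy_aligned and counts the accesses
 * whose offset from the (assumed 8-byte aligned) base pointer is not a
 * multiple of the access width.
 *
 * widths_ascending != 0: old order, 1, 2, 4, then 8 bytes.
 * widths_ascending == 0: new order, 8, 4, 2, then 1 byte.
 */
static unsigned
count_misaligned(size_t n, int widths_ascending)
{
	static const size_t asc[4] = { 1, 2, 4, 8 };
	static const size_t desc[4] = { 8, 4, 2, 1 };
	const size_t *w = widths_ascending ? asc : desc;
	size_t off = 0;		/* bytes copied so far */
	unsigned bad = 0;	/* misaligned accesses seen */
	int i;

	for (i = 0; i < 4; i++) {
		if (!(n & w[i]))
			continue;	/* this width is not used for this n */
		if (off % w[i] != 0)
			bad++;		/* access would be unaligned */
		off += w[i];
	}
	return bad;
}
```

Under that assumption, count_misaligned(3, 1) returns 1 (a word access
at offset 1) and count_misaligned(15, 1) returns 3, while the new order
returns 0 for every n below 16, because each offset is then a sum of
larger powers of two. The old order is also affected for some even
sizes: count_misaligned(6, 1) returns 1, since the dword access lands at
offset 2.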