| Issue | 120453 |
| --- | --- |
| Summary | Insufficient Optimization of memcpy-Like Loop with Non-Null Pointers |
| Labels | new issue |
| Assignees | |
| Reporter | jonathan-gruber-jg |

Given a `memcpy`-like loop whose pointer operands are known to be non-null, Clang does not optimize the loop into just an unconditional branch to `memcpy`.
Minimal test case:
```
#include <stddef.h>

void my_void_memcpy(void *restrict dst, const void *restrict src, size_t n) {
    /* unreachable() is the C23 macro declared in <stddef.h>. */
    if (!src || !dst) {
        unreachable();
    }
    /* At this point, we know that src and dst are both non-null. */
    for (size_t i = 0; i < n; ++i) {
        *((char *)dst + i) = *((char *)src + i);
    }
}
```
I only tested the target architectures x86_64, aarch64, and riscv64, but I would not be surprised if other targets have the same problem.
Host system: Arch Linux, x86_64.
Clang version: official Arch Linux package of clang, version 18.1.8-4.
Command line to reproduce results: `clang -c test.c -O<opt-level>`.
x86_64 assembly (Intel syntax), with -Os, -O2, or -O3:
```
my_void_memcpy:
test rdx, rdx
jne memcpy
ret
```
aarch64 assembly, with -Os, -O2, or -O3:
```
my_void_memcpy:
cbz x2, .LBB0_2
b memcpy
.LBB0_2:
ret
```
riscv64 assembly, with -Os, -O2, or -O3:
```
my_void_memcpy:
beqz a2, .LBB0_2
tail memcpy
.LBB0_2:
ret
```
Each of the tested targets checks whether `n` is `0` and, if so, executes a return instruction; otherwise, it branches to `memcpy`. However, `memcpy` already handles the case where `n` is `0`, so, because both `src` and `dst` are known to be non-null, `my_void_memcpy` could be further optimized into just an unconditional branch to `memcpy`.
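To make the requested transformation concrete, here is a minimal sketch in C that puts the loop form next to the direct call it could be lowered to, and compares what they copy. The names `loop_copy`, `direct_copy`, and `copies_agree` are hypothetical and not part of the report. The sketch also uses `__builtin_unreachable()` (a GCC/Clang builtin) instead of `unreachable()` so that it builds without C23. It only illustrates the source-level equivalence, not how Clang would perform the optimization.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Loop form from the report. __builtin_unreachable() is an assumption of this
   sketch, standing in for the C23 unreachable() macro. */
static void loop_copy(void *restrict dst, const void *restrict src, size_t n) {
    if (!src || !dst) {
        __builtin_unreachable();
    }
    /* Both pointers are known to be non-null from here on. */
    for (size_t i = 0; i < n; ++i) {
        *((char *)dst + i) = *((const char *)src + i);
    }
}

/* Hypothetical name: the unconditional call the report asks for. */
static void direct_copy(void *restrict dst, const void *restrict src, size_t n) {
    memcpy(dst, src, n);
}

/* Returns 1 if both forms leave identical bytes in their destinations. */
static int copies_agree(size_t n) {
    const char src[8] = "abcdefg";
    char a[8] = "-------";
    char b[8] = "-------";
    assert(n <= sizeof src);
    loop_copy(a, src, n);
    direct_copy(b, src, n);
    return memcmp(a, b, sizeof a) == 0;
}
```

With non-null pointers, `copies_agree(0)` exercises the case the extra branch guards against: the direct `memcpy` call with `n == 0` simply copies nothing, matching the loop.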