Issue 120453
Summary Insufficient Optimization of memcpy-Like Loop with Non-Null Pointers
Labels new issue
Assignees
Reporter jonathan-gruber-jg
Given a `memcpy`-like loop whose pointer operands are known to be non-null, Clang does not reduce the function to a single unconditional branch (tail call) to `memcpy`.

Minimal test case:
```
#include <stddef.h> /* size_t; in C23 mode, also the unreachable() macro */

void my_void_memcpy(void *restrict dst, const void *restrict src, size_t n) {
	if (!src || !dst) {
		unreachable();
	}

	/* At this point, we know that src and dst are both non-null. */

	for (size_t i = 0; i < n; ++i) {
		*((char *)dst + i) = *((const char *)src + i);
	}
}
```

I only tested the target architectures x86_64, aarch64, and riscv64, but I would not be surprised if other targets have the same problem.

Host system: Arch Linux, x86_64.

Clang version: official Arch Linux package of clang, version 18.1.8-4.

Command line to reproduce results: `clang -c test.c -O<opt-level>`.

x86_64 assembly (Intel syntax), with -Os, -O2, or -O3:
```
my_void_memcpy:
	test	rdx, rdx
	jne	memcpy
	ret
```

aarch64 assembly, with -Os, -O2, or -O3:
```
my_void_memcpy:
	cbz	x2, .LBB0_2
	b	memcpy
.LBB0_2:
	ret
```

riscv64 assembly, with -Os, -O2, or -O3:
```
my_void_memcpy:
	beqz	a2, .LBB0_2
	tail	memcpy
.LBB0_2:
	ret
```

On each of the tested targets, the generated code checks whether `n` is `0` and, if so, returns immediately; otherwise, it branches to `memcpy`. However, `memcpy` already handles `n == 0` correctly: with valid, non-null pointers, a zero-length call is well-defined and copies nothing. Because both `src` and `dst` are known to be non-null, the zero check is redundant, and `my_void_memcpy` could be further optimized into just an unconditional branch to `memcpy`.
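
The equivalence this argument relies on can be sketched in C. This is only an illustration, not part of the original test case: the helper names `copy_loop`, `copy_direct`, and `copies_agree` are hypothetical, and the sketch assumes both pointers are valid and non-null, as in the report.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the byte-copy loop from the test case. */
void copy_loop(void *restrict dst, const void *restrict src, size_t n) {
	for (size_t i = 0; i < n; ++i) {
		*((char *)dst + i) = *((const char *)src + i);
	}
}

/* Hypothetical helper: what the requested optimization would amount to,
   an unconditional call to memcpy with no check of n. */
void copy_direct(void *restrict dst, const void *restrict src, size_t n) {
	memcpy(dst, src, n);
}

/* Returns 1 if both helpers leave identical destination buffers for the
   given n. Assumes n <= 8, the size of the buffers used here. */
int copies_agree(size_t n) {
	const char src[8] = "abcdefg"; /* 7 characters plus the terminating NUL */
	char a[8], b[8];

	/* Fill both destinations with a sentinel so untouched bytes are visible. */
	memset(a, '#', sizeof a);
	memset(b, '#', sizeof b);

	copy_loop(a, src, n);
	copy_direct(b, src, n);

	return memcmp(a, b, sizeof a) == 0;
}
```

For `n == 0`, neither helper writes to its destination, so the two buffers still match; this is the case that the emitted `test`/`cbz`/`beqz` guard protects against unnecessarily.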