Hi all,
This is V2 of a patch which adds new insns to optimise vector concatenations
whose result is then narrowed by a truncation. It covers integer as well as
floating-point vectors.
The aforementioned operation usually results in codegen such as...
	uzp1	v0.2d, v0.2d, v1.2d
	xtn	v0.2s, v0.2d
	ret
... whereas the following would have sufficed without the need for XTN:
	uzp1	v0.2s, v0.2s, v1.2s
	ret
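To illustrate why the narrower arrangement is enough in the integer case, here
is a minimal scalar model in C. It is only a sketch and not code from the
patch: the vec2s type and all function names are hypothetical, each source
register is modelled by its low 64 bits only, and lanes are numbered from the
least significant end. It compares UZP1 on .2d followed by XTN against a
single UZP1 on .2s.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit result register holding two 32-bit lanes. */
typedef struct { uint32_t lane[2]; } vec2s;

/* Models "uzp1 v0.2d, v0.2d, v1.2d" followed by "xtn v0.2s, v0.2d":
   concatenate lane 0 of each source, then truncate each 64-bit lane. */
static vec2s uzp1_2d_then_xtn(uint64_t a_d0, uint64_t b_d0)
{
    uint64_t wide[2] = { a_d0, b_d0 };
    vec2s r = { { (uint32_t) wide[0], (uint32_t) wide[1] } };
    return r;
}

/* Models "uzp1 v0.2s, v0.2s, v1.2s": view each source as two 32-bit
   lanes, concatenate them, and keep the even-numbered lanes. */
static vec2s uzp1_2s(uint64_t a_d0, uint64_t b_d0)
{
    uint32_t a[2] = { (uint32_t) a_d0, (uint32_t) (a_d0 >> 32) };
    uint32_t b[2] = { (uint32_t) b_d0, (uint32_t) (b_d0 >> 32) };
    uint32_t cat[4] = { a[0], a[1], b[0], b[1] };
    vec2s r = { { cat[0], cat[2] } };
    return r;
}

/* Asserts that both sequences produce the same lanes for the given inputs. */
static void check_equivalence(uint64_t a_d0, uint64_t b_d0)
{
    vec2s two_insns = uzp1_2d_then_xtn(a_d0, b_d0);
    vec2s one_insn = uzp1_2s(a_d0, b_d0);
    assert(two_insns.lane[0] == one_insn.lane[0]);
    assert(two_insns.lane[1] == one_insn.lane[1]);
}
```

Calling check_equivalence (0x1111111122222222, 0x3333333344444444) passes
because both paths keep only the low 32 bits of each 64-bit input lane. The
model says nothing about the floating-point modes or about the patterns
themselves.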
A more rigorous example is provided in the commit message. The main changes from
V1 -> V2 are the removal of incorrect modes for UZP1, and adding a test for each
mode affected by the new insns. Furthermore, support for floating-point is
added, having accidentally been omitted from V1.
Best wishes,
Akram
---
Akram Ahmad (1):
  aarch64: remove extra XTN in vector concatenation

 gcc/config/aarch64/aarch64-simd.md            | 32 +++++++++++++
 gcc/config/aarch64/iterators.md               | 11 +++++
 .../aarch64/sve/truncated_concatenation_1.c   | 46 +++++++++++++++++++
 3 files changed, 89 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
--
2.34.1