Hi all,

This patch adds a new insn which optimises vector concatenations on
SIMD/FP registers when a narrowing truncation is performed on the
resulting vector. This usually results in codegen such as:
    uzp1    v0.2d, v0.2d, v1.2d
    xtn     v0.2s, v0.2d
    ret

... whereas the following would have sufficed without the need for the
XTN:

    uzp1    v0.2s, v0.2s, v1.2s
    ret

A more rigorous example is provided in the commit message; a minimal
illustrative snippet is also sketched below the diffstat. This is a
fairly straightforward patch, although I would appreciate feedback on
whether the scope of the modes covered by the insn is appropriate.
Similarly, I would also appreciate any suggestions for other test cases
that should be added for this optimisation.

Many thanks,

Akram

---

Akram Ahmad (1):
  aarch64: remove extra XTN in vector concatenation

 gcc/config/aarch64/aarch64-simd.md                 | 16 ++++++++++++++
 gcc/config/aarch64/iterators.md                    | 12 ++++++++++
 .../aarch64/sve/truncated_concatenation_1.c        | 22 +++++++++++++++++++
 3 files changed, 50 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c

-- 
2.34.1
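
As a minimal, self-contained sketch (my own illustration, not the test
case shipped with the patch; the intrinsics used and the function name
are my own choice), a source pattern of this shape can currently be
compiled to the UZP1-on-.2d plus XTN pair shown above, where a single
UZP1 on .2s operands would do:

    #include <arm_neon.h>

    /* Concatenate two 64-bit vectors, then narrow each doubleword lane
       to a word.  Truncation keeps only the low 32 bits of each lane,
       so the same result can be formed by taking the low word of each
       input directly (uzp1 on .2s).  */
    int32x2_t
    concat_narrow (int64x1_t a, int64x1_t b)
    {
      return vmovn_s64 (vcombine_s64 (a, b));
    }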