Hi all,

This patch adds a new insn which optimises vector concatenations on SIMD/FP
registers when a narrowing truncation is performed on the resulting vector.
Without this patch, such a sequence usually results in codegen such as...

        uzp1    v0.2d, v0.2d, v1.2d
        xtn     v0.2s, v0.2d
        ret

... whereas the following would have sufficed without the need for XTN:

        uzp1    v0.2s, v0.2s, v1.2s
        ret
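
For context, here is a minimal C sketch (using Neon intrinsics; this is
not the testcase added by the patch, and the function name is purely
illustrative) of roughly the kind of concatenate-then-truncate source
being discussed:

    #include <arm_neon.h>

    /* Sketch only (not the testcase from the patch): concatenate the
       low 64-bit lanes of a and b, then narrow each lane to 32 bits.
       Depending on how this is lowered, it can end up as the
       UZP1 (.2d) + XTN pair shown above, where a single UZP1 (.2s)
       would do.  */
    int32x2_t
    concat_then_narrow (int64x2_t a, int64x2_t b)
    {
      int64x2_t lo = vcombine_s64 (vget_low_s64 (a), vget_low_s64 (b));
      return vmovn_s64 (lo);
    }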

A more rigorous example is provided in the commit message. This is a
fairly straightforward patch, although I would appreciate feedback on
whether the scope of the modes covered by the insn is appropriate. I
would also appreciate any suggestions for other test cases that should
be added for this optimisation.

Many thanks,

Akram

---

Akram Ahmad (1):
  aarch64: remove extra XTN in vector concatenation

 gcc/config/aarch64/aarch64-simd.md            | 16 ++++++++++++++
 gcc/config/aarch64/iterators.md               | 12 ++++++++++
 .../aarch64/sve/truncated_concatenation_1.c   | 22 +++++++++++++++++++
 3 files changed, 50 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c

-- 
2.34.1
