Hi all,

We can implement the halving-narrowing add/sub patterns with standard RTL codes
rather than relying on unspecs.
This patch handles the low-part patterns; the second patch handles the high-part
ones and removes the unspecs themselves.
For example, ADDHN on V4SI is represented as
(truncate:V4HI ((src1:V4SI + src2:V4SI) >> 16))
and RADDHN as
(truncate:V4HI ((src1:V4SI + src2:V4SI + (1 << 15)) >> 16)).
I took this opportunity to specify the patterns as returning the narrow mode and
annotated them with the <vczle><vczbe> define_subst rules so that we get the
vec_concat-zero meta-patterns too. This also lets us simplify the expanders
somewhat. Tests are added to check that the combinations work.
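
For reference, here is a scalar C sketch of the per-lane arithmetic that the
ADDHN and RADDHN representations above describe for 32-bit source lanes. It is
only an illustration, not code from the patch: the function names are made up,
and it assumes the lane addition wraps modulo 2^32, as unsigned arithmetic does
in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of ADDHN on V4SI:
   add two 32-bit lanes (wrapping) and keep the high 16 bits.  */
uint16_t
addhn_lane (uint32_t a, uint32_t b)
{
  return (uint16_t) ((a + b) >> 16);
}

/* Hypothetical scalar model of one lane of RADDHN on V4SI:
   as above, but add the rounding constant 1 << 15 before shifting.  */
uint16_t
raddhn_lane (uint32_t a, uint32_t b)
{
  return (uint16_t) ((a + b + (UINT32_C (1) << 15)) >> 16);
}

/* Small self-check of the two models on a value whose low half is
   exactly 0x8000, where the rounding makes a visible difference.  */
void
addhn_selfcheck (void)
{
  assert (addhn_lane (UINT32_C (0x00018000), 0) == 0x0001);
  assert (raddhn_lane (UINT32_C (0x00018000), 0) == 0x0002);
}
```

The two models differ only in the 1 << 15 addend, which is the constant that
appears in the RADDHN representation.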

Bootstrapped and tested on aarch64-none-linux-gnu.
Also tested on aarch64_be-none-elf.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

        * config/aarch64/aarch64-simd.md
        (aarch64_<sur><addsub>hn<mode>_insn_le): Delete.
        (aarch64_<optab>hn<mode>_insn<vczle><vczbe>): New define_insn.
        (aarch64_<sur><addsub>hn<mode>_insn_be): Delete.
        (aarch64_r<optab>hn<mode>_insn<vczle><vczbe>): New define_insn.
        (aarch64_<sur><addsub>hn<mode>): Delete.
        (aarch64_<optab>hn<mode>): New define_expand.
        (aarch64_r<optab>hn<mode>): Likewise.
        * config/aarch64/predicates.md (aarch64_simd_raddsubhn_imm_vec):
        New predicate.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/simd/pr99195_4.c: New test.

Attachment: addhn.patch