Hi all,

The final patch in the series tackles the most complex of this family of
patterns, UABAL2 and SABAL2.  These extract the high parts of the sources,
perform an absolute difference (absdiff) on them, widen the result and
accumulate it.
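For reference, here is a rough scalar model of what UABAL2 computes per lane
(my own illustration, not code from the patch):

#include <stdint.h>

/* Take the high halves of two 16-lane byte vectors, form the per-lane
   absolute difference, widen it to 16 bits and add it into the
   accumulator -- roughly what UABAL2 Vd.8H, Vn.16B, Vm.16B does.  */
void
uabal2_model (uint16_t acc[8], const uint8_t a[16], const uint8_t b[16])
{
  for (int i = 0; i < 8; i++)
    {
      uint8_t x = a[8 + i], y = b[8 + i];
      acc[i] += (uint16_t) (x > y ? x - y : y - x);
    }
}
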
The motivating testcase for this patch (series) is included.  The required
simplification doesn't actually trigger with just the RTL pattern change
because the rtx costs block it.  So this patch also extends the rtx costs to
recognise the (minus (smax (x, y)) (smin (x, y))) expression we use to
describe absdiff in the backend and to avoid recursing into its arms.
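To illustrate the identity that representation relies on (again my own sketch,
not code from the patch): for each element, max (x, y) - min (x, y) is exactly
the absolute difference |x - y|, which is what lets the patterns use plain RTL
codes rather than an unspec:

#include <stdint.h>

/* max (x, y) - min (x, y) == |x - y|.  Computed in int so the
   subtraction cannot overflow, then truncated to the element width,
   just as the vector operation truncates per lane.  */
uint8_t
abd_as_max_minus_min (int8_t x, int8_t y)
{
  int max = x > y ? x : y;
  int min = x > y ? y : x;
  return (uint8_t) (max - min);
}
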

These changes allow us to generate the single-instruction sequence expected here.
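As a hedged guess at the shape of the new test (the actual vabal_combine.c is
in the attached patch; the function name below is mine), the separate widening
absolute difference and add should now be folded into a single UABAL2:

#include <arm_neon.h>

/* vabdl_high_u8 takes the absolute difference of the high halves and
   widens it; adding that into the accumulator should now be combined
   into one uabal2 instruction instead of a uabdl2 + add pair.  */
uint16x8_t
test_uabal2 (uint16x8_t acc, uint8x16_t a, uint8x16_t b)
{
  return vaddq_u16 (acc, vabdl_high_u8 (a, b));
}
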
Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

        * config/aarch64/aarch64-simd.md (aarch64_<sur>abal2<mode>): Rename
        to...
        (aarch64_<su>abal2<mode>_insn): ... This.  Use RTL codes instead of
        unspec.
        (aarch64_<su>abal2<mode>): New define_expand.
        * config/aarch64/aarch64.cc (aarch64_abd_rtx_p): New function.
        (aarch64_rtx_costs): Handle ABD rtxes.
        * config/aarch64/aarch64.md (UNSPEC_SABAL2, UNSPEC_UABAL2): Delete.
        * config/aarch64/iterators.md (ABAL2): Delete.
        (sur): Remove handling of UNSPEC_UABAL2 and UNSPEC_SABAL2.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/simd/vabal_combine.c: New test.

Attachment: abal2.patch