Hi all,

This patch reimplements the MD patterns for the UHADD, SHADD, UHSUB, SHSUB,
URHADD and SRHADD instructions using standard RTL operations rather than
unspecs. The correct RTL representation involves widening the inputs before
adding (or subtracting) them and halving, followed by a truncation back to the
original mode.
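
To illustrate why the widening matters, here is a minimal scalar sketch in C of
the add variants on 8-bit elements. It is not code from the patch: the function
names are hypothetical, and the signed versions assume that >> on a negative
value is an arithmetic shift (true for GCC, but implementation-defined in ISO C).

```c
#include <stdint.h>

/* Unsigned halving add: floor((a + b) / 2).  The sum is formed in a
   wider type so that the carry out of bit 7 is not lost.  */
static inline uint8_t uhadd_u8(uint8_t a, uint8_t b)
{
    uint16_t wide = (uint16_t)((uint16_t)a + (uint16_t)b); /* zero-extend, add */
    return (uint8_t)(wide >> 1);                           /* halve, truncate */
}

/* Unsigned rounding halving add: floor((a + b + 1) / 2).  */
static inline uint8_t urhadd_u8(uint8_t a, uint8_t b)
{
    uint16_t wide = (uint16_t)((uint16_t)a + (uint16_t)b + 1u);
    return (uint8_t)(wide >> 1);
}

/* Signed halving add.  Assumes >> on a negative value is an arithmetic
   shift, i.e. it rounds towards negative infinity.  */
static inline int8_t shadd_s8(int8_t a, int8_t b)
{
    int16_t wide = (int16_t)((int16_t)a + (int16_t)b);     /* sign-extend, add */
    return (int8_t)(wide >> 1);                            /* halve, truncate */
}

/* Signed rounding halving add, under the same shift assumption.  */
static inline int8_t srhadd_s8(int8_t a, int8_t b)
{
    int16_t wide = (int16_t)((int16_t)a + (int16_t)b + 1);
    return (int8_t)(wide >> 1);
}
```

Without the widening step, an input pair such as 255 and 255 would wrap around
in the 8-bit addition before the halving could take place.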
An unfortunate wart in the patch is that we end up having very similar
expanders for the intrinsics through the aarch64_<su>h<ADDSUB:optab><mode> and
aarch64_<su>rhadd<mode> names and the standard names for the vector averaging
optabs <su>avg<mode>3_floor and <su>avg<mode>3_ceil.
I'd like to reuse <su>avg<mode>3_ceil for the intrinsics builtin as well, but
our scheme in aarch64-simd-builtins.def and aarch64-builtins.cc makes it
awkward by only allowing mappings of entries in aarch64-simd-builtins.def to:
   0 - CODE_FOR_aarch64_<name><mode>
   1-9 - CODE_FOR_<name><mode><1-9>
   10 - CODE_FOR_<name><mode>

whereas here we want a string after the <mode>, i.e. CODE_FOR_uavg<mode>3_ceil.
This patch adds a bit of remapping logic in aarch64-builtins.cc before the
construction of the builtin info that remaps the CODE_FOR_* definitions in
aarch64-simd-builtins.def to the optab-derived ones.
CODE_FOR_aarch64_srhaddv4si gets remapped to CODE_FOR_avgv4si3_ceil, for
example.
It's a bit specific to this case, but this solution requires the least invasive
changes while avoiding having duplicate expanders just for the sake of a
different pattern name.
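
As a rough, self-contained illustration of the remapping idea (not the actual
code in aarch64-builtins.cc), the sketch below aliases the intrinsic-style
names to the optab-style ones with preprocessor defines. The enum, its values
and the BUILTIN_CODE macro are hypothetical stand-ins. Only the V4SI mode is
shown.

```c
/* Hypothetical stand-in for the generated insn codes; the real values
   come from GCC's generated headers.  */
enum insn_code_sketch
{
    CODE_FOR_avgv4si3_floor = 1,
    CODE_FOR_uavgv4si3_floor,
    CODE_FOR_avgv4si3_ceil,
    CODE_FOR_uavgv4si3_ceil
};

/* Remap the names the builtin table would ask for onto the standard
   optab-derived pattern names.  */
#define CODE_FOR_aarch64_shaddv4si  CODE_FOR_avgv4si3_floor
#define CODE_FOR_aarch64_uhaddv4si  CODE_FOR_uavgv4si3_floor
#define CODE_FOR_aarch64_srhaddv4si CODE_FOR_avgv4si3_ceil
#define CODE_FOR_aarch64_urhaddv4si CODE_FOR_uavgv4si3_ceil

/* Hypothetical equivalent of mapping 0 above:
   CODE_FOR_aarch64_<name><mode>.  The pasted token is rescanned by the
   preprocessor, so the defines above take effect.  */
#define BUILTIN_CODE(name, mode) CODE_FOR_aarch64_##name##mode

/* Expands to CODE_FOR_avgv4si3_ceil.  */
static const int srhadd_v4si_code = BUILTIN_CODE(srhadd, v4si);
```

The defines have to be visible before the table entries are expanded, which is
presumably why the remapping sits ahead of the construction of the builtin info.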

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

        * config/aarch64/aarch64-builtins.cc (VAR1): Move to after inclusion of
        aarch64-builtin-iterators.h.  Add definition to remap shadd, uhadd,
        srhadd, urhadd builtin codes for standard optab ones.
        * config/aarch64/aarch64-simd.md (<u>avg<mode>3_floor): Rename to...
        (<su_optab>avg<mode>3_floor): ... This.  Expand to RTL codes rather than
        unspec.
        (<u>avg<mode>3_ceil): Rename to...
        (<su_optab>avg<mode>3_ceil): ... This.  Expand to RTL codes rather than
        unspec.
        (aarch64_<su>hsub<mode>): New define_expand.
        (aarch64_<sur>h<addsub><mode><vczle><vczbe>): Split into...
        (*aarch64_<su>h<ADDSUB:optab><mode><vczle><vczbe>_insn): ... This...
        (*aarch64_<su>rhadd<mode><vczle><vczbe>_insn): ... And this.

Attachment: vrhadd.patch