On 11/17/2017 07:59 AM, Richard Sandiford wrote:
> This patch removes the restriction that fully-masked loops cannot
> have reductions.  The key thing here is to make sure that the
> reduction accumulator doesn't include any values associated with
> inactive lanes; the patch adds a bunch of conditional binary
> operations for doing that.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  <richard.sandif...@linaro.org>
>           Alan Hayward  <alan.hayw...@arm.com>
>           David Sherwood  <david.sherw...@arm.com>
> 
> gcc/
>       * doc/md.texi (cond_add@var{mode}, cond_sub@var{mode})
>       (cond_and@var{mode}, cond_ior@var{mode}, cond_xor@var{mode})
>       (cond_smin@var{mode}, cond_smax@var{mode}, cond_umin@var{mode})
>       (cond_umax@var{mode}): Document.
>       * optabs.def (cond_add_optab, cond_sub_optab, cond_and_optab)
>       (cond_ior_optab, cond_xor_optab, cond_smin_optab, cond_smax_optab)
>       (cond_umin_optab, cond_umax_optab): New optabs.
>       * internal-fn.def (COND_ADD, COND_SUB, COND_SMIN, COND_SMAX)
>       (COND_UMIN, COND_UMAX, COND_AND, COND_IOR, COND_XOR): New internal
>       functions.
>       * internal-fn.h (get_conditional_internal_fn): Declare.
>       * internal-fn.c (cond_binary_direct): New macro.
>       (expand_cond_binary_optab_fn): Likewise.
>       (direct_cond_binary_optab_supported_p): Likewise.
>       (get_conditional_internal_fn): New function.
>       * tree-vect-loop.c (vectorizable_reduction): Handle fully-masked loops.
>       Cope with reduction statements that are vectorized as calls rather
>       than assignments.
>       * config/aarch64/aarch64-sve.md (cond_<optab><mode>): New insns.
>       * config/aarch64/iterators.md (UNSPEC_COND_ADD, UNSPEC_COND_SUB)
>       (UNSPEC_COND_SMAX, UNSPEC_COND_UMAX, UNSPEC_COND_SMIN)
>       (UNSPEC_COND_UMIN, UNSPEC_COND_AND, UNSPEC_COND_ORR)
>       (UNSPEC_COND_EOR): New unspecs.
>       (optab): Add mappings for them.
>       (SVE_COND_INT_OP, SVE_COND_FP_OP): New int iterators.
>       (sve_int_op, sve_fp_op): New int attributes.
> 
> gcc/testsuite/
>       * gcc.dg/vect/pr60482.c: Remove XFAIL for variable-length vectors.
>       * gcc.target/aarch64/sve_reduc_1.c: Expect the loop operations
>       to be predicated.
>       * gcc.target/aarch64/sve_slp_5.c: Check for a fully-masked loop.
>       * gcc.target/aarch64/sve_slp_7.c: Likewise.
>       * gcc.target/aarch64/sve_reduc_5.c: New test.
>       * gcc.target/aarch64/sve_slp_13.c: Likewise.
>       * gcc.target/aarch64/sve_slp_13_run.c: Likewise.
I didn't walk through the aarch64-specific bits here.  The generic bits
are OK.

jeff
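
For readers following the thread, the idea in the quoted description can be
sketched in plain C.  This is only an illustration, not code from the patch:
the names cond_add, masked_sum and VL are hypothetical, the vector is emulated
with a fixed-size array, and it assumes that a conditional add leaves inactive
lanes of the accumulator unchanged.

```c
#include <stddef.h>

#define VL 4  /* hypothetical fixed vector length, for illustration only */

/* Emulated conditional add (assumed semantics): active lanes become
   acc + x, inactive lanes keep the old accumulator value.  */
static void cond_add(const int mask[VL], int acc[VL], const int x[VL])
{
    for (int i = 0; i < VL; i++)
        acc[i] = mask[i] ? acc[i] + x[i] : acc[i];
}

/* Sum n elements with a fully-masked loop, so no scalar epilogue is needed.  */
int masked_sum(const int *data, size_t n)
{
    int acc[VL] = {0};

    for (size_t base = 0; base < n; base += VL) {
        int mask[VL], x[VL];
        for (int i = 0; i < VL; i++) {
            /* A lane is active only while its index is in bounds.  */
            mask[i] = (base + i) < n;
            /* Inactive lanes get a junk value on purpose: it must never
               reach the accumulator.  */
            x[i] = mask[i] ? data[base + i] : -12345;
        }
        cond_add(mask, acc, x);
    }

    /* Final reduction across the lanes of the accumulator.  */
    int total = 0;
    for (int i = 0; i < VL; i++)
        total += acc[i];
    return total;
}
```

With an unconditional add in place of cond_add, the junk values in the
inactive lanes of the final iteration would leak into the result, which is
the problem the conditional operations are meant to avoid.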
