On 11/22/2017 11:10 AM, Richard Sandiford wrote:
> Richard Sandiford <richard.sandif...@linaro.org> writes:
>> Two things stopped us using SLP reductions with variable-length vectors:
>>
>> (1) We didn't have a way of constructing the initial vector.
>>     This patch does it by creating a vector full of the neutral
>>     identity value and then using a shift-and-insert function
>>     to insert any non-identity inputs into the low-numbered elements.
>>     (The non-identity values are needed for double reductions.)
>>     Alternatively, for unchained MIN/MAX reductions that have no neutral
>>     value, we instead use the same duplicate-and-interleave approach as
>>     for SLP constant and external definitions (added by a previous
>>     patch).
>>
>> (2) The epilogue for constant-length vectors would extract the vector
>>     elements associated with each SLP statement and do scalar arithmetic
>>     on these individual elements.  For variable-length vectors, the patch
>>     instead creates a reduction vector for each SLP statement, replacing
>>     the elements for other SLP statements with the identity value.
>>     It then uses a hardware reduction instruction on each vector.
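
To make points (1) and (2) concrete, here is a minimal scalar sketch in plain C. It is not code from the patch. The helper names (shl_insert, build_initial_vector, reduce_add_for_stmt) are hypothetical, the vector is modelled as an ordinary int array, and the reduction is assumed to be an integer sum whose neutral value is 0. It also assumes that element i belongs to SLP statement i % group_size, which relies on the vector length being a multiple of the group size.

```c
#include <stddef.h>

/* Scalar model of one shift-and-insert step: move every element one
   position away from element 0 (dropping the last element) and place
   X in element 0.  */
static void shl_insert(int *v, size_t n, int x)
{
    if (n == 0)
        return;
    for (size_t i = n - 1; i > 0; i--)
        v[i] = v[i - 1];
    v[0] = x;
}

/* Point (1): fill the vector with the neutral value, then insert the
   GROUP_SIZE initial values in reverse order so that init[0] ends up
   in element 0, init[1] in element 1, and so on.  */
static void build_initial_vector(int *v, size_t n, int neutral,
                                 const int *init, size_t group_size)
{
    for (size_t i = 0; i < n; i++)
        v[i] = neutral;
    for (size_t i = group_size; i > 0; i--)
        shl_insert(v, n, init[i - 1]);
}

/* Point (2): for SLP statement STMT, replace the elements that belong
   to other statements with the neutral value (0 for addition) and
   reduce the whole vector.  The loop stands in for a single hardware
   reduction instruction.  */
static int reduce_add_for_stmt(const int *v, size_t n,
                               size_t group_size, size_t stmt)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        int elt = (i % group_size == stmt) ? v[i] : 0;
        sum += elt;
    }
    return sum;
}
```

With a group size of 2, calling reduce_add_for_stmt once with stmt 0 and once with stmt 1 yields the two scalar results without extracting individual elements, which is what makes the approach usable when the element count is not known at compile time.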
>>
>> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
>> and powerpc64le-linux-gnu.
> 
> Here's an updated version that applies on top of the recent
> removal of REDUC_*_EXPR.  Tested as before.
> 
> Thanks,
> Richard
> 
> 
> 2017-11-22  Richard Sandiford  <richard.sandif...@linaro.org>
>           Alan Hayward  <alan.hayw...@arm.com>
>           David Sherwood  <david.sherw...@arm.com>
> 
> gcc/
>       * doc/md.texi (vec_shl_insert_@var{m}): New optab.
>       * internal-fn.def (VEC_SHL_INSERT): New internal function.
>       * optabs.def (vec_shl_insert_optab): New optab.
>       * tree-vectorizer.h (can_duplicate_and_interleave_p): Declare.
>       (duplicate_and_interleave): Likewise.
>       * tree-vect-loop.c: Include internal-fn.h.
>       (neutral_op_for_slp_reduction): New function, split out from
>       get_initial_defs_for_reduction.
>       (get_initial_def_for_reduction): Handle option 2 for variable-length
>       vectors by loading the neutral value into a vector and then shifting
>       the initial value into element 0.
>       (get_initial_defs_for_reduction): Replace the code argument with
>       the neutral value calculated by neutral_op_for_slp_reduction.
>       Use gimple_build_vector for constant-length vectors.
>       Use IFN_VEC_SHL_INSERT for variable-length vectors if all
>       but the first group_size elements have a neutral value.
>       Use duplicate_and_interleave otherwise.
>       (vect_create_epilog_for_reduction): Take a neutral_op parameter.
>       Update call to get_initial_defs_for_reduction.  Handle SLP
>       reductions for variable-length vectors by creating one vector
>       result for each scalar result, with the elements associated
>       with other scalar results stubbed out with the neutral value.
>       (vectorizable_reduction): Call neutral_op_for_slp_reduction.
>       Require IFN_VEC_SHL_INSERT for double reductions on
>       variable-length vectors, or SLP reductions that have
>       a neutral value.  Require can_duplicate_and_interleave_p
>       support for variable-length unchained SLP reductions if there
>       is no neutral value, such as for MIN/MAX reductions.  Also require
>       the number of vector elements to be a multiple of the number of
>       SLP statements when doing variable-length unchained SLP reductions.
>       Update call to vect_create_epilog_for_reduction.
>       * tree-vect-slp.c (can_duplicate_and_interleave_p): Make public
>       and remove initial values.
>       (duplicate_and_interleave): Use IFN_VEC_SHL_INSERT for
>       variable-length vectors if all but the first group_size elements
>       have a neutral value.
>       * config/aarch64/aarch64.md (UNSPEC_INSR): New unspec.
>       * config/aarch64/aarch64-sve.md (vec_shl_insert_<mode>): New insn.
> 
> gcc/testsuite/
>       * gcc.dg/vect/pr37027.c: Remove XFAIL for variable-length vectors.
>       * gcc.dg/vect/pr67790.c: Likewise.
>       * gcc.dg/vect/slp-reduc-1.c: Likewise.
>       * gcc.dg/vect/slp-reduc-2.c: Likewise.
>       * gcc.dg/vect/slp-reduc-3.c: Likewise.
>       * gcc.dg/vect/slp-reduc-5.c: Likewise.
>       * gcc.target/aarch64/sve_slp_5.c: New test.
>       * gcc.target/aarch64/sve_slp_5_run.c: Likewise.
>       * gcc.target/aarch64/sve_slp_6.c: Likewise.
>       * gcc.target/aarch64/sve_slp_6_run.c: Likewise.
>       * gcc.target/aarch64/sve_slp_7.c: Likewise.
>       * gcc.target/aarch64/sve_slp_7_run.c: Likewise.
OK
jeff
