On Thu, 23 Apr 2020 at 16:48, Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> This PR was caused by mismatched expectations between
> vectorizable_comparison and SLP.  We had a "<" comparison
> between two booleans that were leaves of the SLP tree, so
> vectorizable_comparison fell back on:
>
>   /* Invariant comparison.  */
>   if (!vectype)
>     {
>       vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
>                                              slp_node);
>       if (maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
>         return false;
>     }
>
> rhs1 and rhs2 were *unsigned* boolean types, so we got back a vector
> of unsigned integers.  This in itself was OK, and meant that "<"
> worked as expected without the need for the boolean fix-ups:
>
>   /* Boolean values may have another representation in vectors
>      and therefore we prefer bit operations over comparison for
>      them (which also works for scalar masks).  We store opcodes
>      to use in bitop1 and bitop2.  Statement is vectorized as
>        BITOP2 (rhs1 BITOP1 rhs2) or
>        rhs1 BITOP2 (BITOP1 rhs2)
>      depending on bitop1 and bitop2 arity.  */
>   bool swap_p = false;
>   if (VECTOR_BOOLEAN_TYPE_P (vectype))
>     {
>
> However, vectorizable_comparison then used vect_get_slp_defs to get
> the actual operands.  The request went to vect_get_constant_vectors,
> which also has logic to calculate the vector type.  The problem was
> that this type was different from the one chosen above:
>
>   if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op))
>       && vect_mask_constant_operand_p (stmt_vinfo))
>     vector_type = truth_type_for (stmt_vectype);
>   else
>     vector_type = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op), 
> op_node);
>
> So the function gave back a vector of mask types, which here are vectors
> of *signed* booleans.  This meant that "<" gave:
>
>   true (-1) < false (0)
>
> and so the boolean fixup above was needed after all.
>
> Fixed by making vectorizable_comparison also pick a mask type in this case.
>
> Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu.
> Approved by Richard in the PR.
>
> Richard
>
>
> 2020-04-23  Richard Sandiford  <richard.sandif...@arm.com>
>
> gcc/
>         PR tree-optimization/94727
>         * tree-vect-stmts.c (vectorizable_comparison): Use mask_type when
>         comparing invariant scalar booleans.
>
> gcc/testsuite/
>         PR tree-optimization/94727
>         * gcc.dg/vect/pr94727.c: New test.

Hi Richard,

The new testcase fails at execution on arm (and s390 according to
gcc-testresults).

Can you check?

Thanks
Christophe


> ---
>  gcc/testsuite/gcc.dg/vect/pr94727.c | 24 ++++++++++++++++++++++++
>  gcc/tree-vect-stmts.c               |  7 +++++--
>  2 files changed, 29 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/pr94727.c
>
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 7f3a9fb5fb3..88a1e2c51d2 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -10566,8 +10566,11 @@ vectorizable_comparison (stmt_vec_info stmt_info, 
> gimple_stmt_iterator *gsi,
>    /* Invariant comparison.  */
>    if (!vectype)
>      {
> -      vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
> -                                            slp_node);
> +      if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs1)))
> +       vectype = mask_type;
> +      else
> +       vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
> +                                              slp_node);
>        if (!vectype || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
>         return false;
>      }
> diff --git a/gcc/testsuite/gcc.dg/vect/pr94727.c 
> b/gcc/testsuite/gcc.dg/vect/pr94727.c
> new file mode 100644
> index 00000000000..38408711345
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr94727.c
> @@ -0,0 +1,24 @@
> +/* { dg-additional-options "-O3" } */
> +
> +unsigned char a[16][32];
> +long b[16][32];
> +unsigned long c;
> +_Bool d;
> +
> +void __attribute__((noipa))
> +foo (void)
> +{
> +  for (int j = 0; j < 8; j++)
> +    for (int i = 0; i < 17; ++i)
> +      b[j][i] = (a[j][i] < c) > d;
> +}
> +
> +int
> +main (void)
> +{
> +  c = 1;
> +  foo ();
> +  if (!b[0][0])
> +    __builtin_abort ();
> +  return 0;
> +}

Reply via email to