Re: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

Richard Sandiford Fri, 06 Sep 2024 06:21:16 -0700

Tamar Christina <tamar.christ...@arm.com> writes:
> Hi All,
>
> This adds vector constant simplification for EQ and NE.  This is useful since
> the vectorizer generates a lot more vector compares now, in particular NE and 
> EQ
> and so these help us optimize cases where the values were not known at GIMPLE
> but instead only at RTL.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf,
> x86_64-pc-linux-gnu -m32, -m64 and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>       * simplify-rtx.cc (simplify_context::simplify_unary_operation): Try
>       simplifying operand.
>       (simplify_const_relational_operation): Simplify vector EQ and NE.
>       (test_vector_int_const_compare): New.
>       (test_vector_int_const_compare_ops): New.
>       (simplify_rtx_cc_tests): Use them.
>
> ---
>
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index 
> a20a61c5dddbc80b23a9489d925a2c31b2163458..7e83e80246b70c81c388e77967f645d171efe983
>  100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -886,6 +886,10 @@ simplify_context::simplify_unary_operation (rtx_code 
> code, machine_mode mode,
>  
>    trueop = avoid_constant_pool_reference (op);
>  
> +  /* If the operand is not a reg or constant try simplifying it first.  */
> +  if (rtx tmp_op = simplify_rtx (op))
> +    op = tmp_op;
> +


We shouldn't need to do this.  The assumption is that the operands are
already simplified.

Which caller required this?

>    tem = simplify_const_unary_operation (code, mode, trueop, op_mode);
>    if (tem)
>      return tem;
> @@ -6354,6 +6358,35 @@ simplify_const_relational_operation (enum rtx_code 
> code,
>       return 0;
>      }
>  
> +  /* Check if the operands are a vector EQ or NE comparison.  */
> +  if (VECTOR_MODE_P (mode)
> +      && INTEGRAL_MODE_P (mode)
> +      && GET_CODE (op0) == CONST_VECTOR
> +      && GET_CODE (op1) == CONST_VECTOR
> +      && (code == EQ || code == NE))
> +    {
> +      if (rtx_equal_p (op0, op1))
> +     return code == EQ ? const_true_rtx : const0_rtx;
> +
> +      unsigned int npatterns0, npatterns1;
> +      if (CONST_VECTOR_NUNITS (op0).is_constant (&npatterns0)
> +       && CONST_VECTOR_NUNITS (op1).is_constant (&npatterns1))
> +     {
> +       if (npatterns0 != npatterns1)
> +         return code == EQ ? const0_rtx : const_true_rtx;

This looks like a typing error.  The operands have to have the same
number of elements.  But...

> +
> +       for (unsigned i = 0; i < npatterns0; i++)
> +         {
> +           rtx val0 = CONST_VECTOR_ELT (op0, i);
> +           rtx val1 = CONST_VECTOR_ELT (op1, i);
> +           if (!rtx_equal_p (val0, val1))
> +             return code == EQ ? const0_rtx : const_true_rtx;
> +         }
> +
> +       return code == EQ ? const_true_rtx : const0_rtx;
> +     }

...when is this loop needed?  For constant-sized vectors, isn't the
result always rtx_equal_p for EQ and !rtx_equal_p for NE?  If we have
equal vectors for which rtx_equal_p returns false then that should be
fixed.

For variable-sized vectors, I suppose the question is whether the
first unequal element is found in the minimum vector length, or whether
it only occurs for larger lengths.  In the former case we can fold at
compile time, but in the latter case we can't.

So we probably do want the loop for variable-length vectors, up to
constant_lower_bound (CONST_VECTOR_NUNITS (...)).

> +    }
> +
>    /* We can't simplify MODE_CC values since we don't know what the
>       actual comparison is.  */
>    if (GET_MODE_CLASS (GET_MODE (op0)) == MODE_CC)
> @@ -8820,6 +8853,55 @@ test_vector_ops ()
>      }
>  }
>  
> +/* Verify vector constant comparisons for EQ and NE.  */
> +
> +static void
> +test_vector_int_const_compare (machine_mode mode)
> +{
> +  rtx zeros = CONST0_RTX (mode);
> +  rtx minusone = CONSTM1_RTX (mode);
> +  rtx series_0_1 = gen_const_vec_series (mode, const0_rtx, const1_rtx);
> +  ASSERT_RTX_EQ (const0_rtx,
> +              simplify_const_relational_operation (EQ, mode, zeros,
> +                                                   CONST1_RTX (mode)));
> +  ASSERT_RTX_EQ (const_true_rtx,
> +              simplify_const_relational_operation (EQ, mode, zeros,
> +                                                   CONST0_RTX (mode)));
> +  ASSERT_RTX_EQ (const_true_rtx,
> +              simplify_const_relational_operation (EQ, mode, minusone,
> +                                                   CONSTM1_RTX (mode)));
> +  ASSERT_RTX_EQ (const_true_rtx,
> +              simplify_const_relational_operation (NE, mode, zeros,
> +                                                   CONST1_RTX (mode)));
> +  ASSERT_RTX_EQ (const_true_rtx,
> +              simplify_const_relational_operation (NE, mode, zeros,
> +                                                   series_0_1));
> +  ASSERT_RTX_EQ (const0_rtx,
> +              simplify_const_relational_operation (EQ, mode, zeros,
> +                                                   series_0_1));
> +}
> +
> +/* Verify some simplifications involving vectors integer comparisons.  */
> +
> +static void
> +test_vector_int_const_compare_ops ()
> +{
> +  for (unsigned int i = 0; i < NUM_MACHINE_MODES; ++i)
> +    {
> +      machine_mode mode = (machine_mode) i;
> +      if (VECTOR_MODE_P (mode)
> +       && INTEGRAL_MODE_P (mode)
> +       && GET_MODE_NUNITS (mode).is_constant ())
> +     {
> +       if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT
> +           && maybe_gt (GET_MODE_NUNITS (mode), 2))
> +         {
> +           test_vector_int_const_compare (mode);
> +         }
> +     }

If the above comments are right, we should be able to do this for
variable-sized vectors as well.

Thanks,
Richard

> +    }
> +}
> +
>  template<unsigned int N>
>  struct simplify_const_poly_int_tests
>  {
> @@ -8875,6 +8957,7 @@ simplify_rtx_cc_tests ()
>  {
>    test_scalar_ops ();
>    test_vector_ops ();
> +  test_vector_int_const_compare_ops ();
>    simplify_const_poly_int_tests<NUM_POLY_INT_COEFFS>::run ();
>  }

Re: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

Reply via email to