Jennifer Schmitz <jschm...@nvidia.com> writes:
> thank you for the feedback. I would like to summarize what I understand from 
> your suggestions before I start revising to make sure we are on the same page:
>
> 1. The new setup for constant folding of SVE intrinsics for binary operations 
> where both operands are constant vectors looks like this:
>
> In gcc/fold-const.cc:
> NEW: vector_const_binop: Handles vector part of const_binop element-wise
> const_binop: For vector arguments, calls vector_const_binop with const_binop 
> as callback
> poly_int_binop: Is now public and, if necessary, we can implement missing 
> codes (e.g. TRUNC_DIV_EXPR)

Yeah.  And specifically: I think we can move:

  if (TREE_CODE (arg1) == INTEGER_CST && TREE_CODE (arg2) == INTEGER_CST)
    {
      wide_int warg1 = wi::to_wide (arg1), res;
      wide_int warg2 = wi::to_wide (arg2, TYPE_PRECISION (type));
      if (!wide_int_binop (res, code, warg1, warg2, sign, &overflow))
        return NULL_TREE;
      poly_res = res;
    }

into poly_int_binop.  It shouldn't affect compile times on non-poly
targets too much, since poly_int_tree_p (arg1) just checks for
INTEGER_CST there.

> In aarch64 backend:
> NEW: aarch64_vector_const_binop: adapted from int_const_binop, but calls 
> poly_int_binop

Yes.  The main differences are that we shouldn't treat any operation
as overflowing, and that we can handle cases that are well-defined
for intrinsics but not for gimple.

> intrinsic_impl::fold: calls vector_const_binop with 
> aarch64_vector_const_binop as callback

Yeah.
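
To make the callback structure above concrete, here is a minimal
self-contained sketch. It is a toy model, not GCC code: it uses plain
32-bit integers instead of trees, and every name in it (BinopCode,
toy_vector_const_binop, generic_elt_binop, intrinsic_elt_binop) is
hypothetical. It assumes that the generic element fold gives up on
division by zero, and that the intrinsic defines x/0 as 0 and wraps on
overflow, as the AArch64 integer divide instructions do. Overflow
flagging in the generic path is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <optional>
#include <vector>

// Toy stand-ins for tree codes; not GCC's real enum.
enum class BinopCode { Plus, Mult, TruncDiv };

using Elt = std::int32_t;
// Element-wise fold callback: returns nullopt when it declines to fold.
using EltBinop = std::function<std::optional<Elt>(BinopCode, Elt, Elt)>;

// Reduce a 64-bit intermediate result modulo 2^32 (two's complement).
static Elt wrap32(std::int64_t v) {
  return static_cast<Elt>(static_cast<std::uint32_t>(v));
}

// Shared arithmetic. Precondition: b != 0 when code is TruncDiv.
static Elt wrapping_elt_binop(BinopCode code, Elt a, Elt b) {
  const std::int64_t x = a, y = b;
  switch (code) {
    case BinopCode::Plus:     return wrap32(x + y);
    case BinopCode::Mult:     return wrap32(x * y);
    case BinopCode::TruncDiv: return wrap32(x / y);
  }
  return 0;  // unreachable for valid codes
}

// Generic-style callback (assumption): refuses division by zero.
static std::optional<Elt> generic_elt_binop(BinopCode code, Elt a, Elt b) {
  if (code == BinopCode::TruncDiv && b == 0)
    return std::nullopt;
  return wrapping_elt_binop(code, a, b);
}

// Intrinsic-style callback (assumption): x / 0 is defined as 0 and
// nothing is ever treated as overflowing.
static std::optional<Elt> intrinsic_elt_binop(BinopCode code, Elt a, Elt b) {
  if (code == BinopCode::TruncDiv && b == 0)
    return Elt{0};
  return wrapping_elt_binop(code, a, b);
}

// Toy analogue of the proposed vector helper: walks both constant
// vectors element-wise and fails if the callback fails on any element.
static std::optional<std::vector<Elt>> toy_vector_const_binop(
    BinopCode code, const std::vector<Elt>& a, const std::vector<Elt>& b,
    const EltBinop& elt_binop) {
  if (a.size() != b.size())
    return std::nullopt;
  std::vector<Elt> out;
  out.reserve(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    std::optional<Elt> r = elt_binop(code, a[i], b[i]);
    if (!r)
      return std::nullopt;
    out.push_back(*r);
  }
  return out;
}
```

With these definitions, dividing {6, -8, 7} by {3, 2, 0} is rejected
when generic_elt_binop is the callback, but folds to {2, -4, 0} with
intrinsic_elt_binop. The vector walk is shared; only the per-element
policy differs.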

> 2. Folding where only one operand is constant (0/x, x/0, 0*x etc.) can be 
> handled individually in intrinsic_impl, but in separate patches. If there is 
> already code to check for uniform vectors (e.g. in the svdiv->svasrd case), 
> we try to share code.

Yeah.  And in particular, we should try to handle (and test) vector-scalar
_n intrinsics as well as vector-vector intrinsics.
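
As a rough illustration of why sharing the uniform-vector check helps
with both forms, here is a toy sketch. It is not GCC code and all names
(ConstOperand, uniform_value, is_uniform_zero) are hypothetical. It
assumes the constant operand arrives either as a scalar (the _n form)
or as a vector of elements, and that a one-operand fold such as x*0
only needs to know whether that operand is uniformly zero.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <variant>
#include <vector>

using Elt = std::int32_t;

// A constant operand: a scalar (toy model of the _n form) or a vector
// of elements (toy model of the vector-vector form).
using ConstOperand = std::variant<Elt, std::vector<Elt>>;

// Return the single value that every element holds, if there is one.
// A scalar operand is trivially uniform.
static std::optional<Elt> uniform_value(const ConstOperand& op) {
  if (const Elt* scalar = std::get_if<Elt>(&op))
    return *scalar;
  const std::vector<Elt>& v = std::get<std::vector<Elt>>(op);
  if (v.empty())
    return std::nullopt;
  for (Elt e : v)
    if (e != v.front())
      return std::nullopt;
  return v.front();
}

// One check that a fold such as x*0 could use for both forms.
static bool is_uniform_zero(const ConstOperand& op) {
  std::optional<Elt> u = uniform_value(op);
  return u.has_value() && *u == 0;
}
```

Testing both a scalar 0 and a vector {0, 0, 0} against the same helper
mirrors the point above: each fold should be exercised through the _n
form as well as the vector-vector form.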

> Does that cover what you proposed? Otherwise, please feel free to correct any 
> misunderstandings.

SGTM.

Thanks,
Richard
