On Fri, Jul 14, 2023 at 5:16 PM Robin Dapp <rdapp....@gmail.com> wrote: > > >>> Can you add testcases? Also the current restriction is because > >>> the variants you add are not always correct and I don't see any > >>> checks that the intermediate type doesn't lose significant bits? > > I didn't manage to create one for aarch64 nor for x86 because AVX512 > has direct conversions e.g. for int64_t -> _Float16 and the new code > will not be triggered. Instead I added two separate RISC-V tests. > > The attached V2 always checks trapping_math when converting float > to integer and, like the NARROW_DST case, checks if the operand fits > the intermediate type when demoting from int to float. > > Would that be sufficient? > > riscv seems to be the only backend not (yet?) providing pack/unpack > expanders for the vect conversions and rather relying on extend/trunc > which seems a disadvantage now, particularly for the cases requiring > !flag_trapping_math with NONE but not for NARROW_DST. That might > be reason enough to implement pack/unpack in the backend. > > Nevertheless the patch might improve the status quo a bit? > > Regards > Robin > > > The recent changes that allowed multi-step conversions for > "non-packing/unpacking", i.e. modifier == NONE targets included > promoting to-float and demoting to-int variants. This patch > adds the missing demoting to-float and promoting to-int handling. > > gcc/ChangeLog: > > * tree-vect-stmts.cc (vectorizable_conversion): Handle > more demotion/promotion for modifier == NONE. > > gcc/testsuite/ChangeLog: > > * > gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c: New test. > * gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c: > New test. > --- > .../conversions/vec-narrow-int64-float16.c | 12 ++++ > .../conversions/vec-widen-float16-int64.c | 12 ++++ > gcc/tree-vect-stmts.cc | 58 +++++++++++++++---- > 3 files changed, 71 insertions(+), 11 deletions(-) > create mode 100644 > gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c > create mode 100644 > gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c > > diff --git > a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c > > b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c > new file mode 100644 > index 00000000000..ebee1cfa888 > --- /dev/null > +++ > b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-std=c99 -fno-vect-cost-model > -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=scalable" } > */ > + > +#include <stdint-gcc.h> > + > +void convert (_Float16 *restrict dst, int64_t *restrict a, int n) > +{ > + for (int i = 0; i < n; i++) > + dst[i] = (_Float16) (a[i] & 0x7fffffff); > +} > + > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ > diff --git > a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c > > b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c > new file mode 100644 > index 00000000000..eb0a17e99bc > --- /dev/null > +++ > b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-std=c99 -fno-vect-cost-model > -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=scalable > -fno-trapping-math" } */ > + > +#include <stdint-gcc.h> > + > +void convert (int64_t *restrict dst, _Float16 *restrict a, int n) > +{ > + for (int i = 0; i < n; i++) > + dst[i] = (int64_t) a[i]; > +} > + > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > index c08d0ef951f..c78a750301d 100644 > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -5192,29 +5192,65 @@ vectorizable_conversion (vec_info *vinfo, > break; > } > > - /* For conversions between float and smaller integer types try whether > we > - can use intermediate signed integer types to support the > + /* For conversions between float and integer types try whether > + we can use intermediate signed integer types to support the > conversion. */ > if ((code == FLOAT_EXPR > - && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode)) > + && GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode))
this > || (code == FIX_TRUNC_EXPR > - && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode) > - && !flag_trapping_math)) > + && (GET_MODE_SIZE (rhs_mode) != GET_MODE_SIZE (lhs_mode) and this check are now common between the FLOAT_EXPR and FIX_TRUNC_EXPR cases > + && !flag_trapping_math))) > { > + bool demotion = GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode); > bool float_expr_p = code == FLOAT_EXPR; > - scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode; > - fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode); > + unsigned short target_size; > + scalar_mode intermediate_mode; > + if (demotion) > + { > + intermediate_mode = lhs_mode; > + target_size = GET_MODE_SIZE (rhs_mode); > + } > + else > + { > + target_size = GET_MODE_SIZE (lhs_mode); > + tree itype > + = build_nonstandard_integer_type (GET_MODE_BITSIZE > + (rhs_mode), 0); > + intermediate_mode = SCALAR_TYPE_MODE (itype); you should be able to use int_mode_for_size? It might be that, for example _Float128 doesn't have a corresponding integer mode (like on 32bit archs without __int128), so we should check it exists. > + } > code1 = float_expr_p ? code : NOP_EXPR; > codecvt1 = float_expr_p ? NOP_EXPR : code; > - FOR_EACH_2XWIDER_MODE (rhs_mode_iter, imode) > + opt_scalar_mode mode_iter; > + FOR_EACH_2XWIDER_MODE (mode_iter, intermediate_mode) > { > - imode = rhs_mode_iter.require (); > - if (GET_MODE_SIZE (imode) > fltsz) > + intermediate_mode = mode_iter.require (); > + > + if (GET_MODE_SIZE (intermediate_mode) > target_size) > break; > > cvt_type > - = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode), > + = build_nonstandard_integer_type (GET_MODE_BITSIZE > + (intermediate_mode), > 0); the 0); now fits on the previous line? Otherwise looks OK to me. Thanks, Richard. > + > + /* Check if the intermediate type can hold OP0's range. > + When converting from float to integer, this is not necessary > + because values that do not fit the (smaller) target type are > + unspecified anyway. */ > + if (demotion && float_expr_p) > + { > + wide_int op_min_value, op_max_value; > + if (!vect_get_range_info (op0, &op_min_value, > &op_max_value)) > + goto unsupported; > + > + if (cvt_type == NULL_TREE > + || (wi::min_precision (op_max_value, SIGNED) > + > TYPE_PRECISION (cvt_type)) > + || (wi::min_precision (op_min_value, SIGNED) > + > TYPE_PRECISION (cvt_type))) > + goto unsupported; > + } > + > cvt_type = get_vectype_for_scalar_type (vinfo, cvt_type, > slp_node); > /* This should only happened for SLP as long as loop vectorizer > -- > 2.41.0 > >