On Mon, Aug 5, 2024 at 4:02 PM Juergen Christ <jchr...@linux.ibm.com> wrote: > > Am Mon, Aug 05, 2024 at 01:00:31PM +0200 schrieb Richard Biener: > > On Fri, Aug 2, 2024 at 2:43 PM Juergen Christ <jchr...@linux.ibm.com> wrote: > > > > > > Do not convert floats to ints in multiple step if trapping math is > > > enabled. This might hide some inexact signals. > > > > > > Also use correct sign (the sign of the target integer type) for the > > > intermediate steps. This only affects undefined behaviour (casting > > > floats to unsigned datatype where the float is negative). > > > > > > gcc/ChangeLog: > > > > > > * tree-vect-stmts.cc (vectorizable_conversion): multi-step > > > float to int conversion only with trapping math and correct > > > sign. > > > > > > Signed-off-by: Juergen Christ <jchr...@linux.ibm.com> > > > > > > Bootstrapped and tested on x84 and s390. Ok for trunk? > > > > > > --- > > > gcc/tree-vect-stmts.cc | 8 +++++--- > > > 1 file changed, 5 insertions(+), 3 deletions(-) > > > > > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > > > index fdcda0d2abae..2ddd13383193 100644 > > > --- a/gcc/tree-vect-stmts.cc > > > +++ b/gcc/tree-vect-stmts.cc > > > @@ -5448,7 +5448,8 @@ vectorizable_conversion (vec_info *vinfo, > > > break; > > > > > > cvt_type > > > - = build_nonstandard_integer_type (GET_MODE_BITSIZE > > > (rhs_mode), 0); > > > + = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode), > > > + TYPE_UNSIGNED (lhs_type)); > > > > But lhs_type should be a float type here, the idea that for a > > FLOAT_EXPR (int -> float) > > a signed integer type is the natural one to use - as it's 2x wider > > than the original > > RHS type it's signedness doesn't matter. Note all float types should be > > !TYPE_UNSIGNED so this hunk is a no-op but still less clear on the intent > > IMO. > > > > Please drop it. > > Will do. Sorry about that. > > > > cvt_type = get_same_sized_vectype (cvt_type, vectype_in); > > > if (cvt_type == NULL_TREE) > > > goto unsupported; > > > @@ -5505,10 +5506,11 @@ vectorizable_conversion (vec_info *vinfo, > > > if (GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode)) > > > goto unsupported; > > > > > > - if (code == FIX_TRUNC_EXPR) > > > + if (code == FIX_TRUNC_EXPR && !flag_trapping_math) > > > { > > > cvt_type > > > - = build_nonstandard_integer_type (GET_MODE_BITSIZE > > > (rhs_mode), 0); > > > + = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode), > > > + TYPE_UNSIGNED (lhs_type)); > > > > Here it might be relevant for correctness - we have to choose between > > sfix and ufix for the float -> [u]int conversion. > > > > Do you have a testcase? Shouldn't the exactness be independent of the > > integer > > type we convert to? > > I was looking at this little program which contains undefined behaviour: > > #include <stdio.h> > > __attribute__((noinline,noclone,noipa)) > void > vec_pack_ufix_trunc_v2df (double *in, unsigned int *out); > > void > vec_pack_ufix_trunc_v2df (double *in, unsigned int *out) > { > out[0] = in[0]; > out[1] = in[1]; > out[2] = in[2]; > out[3] = in[3]; > } > > int main() > { > double in[] = {-1,-2,-3,-4}; > unsigned int out[4]; > > vec_pack_ufix_trunc_v2df (in, out); > for (int i = 0; i < 4; ++i) > printf("out[%d] = %u\n", i, out[i]); > return 0; > } > > On s390x, I get different results after vectorization: > > out[0] = 4294967295 > out[1] = 4294967294 > out[2] = 4294967293 > out[3] = 4294967292 > > than without vectorization: > > out[0] = 0 > out[1] = 0 > out[2] = 0 > out[3] = 0 > > Even if this is undefined behaviour, I think it would be nice to have > consistent results here. > > Also, while I added an expander to circumvent this problem in a > previous patch, reviewers requested to hide this behind trapping math. > Thus, I looked into this. > > Seeing the result from the CI for aarch64, I guess there are some > tests that actually expect this vectorization to always happen even > though it might not be save w.r.t. trapping math.
I do remember this was extensively discussed (but we might have missed something) and one argument indeed was that when it's undefined behavior we can do the vectorization given the actual values might be in-bound. Richard. > > > > > cvt_type = get_same_sized_vectype (cvt_type, vectype_in); > > > if (cvt_type == NULL_TREE) > > > goto unsupported; > > > -- > > > 2.43.5 > > >