Hi All, Here's an updated patch with the changes processed from the previous review.
I've bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu and no issues. Ok for trunk? Thanks, Tamar The 07/02/2019 11:20, Richard Biener wrote: > On Tue, 2 Jul 2019, Tamar Christina wrote: > > > Hi Richi, > > > > The 06/25/2019 10:02, Richard Biener wrote: > > > > > > This looks like a literal 1:1 translation plus merging with the > > > existing pattern around integers. You changed > > > (op:s@0 (convert@3 @1) (convert?@4 @2)) to > > > (op:s@0 (convert1?@3 @1) (convert2?@4 @2)) where this now also > > > matches if there are no inner conversions at all - was that a > > > desired change or did you merely want to catch when the first > > > operand is not a conversion but the second is, possibly only > > > for the RDIV_EXPR case? > > > > > > > Yes, the ? ? is for the RDIV case, I really only want (c a) `op` (c b), > > a `op` (c b) and (c a) `op` b. But didn't find a way to do this. > > One way would be to do > > (simplify > (convert (op:sc@0 (convert @1) (convert? @2))) > > but that doesn't work for RDIV. Using :C is tempting but you > do not get to know the original operand order which you of > course need. So I guess the way you do it is fine - you > could guard all of the code with a few types_match () checks > but I'm not sure it is worth the trouble. > > Richard. > > > The only thing I can think of that gets this is without overmatching is > > to either duplicate the code or move this back to a C helper function and > > call that from match.pd. But I was hoping to have it all in match.pd > > instead of hiding it away in a C call. > > > > Do you have a better way of doing it or a preference to an approach? > > > > > +(for op (plus minus mult rdiv) > > > + (simplify > > > + (convert (op:s@0 (convert1?@3 @1) (convert2?@4 @2))) > > > + (with { tree arg0 = strip_float_extensions (@1); > > > + tree arg1 = strip_float_extensions (@2); > > > + tree itype = TREE_TYPE (@0); > > > > > > you now unconditionally call strip_float_extensions on each operand > > > even for the integer case, please sink stuff only used in one > > > case arm. I guess keeping the integer case first via > > > > > > > Done, Initially didn't think it would be an issue since I don't use the > > value it > > creates in the integer case. But I re-ordered it. > > > > > should work (with the 'with' being in the ifs else position). > > > > > > + (if (code == REAL_TYPE) > > > + /* Ignore the conversion if we don't need to store intermediate > > > + results and neither type is a decimal float. */ > > > + (if (!(flag_float_store > > > + || DECIMAL_FLOAT_TYPE_P (type) > > > + || DECIMAL_FLOAT_TYPE_P (itype)) > > > + && types_match (ty1, ty2)) > > > + (convert (op (convert:ty1 @1) (convert:ty2 @2))))) > > > > > > this looks prone to the same recursion issue you described above. > > > > It's to break the recursion when you don't match anything. Indeed don't > > need it if > > I change the match condition above. > > Thanks, > > Tamar > > > > > > 'code' is used exactly once, using SCALAR_FLOAT_TYPE_P (itype) > > > in the above test would be clearer. Also both ifs can be combined. > > > The snipped above also doesn't appear in the convert.c code you > > > remove and the original one is > > > > > > switch (TREE_CODE (TREE_TYPE (expr))) > > > { > > > case REAL_TYPE: > > > /* Ignore the conversion if we don't need to store intermediate > > > results and neither type is a decimal float. */ > > > return build1_loc (loc, > > > (flag_float_store > > > || DECIMAL_FLOAT_TYPE_P (type) > > > || DECIMAL_FLOAT_TYPE_P (itype)) > > > ? CONVERT_EXPR : NOP_EXPR, type, expr); > > > > > > which as far as I can see doesn't do anything besides > > > exchanging CONVERT_EXPR for NOP_EXPR which is a noop to the IL. > > > So it appears this shouldn't be moved to match.pd at all? > > > It's also not a 1:1 move since you are changing 'expr'. > > > > > > Thanks, > > > Richard. > > > > > > > > Thanks, > > > > > Tamar > > > > > > > > > > Concretely it makes both these cases behave the same > > > > > > > > > > float e = (float)a * (float)b; > > > > > *c = (_Float16)e; > > > > > > > > > > and > > > > > > > > > > *c = (_Float16)((float)a * (float)b); > > > > > > > > > > Thanks, > > > > > Tamar > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > 2019-06-25 Tamar Christina <tamar.christ...@arm.com> > > > > > > > > > > * convert.c (convert_to_real_1): Move part of conversion code... > > > > > * match.pd: ...To here. > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > 2019-06-25 Tamar Christina <tamar.christ...@arm.com> > > > > > > > > > > * gcc.dg/type-convert-var.c: New test. > > > > > > > > > > -- > > > > > > > > > > -- > > > Richard Biener <rguent...@suse.de> > > > SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; > > > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg) > > > > > > -- > Richard Biener <rguent...@suse.de> > SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg) --
diff --git a/gcc/convert.c b/gcc/convert.c index a8f2bd049ba0cd83865ba0a5f7d74f9cdbad0d09..7f0d933f4d9e29719acb27eb1b32a9e540d93073 100644 --- a/gcc/convert.c +++ b/gcc/convert.c @@ -298,92 +298,6 @@ convert_to_real_1 (tree type, tree expr, bool fold_p) return build1 (TREE_CODE (expr), type, arg); } break; - /* Convert (outertype)((innertype0)a+(innertype1)b) - into ((newtype)a+(newtype)b) where newtype - is the widest mode from all of these. */ - case PLUS_EXPR: - case MINUS_EXPR: - case MULT_EXPR: - case RDIV_EXPR: - { - tree arg0 = strip_float_extensions (TREE_OPERAND (expr, 0)); - tree arg1 = strip_float_extensions (TREE_OPERAND (expr, 1)); - - if (FLOAT_TYPE_P (TREE_TYPE (arg0)) - && FLOAT_TYPE_P (TREE_TYPE (arg1)) - && DECIMAL_FLOAT_TYPE_P (itype) == DECIMAL_FLOAT_TYPE_P (type)) - { - tree newtype = type; - - if (TYPE_MODE (TREE_TYPE (arg0)) == SDmode - || TYPE_MODE (TREE_TYPE (arg1)) == SDmode - || TYPE_MODE (type) == SDmode) - newtype = dfloat32_type_node; - if (TYPE_MODE (TREE_TYPE (arg0)) == DDmode - || TYPE_MODE (TREE_TYPE (arg1)) == DDmode - || TYPE_MODE (type) == DDmode) - newtype = dfloat64_type_node; - if (TYPE_MODE (TREE_TYPE (arg0)) == TDmode - || TYPE_MODE (TREE_TYPE (arg1)) == TDmode - || TYPE_MODE (type) == TDmode) - newtype = dfloat128_type_node; - if (newtype == dfloat32_type_node - || newtype == dfloat64_type_node - || newtype == dfloat128_type_node) - { - expr = build2 (TREE_CODE (expr), newtype, - convert_to_real_1 (newtype, arg0, - fold_p), - convert_to_real_1 (newtype, arg1, - fold_p)); - if (newtype == type) - return expr; - break; - } - - if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (newtype)) - newtype = TREE_TYPE (arg0); - if (TYPE_PRECISION (TREE_TYPE (arg1)) > TYPE_PRECISION (newtype)) - newtype = TREE_TYPE (arg1); - /* Sometimes this transformation is safe (cannot - change results through affecting double rounding - cases) and sometimes it is not. If NEWTYPE is - wider than TYPE, e.g. (float)((long double)double - + (long double)double) converted to - (float)(double + double), the transformation is - unsafe regardless of the details of the types - involved; double rounding can arise if the result - of NEWTYPE arithmetic is a NEWTYPE value half way - between two representable TYPE values but the - exact value is sufficiently different (in the - right direction) for this difference to be - visible in ITYPE arithmetic. If NEWTYPE is the - same as TYPE, however, the transformation may be - safe depending on the types involved: it is safe - if the ITYPE has strictly more than twice as many - mantissa bits as TYPE, can represent infinities - and NaNs if the TYPE can, and has sufficient - exponent range for the product or ratio of two - values representable in the TYPE to be within the - range of normal values of ITYPE. */ - if (TYPE_PRECISION (newtype) < TYPE_PRECISION (itype) - && (flag_unsafe_math_optimizations - || (TYPE_PRECISION (newtype) == TYPE_PRECISION (type) - && real_can_shorten_arithmetic (TYPE_MODE (itype), - TYPE_MODE (type)) - && !excess_precision_type (newtype)))) - { - expr = build2 (TREE_CODE (expr), newtype, - convert_to_real_1 (newtype, arg0, - fold_p), - convert_to_real_1 (newtype, arg1, - fold_p)); - if (newtype == type) - return expr; - } - } - } - break; default: break; } diff --git a/gcc/match.pd b/gcc/match.pd index f8e35e96d22036bb0b96fbdbe2c7a346f4695067..66a1ad385ffff4456c87a8891ab78c437fefc64f 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -4937,37 +4937,106 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) the C/C++ front-ends by shorten_binary_op and shorten_compare. Long term we want to move all that code out of the front-ends into here. */ -/* If we have a narrowing conversion of an arithmetic operation where - both operands are widening conversions from the same type as the outer - narrowing conversion. Then convert the innermost operands to a suitable - unsigned type (to avoid introducing undefined behavior), perform the - operation and convert the result to the desired type. */ -(for op (plus minus) - (simplify - (convert (op:s (convert@2 @0) (convert?@3 @1))) - (if (INTEGRAL_TYPE_P (type) - /* We check for type compatibility between @0 and @1 below, - so there's no need to check that @1/@3 are integral types. */ - && INTEGRAL_TYPE_P (TREE_TYPE (@0)) - && INTEGRAL_TYPE_P (TREE_TYPE (@2)) - /* The precision of the type of each operand must match the - precision of the mode of each operand, similarly for the - result. */ - && type_has_mode_precision_p (TREE_TYPE (@0)) - && type_has_mode_precision_p (TREE_TYPE (@1)) - && type_has_mode_precision_p (type) - /* The inner conversion must be a widening conversion. */ - && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0)) - && types_match (@0, type) - && (types_match (@0, @1) - /* Or the second operand is const integer or converted const - integer from valueize. */ - || TREE_CODE (@1) == INTEGER_CST)) - (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) - (op @0 (convert @1)) - (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); } - (convert (op (convert:utype @0) - (convert:utype @1)))))))) +/* Convert (outertype)((innertype0)a+(innertype1)b) + into ((newtype)a+(newtype)b) where newtype + is the widest mode from all of these. */ +(for op (plus minus mult rdiv) + (simplify + (convert (op:s@0 (convert1?@3 @1) (convert2?@4 @2))) + /* If we have a narrowing conversion of an arithmetic operation where + both operands are widening conversions from the same type as the outer + narrowing conversion. Then convert the innermost operands to a + suitable unsigned type (to avoid introducing undefined behavior), + perform the operation and convert the result to the desired type. */ + (if (INTEGRAL_TYPE_P (type) + && op != MULT_EXPR + && op != RDIV_EXPR + /* We check for type compatibility between @0 and @1 below, + so there's no need to check that @2/@4 are integral types. */ + && INTEGRAL_TYPE_P (TREE_TYPE (@1)) + && INTEGRAL_TYPE_P (TREE_TYPE (@3)) + /* The precision of the type of each operand must match the + precision of the mode of each operand, similarly for the + result. */ + && type_has_mode_precision_p (TREE_TYPE (@1)) + && type_has_mode_precision_p (TREE_TYPE (@2)) + && type_has_mode_precision_p (type) + /* The inner conversion must be a widening conversion. */ + && TYPE_PRECISION (TREE_TYPE (@3)) > TYPE_PRECISION (TREE_TYPE (@1)) + && types_match (@1, type) + && (types_match (@1, @2) + /* Or the second operand is const integer or converted const + integer from valueize. */ + || TREE_CODE (@2) == INTEGER_CST)) + (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1))) + (op @1 (convert @2)) + (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); } + (convert (op (convert:utype @1) + (convert:utype @2))))) + (with { tree arg0 = strip_float_extensions (@1); + tree arg1 = strip_float_extensions (@2); + tree itype = TREE_TYPE (@0); + tree ty1 = TREE_TYPE (arg0); + tree ty2 = TREE_TYPE (arg1); + enum tree_code code = TREE_CODE (itype); } + (if (FLOAT_TYPE_P (ty1) + && FLOAT_TYPE_P (ty2) + && FLOAT_TYPE_P (type) + && DECIMAL_FLOAT_TYPE_P (itype) == DECIMAL_FLOAT_TYPE_P (type)) + (with { tree newtype = type; + if (TYPE_MODE (ty1) == SDmode + || TYPE_MODE (ty2) == SDmode + || TYPE_MODE (type) == SDmode) + newtype = dfloat32_type_node; + if (TYPE_MODE (ty1) == DDmode + || TYPE_MODE (ty2) == DDmode + || TYPE_MODE (type) == DDmode) + newtype = dfloat64_type_node; + if (TYPE_MODE (ty1) == TDmode + || TYPE_MODE (ty2) == TDmode + || TYPE_MODE (type) == TDmode) + newtype = dfloat128_type_node; } + (if ((newtype == dfloat32_type_node + || newtype == dfloat64_type_node + || newtype == dfloat128_type_node) + && newtype == type) + (convert:newtype (op (convert:newtype @1) (convert:newtype @2))) + (with { if (TYPE_PRECISION (ty1) > TYPE_PRECISION (newtype)) + newtype = ty1; + if (TYPE_PRECISION (ty2) > TYPE_PRECISION (newtype)) + newtype = ty2; } + /* Sometimes this transformation is safe (cannot + change results through affecting double rounding + cases) and sometimes it is not. If NEWTYPE is + wider than TYPE, e.g. (float)((long double)double + + (long double)double) converted to + (float)(double + double), the transformation is + unsafe regardless of the details of the types + involved; double rounding can arise if the result + of NEWTYPE arithmetic is a NEWTYPE value half way + between two representable TYPE values but the + exact value is sufficiently different (in the + right direction) for this difference to be + visible in ITYPE arithmetic. If NEWTYPE is the + same as TYPE, however, the transformation may be + safe depending on the types involved: it is safe + if the ITYPE has strictly more than twice as many + mantissa bits as TYPE, can represent infinities + and NaNs if the TYPE can, and has sufficient + exponent range for the product or ratio of two + values representable in the TYPE to be within the + range of normal values of ITYPE. */ + (if (TYPE_PRECISION (newtype) < TYPE_PRECISION (itype) + && (flag_unsafe_math_optimizations + || (TYPE_PRECISION (newtype) == TYPE_PRECISION (type) + && real_can_shorten_arithmetic (TYPE_MODE (itype), + TYPE_MODE (type)) + && !excess_precision_type (newtype)))) + (convert:newtype (op (convert:newtype @1) + (convert:newtype @2))) + )))) ) + ) +))) /* This is another case of narrowing, specifically when there's an outer BIT_AND_EXPR which masks off bits outside the type of the innermost diff --git a/gcc/testsuite/gcc.dg/type-convert-var.c b/gcc/testsuite/gcc.dg/type-convert-var.c new file mode 100644 index 0000000000000000000000000000000000000000..88d74e2a49d7123515b87ff64a18bd9b306d57e9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/type-convert-var.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O1 -fdump-tree-optimized" } */ +void foo (float a, float b, float *c) +{ + double e = (double)a * (double)b; + *c = (float)e; +} + +/* { dg-final { scan-tree-dump-not {double} "optimized" } } */