Re: [GCC][middle-end] Add rules to strip away unneeded type casts in expressions (2nd patch)

Tamar Christina Tue, 02 Jul 2019 09:44:35 -0700

Hi All,

Here's an updated patch with the changes processed from the previous review.


I've bootstrapped and regtested on aarch64-none-linux-gnu and 
x86_64-pc-linux-gnu and no issues.

Ok for trunk?

Thanks,
Tamar

The 07/02/2019 11:20, Richard Biener wrote:
> On Tue, 2 Jul 2019, Tamar Christina wrote:
> 
> > Hi Richi,
> > 
> > The 06/25/2019 10:02, Richard Biener wrote:
> > > 
> > > This looks like a literal 1:1 translation plus merging with the
> > > existing pattern around integers.  You changed
> > > (op:s@0 (convert@3 @1) (convert?@4 @2)) to
> > > (op:s@0 (convert1?@3 @1) (convert2?@4 @2)) where this now also
> > > matches if there are no inner conversions at all - was that a
> > > desired change or did you merely want to catch when the first
> > > operand is not a conversion but the second is, possibly only
> > > for the RDIV_EXPR case?
> > >
> > 
> > Yes, the ? ? is for the RDIV case, I really only want (c a) `op` (c b),
> > a `op` (c b) and (c a) `op` b.  But didn't find a way to do this.
> 
> One way would be to do
> 
>  (simplify
>   (convert (op:sc@0 (convert @1) (convert? @2)))
> 
> but that doesn't work for RDIV.  Using :C is tempting but you
> do not get to know the original operand order which you of
> course need.  So I guess the way you do it is fine - you
> could guard all of the code with a few types_match () checks
> but I'm not sure it is worth the trouble.
> 
> Richard.
> 
> > The only thing I can think of that gets this is without overmatching is
> > to either duplicate the code or move this back to a C helper function and
> > call that from match.pd.  But I was hoping to have it all in match.pd
> > instead of hiding it away in a C call.
> > 
> > Do you have a better way of doing it or a preference to an approach?
> >  
> > > +(for op (plus minus mult rdiv)
> > > + (simplify
> > > +   (convert (op:s@0 (convert1?@3 @1) (convert2?@4 @2)))
> > > +   (with { tree arg0 = strip_float_extensions (@1);
> > > +          tree arg1 = strip_float_extensions (@2);
> > > +          tree itype = TREE_TYPE (@0);
> > > 
> > > you now unconditionally call strip_float_extensions on each operand
> > > even for the integer case, please sink stuff only used in one
> > > case arm.  I guess keeping the integer case first via
> > > 
> > 
> > Done, Initially didn't think it would be an issue since I don't use the 
> > value it
> > creates in the integer case. But I re-ordered it.
> >  
> > > should work (with the 'with' being in the ifs else position).
> > > 
> > > +      (if (code == REAL_TYPE)
> > > +       /* Ignore the conversion if we don't need to store intermediate
> > > +          results and neither type is a decimal float.  */
> > > +         (if (!(flag_float_store
> > > +              || DECIMAL_FLOAT_TYPE_P (type)
> > > +              || DECIMAL_FLOAT_TYPE_P (itype))
> > > +             && types_match (ty1, ty2))
> > > +           (convert (op (convert:ty1 @1) (convert:ty2 @2)))))
> > > 
> > > this looks prone to the same recursion issue you described above.
> > 
> > It's to break the recursion when you don't match anything. Indeed don't 
> > need it if
> > I change the match condition above.
> > Thanks,
> > Tamar
> > > 
> > > 'code' is used exactly once, using SCALAR_FLOAT_TYPE_P (itype)
> > > in the above test would be clearer.  Also both ifs can be combined.
> > > The snipped above also doesn't appear in the convert.c code you
> > > remove and the original one is
> > > 
> > >   switch (TREE_CODE (TREE_TYPE (expr)))
> > >     {
> > >     case REAL_TYPE:
> > >       /* Ignore the conversion if we don't need to store intermediate
> > >          results and neither type is a decimal float.  */
> > >       return build1_loc (loc,
> > >                          (flag_float_store
> > >                           || DECIMAL_FLOAT_TYPE_P (type)
> > >                           || DECIMAL_FLOAT_TYPE_P (itype))
> > >                          ? CONVERT_EXPR : NOP_EXPR, type, expr);
> > > 
> > > which as far as I can see doesn't do anything besides
> > > exchanging CONVERT_EXPR for NOP_EXPR which is a noop to the IL.
> > > So it appears this shouldn't be moved to match.pd at all?
> > > It's also not a 1:1 move since you are changing 'expr'.
> > > 
> > > Thanks,
> > > Richard.
> > > 
> > > > > Thanks,
> > > > > Tamar
> > > > > 
> > > > > Concretely it makes both these cases behave the same
> > > > > 
> > > > >   float e = (float)a * (float)b;
> > > > >   *c = (_Float16)e;
> > > > > 
> > > > > and
> > > > > 
> > > > >   *c = (_Float16)((float)a * (float)b);
> > > > > 
> > > > > Thanks,
> > > > > Tamar
> > > > > 
> > > > > gcc/ChangeLog:
> > > > > 
> > > > > 2019-06-25  Tamar Christina  <tamar.christ...@arm.com>
> > > > > 
> > > > >       * convert.c (convert_to_real_1): Move part of conversion code...
> > > > >       * match.pd: ...To here.
> > > > > 
> > > > > gcc/testsuite/ChangeLog:
> > > > > 
> > > > > 2019-06-25  Tamar Christina  <tamar.christ...@arm.com>
> > > > > 
> > > > >       * gcc.dg/type-convert-var.c: New test.
> > > > > 
> > > > > --
> > > > 
> > > 
> > > -- 
> > > Richard Biener <rguent...@suse.de>
> > > SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
> > > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
> > 
> > 
> 
> -- 
> Richard Biener <rguent...@suse.de>
> SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

--

diff --git a/gcc/convert.c b/gcc/convert.c
index a8f2bd049ba0cd83865ba0a5f7d74f9cdbad0d09..7f0d933f4d9e29719acb27eb1b32a9e540d93073 100644
--- a/gcc/convert.c
+++ b/gcc/convert.c
@@ -298,92 +298,6 @@ convert_to_real_1 (tree type, tree expr, bool fold_p)
 	      return build1 (TREE_CODE (expr), type, arg);
 	    }
 	  break;
-	/* Convert (outertype)((innertype0)a+(innertype1)b)
-	   into ((newtype)a+(newtype)b) where newtype
-	   is the widest mode from all of these.  */
-	case PLUS_EXPR:
-	case MINUS_EXPR:
-	case MULT_EXPR:
-	case RDIV_EXPR:
-	   {
-	     tree arg0 = strip_float_extensions (TREE_OPERAND (expr, 0));
-	     tree arg1 = strip_float_extensions (TREE_OPERAND (expr, 1));
-
-	     if (FLOAT_TYPE_P (TREE_TYPE (arg0))
-		 && FLOAT_TYPE_P (TREE_TYPE (arg1))
-		 && DECIMAL_FLOAT_TYPE_P (itype) == DECIMAL_FLOAT_TYPE_P (type))
-	       {
-		  tree newtype = type;
-
-		  if (TYPE_MODE (TREE_TYPE (arg0)) == SDmode
-		      || TYPE_MODE (TREE_TYPE (arg1)) == SDmode
-		      || TYPE_MODE (type) == SDmode)
-		    newtype = dfloat32_type_node;
-		  if (TYPE_MODE (TREE_TYPE (arg0)) == DDmode
-		      || TYPE_MODE (TREE_TYPE (arg1)) == DDmode
-		      || TYPE_MODE (type) == DDmode)
-		    newtype = dfloat64_type_node;
-		  if (TYPE_MODE (TREE_TYPE (arg0)) == TDmode
-		      || TYPE_MODE (TREE_TYPE (arg1)) == TDmode
-		      || TYPE_MODE (type) == TDmode)
-                    newtype = dfloat128_type_node;
-		  if (newtype == dfloat32_type_node
-		      || newtype == dfloat64_type_node
-		      || newtype == dfloat128_type_node)
-		    {
-		      expr = build2 (TREE_CODE (expr), newtype,
-				     convert_to_real_1 (newtype, arg0,
-							fold_p),
-				     convert_to_real_1 (newtype, arg1,
-							fold_p));
-		      if (newtype == type)
-			return expr;
-		      break;
-		    }
-
-		  if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (newtype))
-		    newtype = TREE_TYPE (arg0);
-		  if (TYPE_PRECISION (TREE_TYPE (arg1)) > TYPE_PRECISION (newtype))
-		    newtype = TREE_TYPE (arg1);
-		  /* Sometimes this transformation is safe (cannot
-		     change results through affecting double rounding
-		     cases) and sometimes it is not.  If NEWTYPE is
-		     wider than TYPE, e.g. (float)((long double)double
-		     + (long double)double) converted to
-		     (float)(double + double), the transformation is
-		     unsafe regardless of the details of the types
-		     involved; double rounding can arise if the result
-		     of NEWTYPE arithmetic is a NEWTYPE value half way
-		     between two representable TYPE values but the
-		     exact value is sufficiently different (in the
-		     right direction) for this difference to be
-		     visible in ITYPE arithmetic.  If NEWTYPE is the
-		     same as TYPE, however, the transformation may be
-		     safe depending on the types involved: it is safe
-		     if the ITYPE has strictly more than twice as many
-		     mantissa bits as TYPE, can represent infinities
-		     and NaNs if the TYPE can, and has sufficient
-		     exponent range for the product or ratio of two
-		     values representable in the TYPE to be within the
-		     range of normal values of ITYPE.  */
-		  if (TYPE_PRECISION (newtype) < TYPE_PRECISION (itype)
-		      && (flag_unsafe_math_optimizations
-			  || (TYPE_PRECISION (newtype) == TYPE_PRECISION (type)
-			      && real_can_shorten_arithmetic (TYPE_MODE (itype),
-							      TYPE_MODE (type))
-			      && !excess_precision_type (newtype))))
-		    {
-		      expr = build2 (TREE_CODE (expr), newtype,
-				     convert_to_real_1 (newtype, arg0,
-							fold_p),
-				     convert_to_real_1 (newtype, arg1,
-							fold_p));
-		      if (newtype == type)
-			return expr;
-		    }
-	       }
-	   }
-	  break;
 	default:
 	  break;
       }
diff --git a/gcc/match.pd b/gcc/match.pd
index f8e35e96d22036bb0b96fbdbe2c7a346f4695067..66a1ad385ffff4456c87a8891ab78c437fefc64f 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4937,37 +4937,106 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    the C/C++ front-ends by shorten_binary_op and shorten_compare.  Long
    term we want to move all that code out of the front-ends into here.  */
 
-/* If we have a narrowing conversion of an arithmetic operation where
-   both operands are widening conversions from the same type as the outer
-   narrowing conversion.  Then convert the innermost operands to a suitable
-   unsigned type (to avoid introducing undefined behavior), perform the
-   operation and convert the result to the desired type.  */
-(for op (plus minus)
-  (simplify
-    (convert (op:s (convert@2 @0) (convert?@3 @1)))
-    (if (INTEGRAL_TYPE_P (type)
-	 /* We check for type compatibility between @0 and @1 below,
-	    so there's no need to check that @1/@3 are integral types.  */
-	 && INTEGRAL_TYPE_P (TREE_TYPE (@0))
-	 && INTEGRAL_TYPE_P (TREE_TYPE (@2))
-	 /* The precision of the type of each operand must match the
-	    precision of the mode of each operand, similarly for the
-	    result.  */
-	 && type_has_mode_precision_p (TREE_TYPE (@0))
-	 && type_has_mode_precision_p (TREE_TYPE (@1))
-	 && type_has_mode_precision_p (type)
-	 /* The inner conversion must be a widening conversion.  */
-	 && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
-	 && types_match (@0, type)
-	 && (types_match (@0, @1)
-	     /* Or the second operand is const integer or converted const
-		integer from valueize.  */
-	     || TREE_CODE (@1) == INTEGER_CST))
-      (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
-	(op @0 (convert @1))
-	(with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
-	 (convert (op (convert:utype @0)
-		      (convert:utype @1))))))))
+/* Convert (outertype)((innertype0)a+(innertype1)b)
+   into ((newtype)a+(newtype)b) where newtype
+   is the widest mode from all of these.  */
+(for op (plus minus mult rdiv)
+ (simplify
+   (convert (op:s@0 (convert1?@3 @1) (convert2?@4 @2)))
+   /* If we have a narrowing conversion of an arithmetic operation where
+      both operands are widening conversions from the same type as the outer
+      narrowing conversion.  Then convert the innermost operands to a
+      suitable unsigned type (to avoid introducing undefined behavior),
+      perform the operation and convert the result to the desired type.  */
+   (if (INTEGRAL_TYPE_P (type)
+	&& op != MULT_EXPR
+	&& op != RDIV_EXPR
+	/* We check for type compatibility between @0 and @1 below,
+	   so there's no need to check that @2/@4 are integral types.  */
+	&& INTEGRAL_TYPE_P (TREE_TYPE (@1))
+	&& INTEGRAL_TYPE_P (TREE_TYPE (@3))
+	/* The precision of the type of each operand must match the
+	   precision of the mode of each operand, similarly for the
+	   result.  */
+	&& type_has_mode_precision_p (TREE_TYPE (@1))
+	&& type_has_mode_precision_p (TREE_TYPE (@2))
+	&& type_has_mode_precision_p (type)
+	/* The inner conversion must be a widening conversion.  */
+	&& TYPE_PRECISION (TREE_TYPE (@3)) > TYPE_PRECISION (TREE_TYPE (@1))
+	&& types_match (@1, type)
+	&& (types_match (@1, @2)
+	    /* Or the second operand is const integer or converted const
+	       integer from valueize.  */
+	    || TREE_CODE (@2) == INTEGER_CST))
+     (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
+       (op @1 (convert @2))
+       (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); }
+       (convert (op (convert:utype @1)
+		    (convert:utype @2)))))
+      (with { tree arg0 = strip_float_extensions (@1);
+	      tree arg1 = strip_float_extensions (@2);
+	      tree itype = TREE_TYPE (@0);
+	      tree ty1 = TREE_TYPE (arg0);
+	      tree ty2 = TREE_TYPE (arg1);
+	      enum tree_code code = TREE_CODE (itype); }
+	(if (FLOAT_TYPE_P (ty1)
+	     && FLOAT_TYPE_P (ty2)
+	     && FLOAT_TYPE_P (type)
+	     && DECIMAL_FLOAT_TYPE_P (itype) == DECIMAL_FLOAT_TYPE_P (type))
+	 (with { tree newtype = type;
+		 if (TYPE_MODE (ty1) == SDmode
+		     || TYPE_MODE (ty2) == SDmode
+		     || TYPE_MODE (type) == SDmode)
+		   newtype = dfloat32_type_node;
+		 if (TYPE_MODE (ty1) == DDmode
+		     || TYPE_MODE (ty2) == DDmode
+		     || TYPE_MODE (type) == DDmode)
+		   newtype = dfloat64_type_node;
+		 if (TYPE_MODE (ty1) == TDmode
+		     || TYPE_MODE (ty2) == TDmode
+		     || TYPE_MODE (type) == TDmode)
+		   newtype = dfloat128_type_node; }
+	  (if ((newtype == dfloat32_type_node
+		|| newtype == dfloat64_type_node
+		|| newtype == dfloat128_type_node)
+	      && newtype == type)
+	    (convert:newtype (op (convert:newtype @1) (convert:newtype @2)))
+	    (with { if (TYPE_PRECISION (ty1) > TYPE_PRECISION (newtype))
+		      newtype = ty1;
+		    if (TYPE_PRECISION (ty2) > TYPE_PRECISION (newtype))
+		      newtype = ty2; }
+	       /* Sometimes this transformation is safe (cannot
+		  change results through affecting double rounding
+		  cases) and sometimes it is not.  If NEWTYPE is
+		  wider than TYPE, e.g. (float)((long double)double
+		  + (long double)double) converted to
+		  (float)(double + double), the transformation is
+		  unsafe regardless of the details of the types
+		  involved; double rounding can arise if the result
+		  of NEWTYPE arithmetic is a NEWTYPE value half way
+		  between two representable TYPE values but the
+		  exact value is sufficiently different (in the
+		  right direction) for this difference to be
+		  visible in ITYPE arithmetic.  If NEWTYPE is the
+		  same as TYPE, however, the transformation may be
+		  safe depending on the types involved: it is safe
+		  if the ITYPE has strictly more than twice as many
+		  mantissa bits as TYPE, can represent infinities
+		  and NaNs if the TYPE can, and has sufficient
+		  exponent range for the product or ratio of two
+		  values representable in the TYPE to be within the
+		  range of normal values of ITYPE.  */
+	      (if (TYPE_PRECISION (newtype) < TYPE_PRECISION (itype)
+		   && (flag_unsafe_math_optimizations
+		       || (TYPE_PRECISION (newtype) == TYPE_PRECISION (type)
+			   && real_can_shorten_arithmetic (TYPE_MODE (itype),
+							   TYPE_MODE (type))
+			   && !excess_precision_type (newtype))))
+		 (convert:newtype (op (convert:newtype @1)
+				      (convert:newtype @2)))
+	 )))) )
+   )
+)))
 
 /* This is another case of narrowing, specifically when there's an outer
    BIT_AND_EXPR which masks off bits outside the type of the innermost
diff --git a/gcc/testsuite/gcc.dg/type-convert-var.c b/gcc/testsuite/gcc.dg/type-convert-var.c
new file mode 100644
index 0000000000000000000000000000000000000000..88d74e2a49d7123515b87ff64a18bd9b306d57e9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/type-convert-var.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O1 -fdump-tree-optimized" } */
+void foo (float a, float b, float *c)
+{
+  double e = (double)a * (double)b;
+  *c = (float)e;
+}
+
+/* { dg-final { scan-tree-dump-not {double} "optimized" } } */

Re: [GCC][middle-end] Add rules to strip away unneeded type casts in expressions (2nd patch)

Reply via email to