On 03/05/11 10:07, Bernd Schmidt wrote:
> On 04/15/2011 12:54 PM, Andrew Stubbs wrote:
>> Ping.
>
> Trying to unblock this...
>
> I think the point is that both examples:
>
>    long long foolong (long long x, short *a, short *b)
>    {
>        return x + (long long)*a * (long long)*b;
>    }
>
> and
>
>    long long foolong (long long x, short *a, short *b)
>    {
>        return x + *a * *b;
>    }
>
> should produce the same code using the maddhidi pattern, and for that
> to happen, simplify_rtx must produce a canonical form of the 16->64-bit
> widening multiply. The form that already exists in arm.md looks better,
> so we should pick that.
>
> I tried to fix it with the patch below, which unfortunately doesn't
> work since during combine we don't see the SIGN_EXTEND operations
> inside the MULT, but two shift operations instead. Maybe you can
> complete it from here?
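
For reference, the canonical form being proposed is the one the arm.md
patterns already match; sketched by hand here (modes and operand
numbers are illustrative, this is not dump output):

    (mult:DI (sign_extend:DI (reg:HI r1))
             (sign_extend:DI (reg:HI r2)))

whereas inside combine each sign extension has typically been rewritten
into the shift idiom, so the multiply operand looks more like:

    (ashiftrt (ashift (reg X) (const_int N)) (const_int N))

which is why the SIGN_EXTENDs are not directly visible inside the MULT.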

This patch fixes my original testcase (the second one above), so it can
serve as a complete replacement for that patch, if preferred. I did
originally want to do it without exporting the function from combine.c
(since nothing else appears to be exported like that), but that
required moving too much code between files.

The patch does not fix the first test case above because combine does
not recognise it as a HI -> DI multiply-and-add. It can't, because it
has already combined the HI load with the HI->DI sign-extend, and
there is neither a multiply pattern that accepts mem arguments nor one
with DImode inputs. I could add define_insn_and_split patterns to
catch these cases, but that seems the wrong way to do it ...
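
To illustrate (register numbers invented, not a real combine dump):
the load and the extension have already been merged into a single insn
of the form

    (set (reg:DI 100) (sign_extend:DI (mem:HI (reg:SI 101))))

and since no multiply pattern accepts either the MEM directly or a
DImode input, combine has nowhere to fold the multiplication.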

The real problem is that the widening-multiply tree optimization pass
does not support multiplies that widen by more than one mode. As far
as I can tell, ARM is the only backend that tries to do this, although
cris has a comment to the effect that it would use it if it worked. I
am investigating a fix that introduces an optab matrix akin to the
existing one for conversions, but that's another patch.
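
As a rough sketch of that idea (every name below is hypothetical, made
up for illustration; none of it is the eventual interface):

    /* Hypothetical: a handler table indexed by both the result mode
       and the operand mode, analogous to the existing conversion
       optabs, so a HImode -> DImode widening multiply can be looked
       up directly rather than one mode step at a time.  */
    static enum insn_code
    widening_mult_handler[NUM_MACHINE_MODES][NUM_MACHINE_MODES];

    static enum insn_code
    widening_mult_icode (enum machine_mode to_mode,
                         enum machine_mode from_mode)
    {
      return widening_mult_handler[(int) to_mode][(int) from_mode];
    }

With something like that in place, the tree pass could ask whether the
target can widen directly across more than one mode instead of
assuming a single-step widening.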

Is this patch OK? (Assuming my tests pass.)

Andrew
2011-05-18  Bernd Schmidt  <ber...@codesourcery.com>
	    Andrew Stubbs  <a...@codesourcery.com>

	gcc/
	* combine.c (make_compound_operation): Remove 'static'.
	* rtl.h (make_compound_operation): New prototype.
	* simplify-rtx.c (simplify_unary_operation_1): Create a new
	canonical form for widening multiplies.

--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -427,7 +427,6 @@ static const_rtx expand_field_assignment (const_rtx);
 static rtx make_extraction (enum machine_mode, rtx, HOST_WIDE_INT,
 			    rtx, unsigned HOST_WIDE_INT, int, int, int);
 static rtx extract_left_shift (rtx, int);
-static rtx make_compound_operation (rtx, enum rtx_code);
 static int get_pos_from_mask (unsigned HOST_WIDE_INT,
 			      unsigned HOST_WIDE_INT *);
 static rtx canon_reg_for_combine (rtx, rtx);
@@ -7516,7 +7515,7 @@ extract_left_shift (rtx x, int count)
    being kludges), it is MEM.  When processing the arguments of a comparison
    or a COMPARE against zero, it is COMPARE.  */
 
-static rtx
+rtx
 make_compound_operation (rtx x, enum rtx_code in_code)
 {
   enum rtx_code code = GET_CODE (x);
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2372,6 +2372,7 @@ extern unsigned int extended_count (const_rtx, enum machine_mode, int);
 extern rtx remove_death (unsigned int, rtx);
 extern void dump_combine_stats (FILE *);
 extern void dump_combine_total_stats (FILE *);
+extern rtx make_compound_operation (rtx, enum rtx_code);
 
 /* In cfgcleanup.c  */
 extern void delete_dead_jumptables (void);
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -989,6 +989,8 @@ simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op)
       break;
 
     case SIGN_EXTEND:
+      op = make_compound_operation (op, SET);
+
       /* (sign_extend (truncate (minus (label_ref L1) (label_ref L2))))
 	 becomes just the MINUS if its mode is MODE.  This allows
 	 folding switch statements on machines using casesi (such as
@@ -1000,6 +1002,23 @@ simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op)
 	  && GET_CODE (XEXP (XEXP (op, 0), 1)) == LABEL_REF)
 	return XEXP (op, 0);
 
+      /* Extending a widening multiplication should be canonicalized to
+	 a wider widening multiplication.  */
+      if (GET_CODE (op) == MULT
+	  && GET_CODE (XEXP (op, 0)) == SIGN_EXTEND
+	  && GET_CODE (XEXP (op, 1)) == SIGN_EXTEND)
+	{
+	  rtx op0 = XEXP (XEXP (op, 0), 0);
+	  rtx op1 = XEXP (XEXP (op, 1), 0);
+	  enum machine_mode op0_mode = GET_MODE (op0);
+	  enum machine_mode op1_mode = GET_MODE (op1);
+	  return simplify_gen_binary (MULT, mode,
+				      simplify_gen_unary (SIGN_EXTEND, mode,
+							  op0, op0_mode),
+				      simplify_gen_unary (SIGN_EXTEND, mode,
+							  op1, op1_mode));
+	}
+
       /* Check for a sign extension of a subreg of a promoted
 	 variable, where the promotion is sign-extended, and the
 	 target mode is the same as the variable's promotion.  */
@@ -1071,6 +1090,23 @@ simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op)
 	  && GET_MODE_SIZE (mode) <= GET_MODE_SIZE (GET_MODE (XEXP (op, 0))))
 	return rtl_hooks.gen_lowpart_no_emit (mode, op);
 
+      /* Extending a widening multiplication should be canonicalized to
+	 a wider widening multiplication.  */
+      if (GET_CODE (op) == MULT
+	  && GET_CODE (XEXP (op, 0)) == ZERO_EXTEND
+	  && GET_CODE (XEXP (op, 1)) == ZERO_EXTEND)
+	{
+	  rtx op0 = XEXP (XEXP (op, 0), 0);
+	  rtx op1 = XEXP (XEXP (op, 1), 0);
+	  enum machine_mode op0_mode = GET_MODE (op0);
+	  enum machine_mode op1_mode = GET_MODE (op1);
+	  return simplify_gen_binary (MULT, mode,
+				      simplify_gen_unary (ZERO_EXTEND, mode,
+							  op0, op0_mode),
+				      simplify_gen_unary (ZERO_EXTEND, mode,
+							  op1, op1_mode));
+	}
+
       /* (zero_extend:M (zero_extend:N <X>)) is (zero_extend:M <X>).  */
       if (GET_CODE (op) == ZERO_EXTEND)
 	return simplify_gen_unary (ZERO_EXTEND, mode, XEXP (op, 0),
