Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

Kenneth Zadeck Wed, 31 Oct 2012 11:20:12 -0700

Jakub,

it is hard from all of the threads to actually distill what the realissues are here. So let me start from a clean slate and state them simply.


Richi has three primary objections:

1) that we can do all of this with a templated version of double-int.

2) that we should not be passing in a precision and bitsize into theinterface.

3) that the interface is too large.

I have attached a fragment of my patch #5 to illustrate the main thrustof my patches and to illustrate the usefulness to gcc right now.

In the current trunk, we have code that does simplification when themode fits in an HWI and we have code that does the simplification if themode fits in two HWIs. if the mode does not fit in two hwi's the codedoes not do the simplification.

Thus here and in a large number of other places we have two copies ofthe code. Richi wants there to be multiple template instantiations ofdouble-int. This means that we are now going to have to have 3 copiesof this code to support oi mode on a 64 bit host and 4 copies on a 32bit host.

Further note that there are not as many cases for the 2*hwi in the codeas their are for the hwi case and in general this is true through outthe compiler. (CLRSB is missing from the 2hwi case in the patch) Wereally did not write twice the code when we stated supporting 2 hwi, weadded about 1.5 times the code (simplify-rtx is better than most of therest of the compiler). I am using the rtl level as an example herebecause i have posted all of those patches, but the tree level is nobetter.

I do not want to write this code a third time and certainly not a fourthtime. Just fixing all of this is quite useful now: it fills in a lotof gaps in our transformations and it removes many edge case crashesbecause ti mode really is lightly tested. However, this patch becomescrucial as the world gets larger.

Richi's second point is that we should be doing everything at "infiniteprecision" and not passing in an explicit bitsize and precision. Thatworks ok (sans the issues i raised with it in tree-vpn earlier) when thelargest precision on the machine fits in a couple of hwis. However,for targets that have large integers or cross compilers, this becomesexpensive. The idea behind my set of patches is that for thetransformations that can work this way, we do the math in the precisionof the type or mode. In general this means that almost all of the mathwill be done quickly, even on targets that support really bigintegers. For passes like tree-vrp, the math will be done at somemultiple of the largest type seen in the actual program. The amountof the multiple is a function of the optimization, not the target or thehost. Currently (on my home computer) the wide-int interface allows theoptimization to go 4x the largest mode on the target.

I can get rid of this bound at the expense of doing an alloca ratherthan stack allocating a fixed sized structure. However, given theextremely heavy use of this interface, that does not seem like the bestof tradeoffs.

The truth is that the vast majority of the compiler actually wants tosee the math done the way that it is going to be done on the machine.Tree-vrp and the gimple constant prop do not. But i have madeaccommodations to handle both needs. I believe that the reason thatdouble-int was never used at the rtl level is that it does not actuallydo the math in a way that is useful to the target.

Richi's third objection is that the interface is too large. Idisagree. It was designed based on the actual usage of theinterface. When i found places where i was writing the same code overand over again, i put it in a function as part of the interface. Ilater went back and optimized many of these because this is a veryheavily used interface. Richi has many other objections, but i haveagreed to fix almost all of them, so i am not going to address them here.

It really will be a huge burden to have to carry these patched until thenext revision. We are currently in stage 1 and i believe that the minorissues that richi raises can be easily addressed.


kenny

@@ -1373,302 +1411,87 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
       return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
     }
 
-  if (CONST_INT_P (op)
-      && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+  if (CONST_SCALAR_INT_P (op) && width > 0)
     {
-      HOST_WIDE_INT arg0 = INTVAL (op);
-      HOST_WIDE_INT val;
+      wide_int result;
+      enum machine_mode imode = op_mode == VOIDmode ? mode : op_mode;
+      wide_int op0 = wide_int::from_rtx (op, imode);
+
+#if TARGET_SUPPORTS_WIDE_INT == 0
+      /* This assert keeps the simplification from producing a result
+	 that cannot be represented in a CONST_DOUBLE but a lot of
+	 upstream callers expect that this function never fails to
+	 simplify something and so you if you added this to the test
+	 above the code would die later anyway.  If this assert
+	 happens, you just need to make the port support wide int.  */
+      gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT); 
+#endif
 
       switch (code)
 	{
 	case NOT:
-	  val = ~ arg0;
+	  result = ~op0;
 	  break;
 
 	case NEG:
-	  val = - arg0;
+	  result = op0.neg ();
 	  break;
 
 	case ABS:
-	  val = (arg0 >= 0 ? arg0 : - arg0);
+	  result = op0.abs ();
 	  break;
 
 	case FFS:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = ffs_hwi (arg0);
+	  result = op0.ffs ();
 	  break;
 
 	case CLZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
-	    ;
-	  else
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1;
+	  result = op0.clz (GET_MODE_BITSIZE (mode), 
+			    GET_MODE_PRECISION (mode));
 	  break;
 
 	case CLRSB:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    val = GET_MODE_PRECISION (mode) - 1;
-	  else if (arg0 >= 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2;
-	  else if (arg0 < 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2;
+	  result = op0.clrsb (GET_MODE_BITSIZE (mode), 
+			      GET_MODE_PRECISION (mode));
 	  break;
-
+	  
 	case CTZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    {
-	      /* Even if the value at zero is undefined, we have to come
-		 up with some replacement.  Seems good enough.  */
-	      if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
-		val = GET_MODE_PRECISION (mode);
-	    }
-	  else
-	    val = ctz_hwi (arg0);
+	  result = op0.ctz (GET_MODE_BITSIZE (mode), 
+			    GET_MODE_PRECISION (mode));
 	  break;
 
 	case POPCOUNT:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
+	  result = op0.popcount (GET_MODE_BITSIZE (mode), 
+				 GET_MODE_PRECISION (mode));
 	  break;
 
 	case PARITY:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
-	  val &= 1;
+	  result = op0.parity (GET_MODE_BITSIZE (mode), 
+			       GET_MODE_PRECISION (mode));
 	  break;
 
 	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    val = 0;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-		byte = (arg0 >> s) & 0xff;
-		val |= byte << d;
-	      }
-	  }
+	  result = op0.bswap ();
 	  break;
 
 	case TRUNCATE:
-	  val = arg0;
+	  result = op0.sext (mode);
 	  break;
 
 	case ZERO_EXTEND:
-	  /* When zero-extending a CONST_INT, we need to know its
-             original mode.  */
-	  gcc_assert (op_mode != VOIDmode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (GET_MODE_BITSIZE (op_mode) < HOST_BITS_PER_WIDE_INT)
-	    val = arg0 & GET_MODE_MASK (op_mode);
-	  else
-	    return 0;
+	  result = op0.zext (mode);
 	  break;
 
 	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode)
-	    op_mode = mode;
-	  op_width = GET_MODE_PRECISION (op_mode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (op_width < HOST_BITS_PER_WIDE_INT)
-	    {
-	      val = arg0 & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, val))
-		val |= ~GET_MODE_MASK (op_mode);
-	    }
-	  else
-	    return 0;
+	  result = op0.sext (mode);
 	  break;
 
 	case SQRT:
-	case FLOAT_EXTEND:
-	case FLOAT_TRUNCATE:
-	case SS_TRUNCATE:
-	case US_TRUNCATE:
-	case SS_NEG:
-	case US_NEG:
-	case SS_ABS:
-	  return 0;
-
-	default:
-	  gcc_unreachable ();
-	}
-
-      return gen_int_mode (val, mode);
-    }
-
-  /* We can do some operations on integer CONST_DOUBLEs.  Also allow
-     for a DImode operation on a CONST_INT.  */
-  else if (width <= HOST_BITS_PER_DOUBLE_INT
-	   && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op)))
-    {
-      double_int first, value;
-
-      if (CONST_DOUBLE_AS_INT_P (op))
-	first = double_int::from_pair (CONST_DOUBLE_HIGH (op),
-				       CONST_DOUBLE_LOW (op));
-      else
-	first = double_int::from_shwi (INTVAL (op));
-
-      switch (code)
-	{
-	case NOT:
-	  value = ~first;
-	  break;
-
-	case NEG:
-	  value = -first;
-	  break;
-
-	case ABS:
-	  if (first.is_negative ())
-	    value = -first;
-	  else
-	    value = first;
-	  break;
-
-	case FFS:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ffs_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ffs_hwi (first.high);
-	  else
-	    value.low = 0;
-	  break;
-
-	case CLZ:
-	  value.high = 0;
-	  if (first.high != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.high) - 1
-	              - HOST_BITS_PER_WIDE_INT;
-	  else if (first.low != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.low) - 1;
-	  else if (! CLZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case CTZ:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ctz_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ctz_hwi (first.high);
-	  else if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case POPCOUNT:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  break;
-
-	case PARITY:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  value.low &= 1;
-	  break;
-
-	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    value = double_int_zero;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-
-		if (s < HOST_BITS_PER_WIDE_INT)
-		  byte = (first.low >> s) & 0xff;
-		else
-		  byte = (first.high >> (s - HOST_BITS_PER_WIDE_INT)) & 0xff;
-
-		if (d < HOST_BITS_PER_WIDE_INT)
-		  value.low |= byte << d;
-		else
-		  value.high |= byte << (d - HOST_BITS_PER_WIDE_INT);
-	      }
-	  }
-	  break;
-
-	case TRUNCATE:
-	  /* This is just a change-of-mode, so do nothing.  */
-	  value = first;
-	  break;
-
-	case ZERO_EXTEND:
-	  gcc_assert (op_mode != VOIDmode);
-
-	  if (op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-
-	  value = double_int::from_uhwi (first.low & GET_MODE_MASK (op_mode));
-	  break;
-
-	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode
-	      || op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-	  else
-	    {
-	      value.low = first.low & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, value.low))
-		value.low |= ~GET_MODE_MASK (op_mode);
-
-	      value.high = HWI_SIGN_EXTEND (value.low);
-	    }
-	  break;
-
-	case SQRT:
-	  return 0;
-
 	default:
 	  return 0;
 	}
 
-      return immed_double_int_const (value, mode);
+      return immed_wide_int_const (result, mode);
     }

Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

Reply via email to