Jakub,
it is hard from all of the threads to actually distill what the real
issues are here. So let me start from a clean slate and state them simply.
Richi has three primary objections:
1) that we can do all of this with a templated version of double-int.
2) that we should not be passing in a precision and bitsize into the
interface.
3) that the interface is too large.
I have attached a fragment of my patch #5 to illustrate the main thrust
of my patches and to illustrate the usefulness to gcc right now.
In the current trunk, we have code that does simplification when the
mode fits in an HWI and we have code that does the simplification if the
mode fits in two HWIs. if the mode does not fit in two hwi's the code
does not do the simplification.
Thus here and in a large number of other places we have two copies of
the code. Richi wants there to be multiple template instantiations of
double-int. This means that we are now going to have to have 3 copies
of this code to support oi mode on a 64 bit host and 4 copies on a 32
bit host.
Further note that there are not as many cases for the 2*hwi in the code
as their are for the hwi case and in general this is true through out
the compiler. (CLRSB is missing from the 2hwi case in the patch) We
really did not write twice the code when we stated supporting 2 hwi, we
added about 1.5 times the code (simplify-rtx is better than most of the
rest of the compiler). I am using the rtl level as an example here
because i have posted all of those patches, but the tree level is no
better.
I do not want to write this code a third time and certainly not a fourth
time. Just fixing all of this is quite useful now: it fills in a lot
of gaps in our transformations and it removes many edge case crashes
because ti mode really is lightly tested. However, this patch becomes
crucial as the world gets larger.
Richi's second point is that we should be doing everything at "infinite
precision" and not passing in an explicit bitsize and precision. That
works ok (sans the issues i raised with it in tree-vpn earlier) when the
largest precision on the machine fits in a couple of hwis. However,
for targets that have large integers or cross compilers, this becomes
expensive. The idea behind my set of patches is that for the
transformations that can work this way, we do the math in the precision
of the type or mode. In general this means that almost all of the math
will be done quickly, even on targets that support really big
integers. For passes like tree-vrp, the math will be done at some
multiple of the largest type seen in the actual program. The amount
of the multiple is a function of the optimization, not the target or the
host. Currently (on my home computer) the wide-int interface allows the
optimization to go 4x the largest mode on the target.
I can get rid of this bound at the expense of doing an alloca rather
than stack allocating a fixed sized structure. However, given the
extremely heavy use of this interface, that does not seem like the best
of tradeoffs.
The truth is that the vast majority of the compiler actually wants to
see the math done the way that it is going to be done on the machine.
Tree-vrp and the gimple constant prop do not. But i have made
accommodations to handle both needs. I believe that the reason that
double-int was never used at the rtl level is that it does not actually
do the math in a way that is useful to the target.
Richi's third objection is that the interface is too large. I
disagree. It was designed based on the actual usage of the
interface. When i found places where i was writing the same code over
and over again, i put it in a function as part of the interface. I
later went back and optimized many of these because this is a very
heavily used interface. Richi has many other objections, but i have
agreed to fix almost all of them, so i am not going to address them here.
It really will be a huge burden to have to carry these patched until the
next revision. We are currently in stage 1 and i believe that the minor
issues that richi raises can be easily addressed.
kenny
@@ -1373,302 +1411,87 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
}
- if (CONST_INT_P (op)
- && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+ if (CONST_SCALAR_INT_P (op) && width > 0)
{
- HOST_WIDE_INT arg0 = INTVAL (op);
- HOST_WIDE_INT val;
+ wide_int result;
+ enum machine_mode imode = op_mode == VOIDmode ? mode : op_mode;
+ wide_int op0 = wide_int::from_rtx (op, imode);
+
+#if TARGET_SUPPORTS_WIDE_INT == 0
+ /* This assert keeps the simplification from producing a result
+ that cannot be represented in a CONST_DOUBLE but a lot of
+ upstream callers expect that this function never fails to
+ simplify something and so you if you added this to the test
+ above the code would die later anyway. If this assert
+ happens, you just need to make the port support wide int. */
+ gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT);
+#endif
switch (code)
{
case NOT:
- val = ~ arg0;
+ result = ~op0;
break;
case NEG:
- val = - arg0;
+ result = op0.neg ();
break;
case ABS:
- val = (arg0 >= 0 ? arg0 : - arg0);
+ result = op0.abs ();
break;
case FFS:
- arg0 &= GET_MODE_MASK (mode);
- val = ffs_hwi (arg0);
+ result = op0.ffs ();
break;
case CLZ:
- arg0 &= GET_MODE_MASK (mode);
- if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
- ;
- else
- val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1;
+ result = op0.clz (GET_MODE_BITSIZE (mode),
+ GET_MODE_PRECISION (mode));
break;
case CLRSB:
- arg0 &= GET_MODE_MASK (mode);
- if (arg0 == 0)
- val = GET_MODE_PRECISION (mode) - 1;
- else if (arg0 >= 0)
- val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2;
- else if (arg0 < 0)
- val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2;
+ result = op0.clrsb (GET_MODE_BITSIZE (mode),
+ GET_MODE_PRECISION (mode));
break;
-
+
case CTZ:
- arg0 &= GET_MODE_MASK (mode);
- if (arg0 == 0)
- {
- /* Even if the value at zero is undefined, we have to come
- up with some replacement. Seems good enough. */
- if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
- val = GET_MODE_PRECISION (mode);
- }
- else
- val = ctz_hwi (arg0);
+ result = op0.ctz (GET_MODE_BITSIZE (mode),
+ GET_MODE_PRECISION (mode));
break;
case POPCOUNT:
- arg0 &= GET_MODE_MASK (mode);
- val = 0;
- while (arg0)
- val++, arg0 &= arg0 - 1;
+ result = op0.popcount (GET_MODE_BITSIZE (mode),
+ GET_MODE_PRECISION (mode));
break;
case PARITY:
- arg0 &= GET_MODE_MASK (mode);
- val = 0;
- while (arg0)
- val++, arg0 &= arg0 - 1;
- val &= 1;
+ result = op0.parity (GET_MODE_BITSIZE (mode),
+ GET_MODE_PRECISION (mode));
break;
case BSWAP:
- {
- unsigned int s;
-
- val = 0;
- for (s = 0; s < width; s += 8)
- {
- unsigned int d = width - s - 8;
- unsigned HOST_WIDE_INT byte;
- byte = (arg0 >> s) & 0xff;
- val |= byte << d;
- }
- }
+ result = op0.bswap ();
break;
case TRUNCATE:
- val = arg0;
+ result = op0.sext (mode);
break;
case ZERO_EXTEND:
- /* When zero-extending a CONST_INT, we need to know its
- original mode. */
- gcc_assert (op_mode != VOIDmode);
- if (op_width == HOST_BITS_PER_WIDE_INT)
- {
- /* If we were really extending the mode,
- we would have to distinguish between zero-extension
- and sign-extension. */
- gcc_assert (width == op_width);
- val = arg0;
- }
- else if (GET_MODE_BITSIZE (op_mode) < HOST_BITS_PER_WIDE_INT)
- val = arg0 & GET_MODE_MASK (op_mode);
- else
- return 0;
+ result = op0.zext (mode);
break;
case SIGN_EXTEND:
- if (op_mode == VOIDmode)
- op_mode = mode;
- op_width = GET_MODE_PRECISION (op_mode);
- if (op_width == HOST_BITS_PER_WIDE_INT)
- {
- /* If we were really extending the mode,
- we would have to distinguish between zero-extension
- and sign-extension. */
- gcc_assert (width == op_width);
- val = arg0;
- }
- else if (op_width < HOST_BITS_PER_WIDE_INT)
- {
- val = arg0 & GET_MODE_MASK (op_mode);
- if (val_signbit_known_set_p (op_mode, val))
- val |= ~GET_MODE_MASK (op_mode);
- }
- else
- return 0;
+ result = op0.sext (mode);
break;
case SQRT:
- case FLOAT_EXTEND:
- case FLOAT_TRUNCATE:
- case SS_TRUNCATE:
- case US_TRUNCATE:
- case SS_NEG:
- case US_NEG:
- case SS_ABS:
- return 0;
-
- default:
- gcc_unreachable ();
- }
-
- return gen_int_mode (val, mode);
- }
-
- /* We can do some operations on integer CONST_DOUBLEs. Also allow
- for a DImode operation on a CONST_INT. */
- else if (width <= HOST_BITS_PER_DOUBLE_INT
- && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op)))
- {
- double_int first, value;
-
- if (CONST_DOUBLE_AS_INT_P (op))
- first = double_int::from_pair (CONST_DOUBLE_HIGH (op),
- CONST_DOUBLE_LOW (op));
- else
- first = double_int::from_shwi (INTVAL (op));
-
- switch (code)
- {
- case NOT:
- value = ~first;
- break;
-
- case NEG:
- value = -first;
- break;
-
- case ABS:
- if (first.is_negative ())
- value = -first;
- else
- value = first;
- break;
-
- case FFS:
- value.high = 0;
- if (first.low != 0)
- value.low = ffs_hwi (first.low);
- else if (first.high != 0)
- value.low = HOST_BITS_PER_WIDE_INT + ffs_hwi (first.high);
- else
- value.low = 0;
- break;
-
- case CLZ:
- value.high = 0;
- if (first.high != 0)
- value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.high) - 1
- - HOST_BITS_PER_WIDE_INT;
- else if (first.low != 0)
- value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.low) - 1;
- else if (! CLZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
- value.low = GET_MODE_PRECISION (mode);
- break;
-
- case CTZ:
- value.high = 0;
- if (first.low != 0)
- value.low = ctz_hwi (first.low);
- else if (first.high != 0)
- value.low = HOST_BITS_PER_WIDE_INT + ctz_hwi (first.high);
- else if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
- value.low = GET_MODE_PRECISION (mode);
- break;
-
- case POPCOUNT:
- value = double_int_zero;
- while (first.low)
- {
- value.low++;
- first.low &= first.low - 1;
- }
- while (first.high)
- {
- value.low++;
- first.high &= first.high - 1;
- }
- break;
-
- case PARITY:
- value = double_int_zero;
- while (first.low)
- {
- value.low++;
- first.low &= first.low - 1;
- }
- while (first.high)
- {
- value.low++;
- first.high &= first.high - 1;
- }
- value.low &= 1;
- break;
-
- case BSWAP:
- {
- unsigned int s;
-
- value = double_int_zero;
- for (s = 0; s < width; s += 8)
- {
- unsigned int d = width - s - 8;
- unsigned HOST_WIDE_INT byte;
-
- if (s < HOST_BITS_PER_WIDE_INT)
- byte = (first.low >> s) & 0xff;
- else
- byte = (first.high >> (s - HOST_BITS_PER_WIDE_INT)) & 0xff;
-
- if (d < HOST_BITS_PER_WIDE_INT)
- value.low |= byte << d;
- else
- value.high |= byte << (d - HOST_BITS_PER_WIDE_INT);
- }
- }
- break;
-
- case TRUNCATE:
- /* This is just a change-of-mode, so do nothing. */
- value = first;
- break;
-
- case ZERO_EXTEND:
- gcc_assert (op_mode != VOIDmode);
-
- if (op_width > HOST_BITS_PER_WIDE_INT)
- return 0;
-
- value = double_int::from_uhwi (first.low & GET_MODE_MASK (op_mode));
- break;
-
- case SIGN_EXTEND:
- if (op_mode == VOIDmode
- || op_width > HOST_BITS_PER_WIDE_INT)
- return 0;
- else
- {
- value.low = first.low & GET_MODE_MASK (op_mode);
- if (val_signbit_known_set_p (op_mode, value.low))
- value.low |= ~GET_MODE_MASK (op_mode);
-
- value.high = HWI_SIGN_EXTEND (value.low);
- }
- break;
-
- case SQRT:
- return 0;
-
default:
return 0;
}
- return immed_double_int_const (value, mode);
+ return immed_wide_int_const (result, mode);
}