On 21/05/14 17:05, Jakub Jelinek wrote: > On Wed, May 21, 2014 at 12:53:47PM +1000, Kugan wrote: >> On 20/05/14 16:52, Jakub Jelinek wrote: >>> On Tue, May 20, 2014 at 12:27:31PM +1000, Kugan wrote: >>>> 1. Handling NOP_EXPR or CONVERT_EXPR that are in the IL because they >>>> are required for type correctness. We have two cases here: >>>> >>>> A) Mode is smaller than word_mode. This is usually from where the >>>> zero/sign extensions are showing up in final assembly. >>>> For example : >>>> int = (int) short >>>> which usually expands to >>>> (set (reg:SI ) >>>> (sext:SI (subreg:HI (reg:SI )))) >>>> We can expand this >>>> (set (reg:SI ) (((reg:SI )))) >>>> >>>> If following is true: >>>> 1. Value stored in RHS and LHS are of the same signedness >>>> 2. Type can hold the value. i.e., In cases like char = (char) short, we >>>> check that the value in short is representable char type. (i.e. look at >>>> the value range in RHS SSA_NAME and see if that can be represented in >>>> types of LHS without overflowing) >>>> >>>> Subreg here is not a paradoxical subreg. We are removing the subreg and >>>> zero/sign extend here. >>>> >>>> I am assuming here that QI/HI registers are represented in SImode >>>> (basically word_mode) with zero/sign extend is used as in >>>> (zero_extend:SI (subreg:HI (reg:SI 117)). >>> >>> Wouldn't it be better to just set proper flags on the SUBREG based on value >>> range info (SUBREG_PROMOTED_VAR_P and SUBREG_PROMOTED_UNSIGNED_P)? >>> Then not only the optimizers could eliminate in zext/sext when possible, but >>> all other optimizations could benefit from that. >> >> Thanks for the comments. Here is an attempt (attached) that sets >> SUBREG_PROMOTED_VAR_P based on value range into. Is this the good place >> to do this ? > > But you aren't setting it in your patch in any way, you are just resetting > it instead. The thing is, start with a testcase where you get that > (subreg:HI (reg:SI)) as the RTL of some SSA_NAME (is that the case on ARM?, > I believe on e.g. i?86/x86_64 you'd just get (reg:HI) instead and thus you > can't take advantage of that), and at that point where it is created check > the range info and if it is properly sign or zero extended, set > SUBREG_PROMOTED_VAR_P and SUBREG_PROMOTED_UNSIGNED_SET.
Here is another attempt (a quick hack patch is attached). Is this a reasonable direction? I think I will have to look for other places where SUBREG_PROMOTED_UNSIGNED_P are used for possible optimisations. Before that I want to make sure I am on the right track. > Note that right now we use 2 bits for the latter, which encode values > -1 (weirdo pointer extension), 0 (sign extension), 1 (zero extension). > Perhaps it would be nice to allow encoding value 2 (zero and sign extension) > for cases where the range info tells you that the value is both zero and > sign extended (i.e. minimum of range is >= 0 and maximum is <= signed type > maximum). Do you suggest changing rtx_def to achieve this like the following to be able to store 2 in SUBREG_PROMOTED_UNSIGNED_SET? probably not. - unsigned int unchanging : 1; + unsigned int unchanging : 2; Thanks, Kugan
diff --git a/gcc/expr.c b/gcc/expr.c index 2868d9d..15183fa 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -328,7 +328,8 @@ convert_move (rtx to, rtx from, int unsignedp) if (GET_CODE (from) == SUBREG && SUBREG_PROMOTED_VAR_P (from) && (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (from))) >= GET_MODE_PRECISION (to_mode)) - && SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp) + && (SUBREG_PROMOTED_UNSIGNED_P (from) == 2 + || SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp)) from = gen_lowpart (to_mode, from), from_mode = to_mode; gcc_assert (GET_CODE (to) != SUBREG || !SUBREG_PROMOTED_VAR_P (to)); @@ -9195,6 +9196,51 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, } #undef REDUCE_BIT_FIELD +static bool +is_value_extended (tree lhs, enum machine_mode rhs_mode, bool rhs_uns) +{ + wide_int type_min, type_max; + wide_int min, max; + unsigned int prec; + tree lhs_type; + bool lhs_uns; + + if (TREE_CODE (lhs) != SSA_NAME) + return false; + + lhs_type = lang_hooks.types.type_for_mode (rhs_mode, rhs_uns); + lhs_uns = TYPE_UNSIGNED (TREE_TYPE (lhs)); + + /* We remove extension for integrals. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs))) + return false; + + /* Get the value range. */ + if (POINTER_TYPE_P (TREE_TYPE (lhs)) + || get_range_info (lhs, &min, &max) != VR_RANGE) + return false; + + prec = min.get_precision (); + type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec, TYPE_SIGN (lhs_type)); + type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec, TYPE_SIGN (lhs_type)); + + /* Signedness of LHS and RHS should match. */ + if ((rhs_uns != lhs_uns) + && ((rhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type))) + || (!rhs_uns && wi::neg_p (max, TYPE_SIGN (lhs_type))))) + lhs_uns = !lhs_uns; + if (rhs_uns != lhs_uns) + return false; + + /* If rhs value range fits lhs type, extension is redundant. */ + if (wi::cmp (max, type_max, TYPE_SIGN (lhs_type)) != 1 + && (wi::cmp (type_min, min, TYPE_SIGN (lhs_type))) != 1) + return true; + + return false; +} + + /* Return TRUE if expression STMT is suitable for replacement. Never consider memory loads as replaceable, because those don't ever lead @@ -9498,7 +9544,11 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode, temp = gen_lowpart_SUBREG (mode, decl_rtl); SUBREG_PROMOTED_VAR_P (temp) = 1; - SUBREG_PROMOTED_UNSIGNED_SET (temp, unsignedp); + if (is_value_extended (ssa_name, mode, !unsignedp)) + SUBREG_PROMOTED_UNSIGNED_SET (temp, 2); + else + SUBREG_PROMOTED_UNSIGNED_SET (temp, unsignedp); + return temp; } diff --git a/gcc/rtl.h b/gcc/rtl.h index 10ae1e9..f34a7b5 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -299,7 +299,7 @@ struct GTY((chain_next ("RTX_NEXT (&%h)"), 1 in a CONCAT is VAL_EXPR_IS_CLOBBERED in var-tracking.c. 1 in a preserved VALUE is PRESERVED_VALUE_P in cselib.c. 1 in a clobber temporarily created for LRA. */ - unsigned int unchanging : 1; + unsigned int unchanging : 2; /* 1 in a MEM or ASM_OPERANDS expression if the memory reference is volatile. 1 in an INSN, CALL_INSN, JUMP_INSN, CODE_LABEL, BARRIER, or NOTE if it has been deleted.