On 08/18/2015 06:54 AM, Jiong Wang wrote:
Changes are:
1. s/shfit/shift/
2. Re-write the comment.
more explanations on the left shift across word size boundary
scenario, explan gcc's current expand steps and what this
optimization can improve.
3. Match the code to the comment.
Expand high part first, then low part, so we don't need to check
the pseudo register overlapping.
4. Add the check to make sure if we are shifting the whole source
operand into high part (sfhit amount >= word mode size) then just
don't do this optimization, use the default expand logic instead
which generate optimized rtx sequences already.
5. Only do this optimization when
GET_MOZE_SIZE (mode) == 2 * GET_MODE_SIZE (word_mode)
bootstrap OK on x86-64, no regression.
bootstrap OK on aarch64, no regression.
2015-08-18 Jiong.Wang<jiong.w...@arm.com>
gcc/
* expr.c (expand_expr_real_2): Check gimple statement during
LSHIFT_EXPR expand.
gcc/testsuite
* gcc.dg/wide_shift_64_1.c: New testcase.
* gcc.dg/wide_shift_128_1.c: Likewise.
* gcc.target/aarch64/ashlti3_1.c: Likewise.
* gcc.target/arm/ashldisi_1.c: Likewise.
-- Regards, Jiong
k-new.patch
diff --git a/gcc/expr.c b/gcc/expr.c
index 3202f55..eca9a20 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8836,23 +8836,110 @@ expand_expr_real_2 (sepops ops, rtx target,
machine_mode tmode,
case LSHIFT_EXPR:
case RSHIFT_EXPR:
- /* If this is a fixed-point operation, then we cannot use the code
- below because "expand_shift" doesn't support sat/no-sat fixed-point
- shifts. */
- if (ALL_FIXED_POINT_MODE_P (mode))
- goto binop;
+ {
+ /* If this is a fixed-point operation, then we cannot use the code
+ below because "expand_shift" doesn't support sat/no-sat fixed-point
+ shifts. */
+ if (ALL_FIXED_POINT_MODE_P (mode))
+ goto binop;
+
+ if (! safe_from_p (subtarget, treeop1, 1))
+ subtarget = 0;
+ if (modifier == EXPAND_STACK_PARM)
+ target = 0;
+ op0 = expand_expr (treeop0, subtarget,
+ VOIDmode, EXPAND_NORMAL);
+ /* Left shift optimization when shifting across word_size boundary.
Please insert a newline before this comment.
+
+ If mode == GET_MODE_WIDER_MODE (word_mode), then normally there isn't
+ native instruction to support this wide mode left shift. Given below
+ scenario:
+
+ Type A = (Type) B << C
+
+ |< T >|
+ | dest_high | dest_low |
+
+ | word_size |
+
+ If the shfit amount C caused we shift B to across the word size
s/shfit/shift/
+
+ While, by checking gimple statements, if operand B is coming from
+ signed extension, then we can simplify above expand logic into:
+
+ 1. dest_high = src_low >> (word_size - C).
+ 2. dest_low = src_low << C.
+
+ We can use one arithmetic right shift to finish all the purpose of
+ steps 2, 4, 5, 6, thus we reduce the steps needed from 6 into 2. */
+
+ temp = NULL_RTX;
+ if (code == LSHIFT_EXPR
+ && target
+ && REG_P (target)
+ && ! unsignedp
+ && mode == GET_MODE_WIDER_MODE (word_mode)
+ && GET_MODE_SIZE (mode) == 2 * GET_MODE_SIZE (word_mode)
+ && ! have_insn_for (ASHIFT, mode)
+ && TREE_CONSTANT (treeop1)
+ && TREE_CODE (treeop0) == SSA_NAME)
+ {
+ gimple def = SSA_NAME_DEF_STMT (treeop0);
+ if (is_gimple_assign (def)
+ && gimple_assign_rhs_code (def) == NOP_EXPR)
Don't you need to check that the conversion is actually a sign
extension. Oh, you're relying on the signedness of ops->type. That
should be sufficient.
+ {
+ machine_mode rmode = TYPE_MODE
+ (TREE_TYPE (gimple_assign_rhs1 (def)));
- if (! safe_from_p (subtarget, treeop1, 1))
- subtarget = 0;
- if (modifier == EXPAND_STACK_PARM)
- target = 0;
- op0 = expand_expr (treeop0, subtarget,
- VOIDmode, EXPAND_NORMAL);
- temp = expand_variable_shift (code, mode, op0, treeop1, target,
- unsignedp);
- if (code == LSHIFT_EXPR)
- temp = REDUCE_BIT_FIELD (temp);
- return temp;
+ if (GET_MODE_SIZE (rmode) < GET_MODE_SIZE (mode)
+ && TREE_INT_CST_LOW (treeop1) < GET_MODE_BITSIZE (word_mode)
+ && ((TREE_INT_CST_LOW (treeop1) + GET_MODE_BITSIZE (rmode))
+ >= GET_MODE_BITSIZE (word_mode)))
I keep having trouble with this. I think the key to why this works is
you don't actually use the source of the conversion, but instead the
destination of the conversion (held in op0).
So this is OK for the trunk with the typo & whitespace nits fixed.
Thanks for your patience,
Jeff