On Fri, Jun 29, 2018 at 11:20 AM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Richard Sandiford <richard.sandif...@arm.com> writes: > > This patch adds helper functions for detecting widened operations and > > generalises the existing code to handle more cases. > > > > One of the main changes is to recognise multi-stage type conversions, > > which are possible even in the original IR and can also occur as a > > result of earlier pattern matching (especially after the main > > over-widening patch). E.g. for: > > > > unsigned int res = 0; > > for (__INTPTR_TYPE__ i = 0; i < N; ++i) > > { > > int av = a[i]; > > int bv = b[i]; > > short diff = av - bv; > > unsigned short abs = diff < 0 ? -diff : diff; > > res += abs; > > } > > > > we have: > > > > _9 = _7 - _8; > > diff_20 = (short int) _9; > > _10 = (int) diff_20; > > _11 = ABS_EXPR <_10>; > > > > where the first cast establishes the sign of the promotion done > > by the second cast. > > > > vect_recog_sad_pattern didn't handle this kind of intermediate promotion > > between the MINUS_EXPR and the ABS_EXPR. Sign extensions and casts from > > unsigned to signed are both OK there. Unsigned promotions aren't, and > > need to be rejected, but should have been folded away earlier anyway. > > > > Also, the dot_prod and widen_sum patterns both required the promotions > > to be from one signedness to the same signedness, rather than say signed > > char to unsigned int. That shouldn't be necessary, since it's only the > > sign of the input to the promotion that matters. Nothing requires the > > narrow and wide types in a DOT_PROD_EXPR or WIDEN_SUM_EXPR to have the > > same sign (and IMO that's a good thing). > > > > Fixing these fixed an XFAIL in gcc.dg/vect/vect-widen-mult-sum.c. > > > > The patch also uses a common routine to handle both the WIDEN_MULT_EXPR > > and WIDEN_LSHIFT_EXPR patterns. I hope this could be reused for other > > similar operations in future. > > > > Also, the patch means we recognise the index calculations in > > vect-mult-const-pattern*.c as widening multiplications, whereas the > > scan test was expecting them to be recognised as mult patterns instead. > > The patch makes the tests check specifically for the multiplication we > > care about. > > > > Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? > > It turned out that it would be better to generalise vect_widened_op_p > to handle a tree of operations, for the benefit of the average detection > itself. > > Tested as before. OK to install?
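To make the relaxed signedness rule concrete: a loop of the following shape (a minimal hypothetical sketch, not taken from the patch or its testcases) accumulates a product of signed char inputs into an unsigned int result, so the narrow and wide types of the would-be DOT_PROD_EXPR differ in sign:

    #define N 64
    signed char a[N], b[N];

    unsigned int
    dot_prod (void)
    {
      unsigned int res = 0;
      for (int i = 0; i < N; ++i)
        /* a[i] and b[i] are promoted from signed char to int and
           multiplied; the int product is then converted to unsigned int
           before the accumulation.  The signedness of the operation is
           based on the inputs (signed) rather than the result
           (unsigned), which the patch now allows.  */
        res += (unsigned int) (a[i] * b[i]);
      return res;
    }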
OK. Thanks, Richard. > Richard > > > 2018-06-29 Richard Sandiford <richard.sandif...@arm.com> > > gcc/ > * tree-vect-patterns.c (append_pattern_def_seq): Take an optional > vector type. If given, install it in the new statement's > STMT_VINFO_VECTYPE. > (vect_element_precision): New function. > (vect_unpromoted_value): New struct. > (vect_unpromoted_value::vect_unpromoted_value): New function. > (vect_unpromoted_value::set_op): Likewise. > (vect_look_through_possible_promotion): Likewise. > (vect_joust_widened_integer, vect_joust_widened_type): Likewise. > (vect_widened_op_tree, vect_convert_input): Likewise. > (vect_convert_inputs, vect_convert_output): Likewise. > (vect_recog_dot_prod_pattern): Use > vect_look_through_possible_promotion > to handle the optional cast of the multiplication result and > vect_widened_op_tree to detect the widened multiplication itself. > Do not require the input and output of promotion casts to have > the same sign, but base the signedness of the operation on the > input rather than the result. If the pattern includes two > promotions, check that those promotions have the same sign. > Do not restrict the MULT_EXPR handling to a double-width result; > handle quadruple-width results and wider. Use vect_convert_inputs > to convert the inputs to the common type. > (vect_recog_sad_pattern): Use vect_look_through_possible_promotion > to handle the optional cast of the ABS result. Also allow a sign > change or a sign extension between the ABS and MINUS. > Use vect_widened_op_tree to detect the widened subtraction and use > vect_convert_inputs to convert the inputs to the common type. > (vect_handle_widen_op_by_const): Delete. > (vect_recog_widen_op_pattern): New function. > (vect_recog_widen_mult_pattern): Use it. > (vect_recog_widen_shift_pattern): Likewise. > (vect_recog_widen_sum_pattern): Use > vect_look_through_possible_promotion to handle the promoted > PLUS_EXPR operand. > > gcc/testsuite/ > * gcc.dg/vect/vect-widen-mult-sum.c: Remove xfail. > * gcc.dg/vect/no-scevccp-outer-6.c: Don't match widened > multiplications > by 4 in the computation of a[i]. > * gcc.dg/vect/vect-mult-const-pattern-1.c: Test specifically for the > main multiplication constant. > * gcc.dg/vect/vect-mult-const-pattern-2.c: Likewise. > * gcc.dg/vect/vect-widen-mult-const-s16.c: Likewise. > * gcc.dg/vect/vect-widen-mult-const-u16.c: Likewise. Expect the > pattern to cast the result to int. > * gcc.dg/vect/vect-reduc-dot-1.c: New test. > * gcc.dg/vect/vect-reduc-dot-2.c: Likewise. > * gcc.dg/vect/vect-reduc-dot-3.c: Likewise. > * gcc.dg/vect/vect-reduc-dot-4.c: Likewise. > * gcc.dg/vect/vect-reduc-dot-5.c: Likewise. > * gcc.dg/vect/vect-reduc-dot-6.c: Likewise. > * gcc.dg/vect/vect-reduc-dot-7.c: Likewise. > * gcc.dg/vect/vect-reduc-dot-8.c: Likewise. > * gcc.dg/vect/vect-reduc-sad-1.c: Likewise. > * gcc.dg/vect/vect-reduc-sad-2.c: Likewise. > * gcc.dg/vect/vect-reduc-sad-3.c: Likewise. > * gcc.dg/vect/vect-reduc-sad-4.c: Likewise. > * gcc.dg/vect/vect-reduc-sad-5.c: Likewise. > * gcc.dg/vect/vect-reduc-sad-6.c: Likewise. > * gcc.dg/vect/vect-reduc-sad-7.c: Likewise. > * gcc.dg/vect/vect-reduc-sad-8.c: Likewise. > * gcc.dg/vect/vect-widen-mult-1.c: Likewise. > * gcc.dg/vect/vect-widen-mult-2.c: Likewise. > * gcc.dg/vect/vect-widen-mult-3.c: Likewise. > * gcc.dg/vect/vect-widen-mult-4.c: Likewise. 
> > Index: gcc/tree-vect-patterns.c > =================================================================== > *** gcc/tree-vect-patterns.c 2018-06-29 10:15:12.000000000 +0100 > --- gcc/tree-vect-patterns.c 2018-06-29 10:19:20.000000000 +0100 > *************** vect_set_pattern_stmt (gimple *pattern_s > *** 96,106 **** > vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, vectype); > } > > static inline void > ! append_pattern_def_seq (stmt_vec_info stmt_info, gimple *stmt) > { > gimple_seq_add_stmt_without_update (&STMT_VINFO_PATTERN_DEF_SEQ > (stmt_info), > ! stmt); > } > > static inline void > --- 96,119 ---- > vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, vectype); > } > > + /* Add NEW_STMT to STMT_INFO's pattern definition statements. If VECTYPE > + is nonnull, record that NEW_STMT's vector type is VECTYPE, which might > + be different from the vector type of the final pattern statement. */ > + > static inline void > ! append_pattern_def_seq (stmt_vec_info stmt_info, gimple *new_stmt, > ! tree vectype = NULL_TREE) > { > + vec_info *vinfo = stmt_info->vinfo; > + if (vectype) > + { > + gcc_assert (!vinfo_for_stmt (new_stmt)); > + stmt_vec_info new_stmt_info = new_stmt_vec_info (new_stmt, vinfo); > + set_vinfo_for_stmt (new_stmt, new_stmt_info); > + STMT_VINFO_VECTYPE (new_stmt_info) = vectype; > + } > gimple_seq_add_stmt_without_update (&STMT_VINFO_PATTERN_DEF_SEQ > (stmt_info), > ! new_stmt); > } > > static inline void > *************** vect_single_imm_use (gimple *def_stmt) > *** 187,192 **** > --- 200,214 ---- > return use_stmt; > } > > + /* Round bit precision PRECISION up to a full element. */ > + > + static unsigned int > + vect_element_precision (unsigned int precision) > + { > + precision = 1 << ceil_log2 (precision); > + return MAX (precision, BITS_PER_UNIT); > + } > + > /* If OP is defined by a statement that's being considered for > vectorization, > return information about that statement, otherwise return NULL. */ > > *************** type_conversion_p (tree name, gimple *us > *** 253,258 **** > --- 275,596 ---- > return true; > } > > + /* Holds information about an input operand after some sign changes > + and type promotions have been peeled away. */ > + struct vect_unpromoted_value { > + vect_unpromoted_value (); > + > + void set_op (tree, vect_def_type, stmt_vec_info = NULL); > + > + /* The value obtained after peeling away zero or more casts. */ > + tree op; > + > + /* The type of OP. */ > + tree type; > + > + /* The definition type of OP. */ > + vect_def_type dt; > + > + /* If OP is the result of peeling at least one cast, and if the cast > + of OP itself is a vectorizable statement, CASTER identifies that > + statement, otherwise it is null. */ > + stmt_vec_info caster; > + }; > + > + inline vect_unpromoted_value::vect_unpromoted_value () > + : op (NULL_TREE), > + type (NULL_TREE), > + dt (vect_uninitialized_def), > + caster (NULL) > + { > + } > + > + /* Set the operand to OP_IN, its definition type to DT_IN, and the > + statement that casts it to CASTER_IN. */ > + > + inline void > + vect_unpromoted_value::set_op (tree op_in, vect_def_type dt_in, > + stmt_vec_info caster_in) > + { > + op = op_in; > + type = TREE_TYPE (op); > + dt = dt_in; > + caster = caster_in; > + } > + > + /* If OP is a vectorizable SSA name, strip a sequence of integer conversions > + to reach some vectorizable inner operand OP', continuing as long as it > + is possible to convert OP' back to OP using a possible sign change > + followed by a possible promotion P. 
Return this OP', or null if OP is > + not a vectorizable SSA name. If there is a promotion P, describe its > + input in UNPROM, otherwise describe OP' in UNPROM. > + > + A successful return means that it is possible to go from OP' to OP > + via UNPROM. The cast from OP' to UNPROM is at most a sign change, > + whereas the cast from UNPROM to OP might be a promotion, a sign > + change, or a nop. > + > + E.g. say we have: > + > + signed short *ptr = ...; > + signed short C = *ptr; > + unsigned short B = (unsigned short) C; // sign change > + signed int A = (signed int) B; // unsigned promotion > + ...possible other uses of A... > + unsigned int OP = (unsigned int) A; // sign change > + > + In this case it's possible to go directly from C to OP using: > + > + OP = (unsigned int) (unsigned short) C; > + +------------+ +--------------+ > + promotion sign change > + > + so OP' would be C. The input to the promotion is B, so UNPROM > + would describe B. */ > + > + static tree > + vect_look_through_possible_promotion (vec_info *vinfo, tree op, > + vect_unpromoted_value *unprom) > + { > + tree res = NULL_TREE; > + tree op_type = TREE_TYPE (op); > + unsigned int orig_precision = TYPE_PRECISION (op_type); > + stmt_vec_info caster = NULL; > + while (TREE_CODE (op) == SSA_NAME && INTEGRAL_TYPE_P (op_type)) > + { > + /* See whether OP is simple enough to vectorize. */ > + gimple *def_stmt; > + vect_def_type dt; > + if (!vect_is_simple_use (op, vinfo, &dt, &def_stmt)) > + break; > + > + /* If OP is the input of a demotion, skip over it to see whether > + OP is itself the result of a promotion. If so, the combined > + effect of the promotion and the demotion might fit the required > + pattern, otherwise neither operation fits. > + > + This copes with cases such as the result of an arithmetic > + operation being truncated before being stored, and where that > + arithmetic operation has been recognized as an over-widened one. */ > + if (TYPE_PRECISION (op_type) <= orig_precision) > + { > + /* Use OP as the UNPROM described above if we haven't yet > + found a promotion, or if using the new input preserves the > + sign of the previous promotion. */ > + if (!res > + || TYPE_PRECISION (unprom->type) == orig_precision > + || TYPE_SIGN (unprom->type) == TYPE_SIGN (op_type)) > + unprom->set_op (op, dt, caster); > + /* Stop if we've already seen a promotion and if this > + conversion does more than change the sign. */ > + else if (TYPE_PRECISION (op_type) > + != TYPE_PRECISION (unprom->type)) > + break; > + > + /* The sequence now extends to OP. */ > + res = op; > + } > + > + /* See whether OP is defined by a cast. Record it as CASTER if > + the cast is potentially vectorizable. */ > + if (!def_stmt) > + break; > + if (dt == vect_internal_def) > + caster = vinfo_for_stmt (def_stmt); > + else > + caster = NULL; > + gassign *assign = dyn_cast <gassign *> (def_stmt); > + if (!assign || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code > (def_stmt))) > + break; > + > + /* Continue with the input to the cast. */ > + op = gimple_assign_rhs1 (def_stmt); > + op_type = TREE_TYPE (op); > + } > + return res; > + } > + > + /* OP is an integer operand to an operation that returns TYPE, and we > + want to treat the operation as a widening one. So far we can treat > + it as widening from *COMMON_TYPE. > + > + Return true if OP is suitable for such a widening operation, > + either widening from *COMMON_TYPE or from some supertype of it. > + Update *COMMON_TYPE to the supertype in the latter case. 
> + > + SHIFT_P is true if OP is a shift amount. */ > + > + static bool > + vect_joust_widened_integer (tree type, bool shift_p, tree op, > + tree *common_type) > + { > + /* Calculate the minimum precision required by OP, without changing > + the sign of either operand. */ > + unsigned int precision; > + if (shift_p) > + { > + if (!wi::leu_p (wi::to_widest (op), TYPE_PRECISION (type) / 2)) > + return false; > + precision = TREE_INT_CST_LOW (op); > + } > + else > + { > + precision = wi::min_precision (wi::to_widest (op), > + TYPE_SIGN (*common_type)); > + if (precision * 2 > TYPE_PRECISION (type)) > + return false; > + } > + > + /* If OP requires a wider type, switch to that type. The checks > + above ensure that this is still narrower than the result. */ > + precision = vect_element_precision (precision); > + if (TYPE_PRECISION (*common_type) < precision) > + *common_type = build_nonstandard_integer_type > + (precision, TYPE_UNSIGNED (*common_type)); > + return true; > + } > + > + /* Return true if the common supertype of NEW_TYPE and *COMMON_TYPE > + is narrower than type, storing the supertype in *COMMON_TYPE if so. */ > + > + static bool > + vect_joust_widened_type (tree type, tree new_type, tree *common_type) > + { > + if (types_compatible_p (*common_type, new_type)) > + return true; > + > + /* See if *COMMON_TYPE can hold all values of NEW_TYPE. */ > + if ((TYPE_PRECISION (new_type) < TYPE_PRECISION (*common_type)) > + && (TYPE_UNSIGNED (new_type) || !TYPE_UNSIGNED (*common_type))) > + return true; > + > + /* See if NEW_TYPE can hold all values of *COMMON_TYPE. */ > + if (TYPE_PRECISION (*common_type) < TYPE_PRECISION (new_type) > + && (TYPE_UNSIGNED (*common_type) || !TYPE_UNSIGNED (new_type))) > + { > + *common_type = new_type; > + return true; > + } > + > + /* We have mismatched signs, with the signed type being > + no wider than the unsigned type. In this case we need > + a wider signed type. */ > + unsigned int precision = MAX (TYPE_PRECISION (*common_type), > + TYPE_PRECISION (new_type)); > + precision *= 2; > + if (precision * 2 > TYPE_PRECISION (type)) > + return false; > + > + *common_type = build_nonstandard_integer_type (precision, false); > + return true; > + } > + > + /* Check whether STMT_INFO can be viewed as a tree of integer operations > + in which each node either performs CODE or WIDENED_CODE, and where > + each leaf operand is narrower than the result of STMT_INFO. MAX_NOPS > + specifies the maximum number of leaf operands. SHIFT_P says whether > + CODE and WIDENED_CODE are some sort of shift. > + > + If STMT_INFO is such a tree, return the number of leaf operands > + and describe them in UNPROM[0] onwards. Also set *COMMON_TYPE > + to a type that (a) is narrower than the result of STMT_INFO and > + (b) can hold all leaf operand values. > + > + Return 0 if STMT_INFO isn't such a tree, or if no such COMMON_TYPE > + exists. */ > + > + static unsigned int > + vect_widened_op_tree (stmt_vec_info stmt_info, tree_code code, > + tree_code widened_code, bool shift_p, > + unsigned int max_nops, > + vect_unpromoted_value *unprom, tree *common_type) > + { > + /* Check for an integer operation with the right code. 
*/ > + gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); > + if (!assign) > + return 0; > + > + tree_code rhs_code = gimple_assign_rhs_code (assign); > + if (rhs_code != code && rhs_code != widened_code) > + return 0; > + > + tree type = gimple_expr_type (assign); > + if (!INTEGRAL_TYPE_P (type)) > + return 0; > + > + /* Assume that both operands will be leaf operands. */ > + max_nops -= 2; > + > + /* Check the operands. */ > + unsigned int next_op = 0; > + for (unsigned int i = 0; i < 2; ++i) > + { > + vect_unpromoted_value *this_unprom = &unprom[next_op]; > + unsigned int nops = 1; > + tree op = gimple_op (assign, i + 1); > + if (i == 1 && TREE_CODE (op) == INTEGER_CST) > + { > + /* We already have a common type from earlier operands. > + Update it to account for OP. */ > + this_unprom->set_op (op, vect_constant_def); > + if (!vect_joust_widened_integer (type, shift_p, op, common_type)) > + return 0; > + } > + else > + { > + /* Only allow shifts by constants. */ > + if (shift_p && i == 1) > + return 0; > + > + if (!vect_look_through_possible_promotion (stmt_info->vinfo, op, > + this_unprom)) > + return 0; > + > + if (TYPE_PRECISION (this_unprom->type) == TYPE_PRECISION (type)) > + { > + /* The operand isn't widened. If STMT_INFO has the code > + for an unwidened operation, recursively check whether > + this operand is a node of the tree. */ > + if (rhs_code != code > + || max_nops == 0 > + || this_unprom->dt != vect_internal_def) > + return 0; > + > + /* Give back the leaf slot allocated above now that we're > + not treating this as a leaf operand. */ > + max_nops += 1; > + > + /* Recursively process the definition of the operand. */ > + stmt_vec_info def_stmt_info > + = vinfo_for_stmt (SSA_NAME_DEF_STMT (this_unprom->op)); > + nops = vect_widened_op_tree (def_stmt_info, code, widened_code, > + shift_p, max_nops, this_unprom, > + common_type); > + if (nops == 0) > + return 0; > + > + max_nops -= nops; > + } > + else > + { > + /* Make sure that the operand is narrower than the result. */ > + if (TYPE_PRECISION (this_unprom->type) * 2 > + > TYPE_PRECISION (type)) > + return 0; > + > + /* Update COMMON_TYPE for the new operand. */ > + if (i == 0) > + *common_type = this_unprom->type; > + else if (!vect_joust_widened_type (type, this_unprom->type, > + common_type)) > + return 0; > + } > + } > + next_op += nops; > + } > + return next_op; > + } > + > /* Helper to return a new temporary for pattern of TYPE for STMT. If STMT > is NULL, the caller must set SSA_NAME_DEF_STMT for the returned SSA var. > */ > > *************** vect_recog_temp_ssa_var (tree type, gimp > *** 262,267 **** > --- 600,680 ---- > return make_temp_ssa_name (type, stmt, "patt"); > } > > + /* Convert UNPROM to TYPE and return the result, adding new statements > + to STMT_INFO's pattern definition statements if no better way is > + available. VECTYPE is the vector form of TYPE. */ > + > + static tree > + vect_convert_input (stmt_vec_info stmt_info, tree type, > + vect_unpromoted_value *unprom, tree vectype) > + { > + /* Check for a no-op conversion. */ > + if (types_compatible_p (type, TREE_TYPE (unprom->op))) > + return unprom->op; > + > + /* Allow the caller to create constant vect_unpromoted_values. */ > + if (TREE_CODE (unprom->op) == INTEGER_CST) > + return wide_int_to_tree (type, wi::to_widest (unprom->op)); > + > + /* See if we can reuse an existing result. 
*/ > + if (unprom->caster) > + { > + tree lhs = gimple_get_lhs (unprom->caster->stmt); > + if (types_compatible_p (TREE_TYPE (lhs), type)) > + return lhs; > + } > + > + /* We need a new conversion statement. */ > + tree new_op = vect_recog_temp_ssa_var (type, NULL); > + gassign *new_stmt = gimple_build_assign (new_op, NOP_EXPR, unprom->op); > + > + /* As a (common) last resort, add the statement to the pattern itself. */ > + append_pattern_def_seq (stmt_info, new_stmt, vectype); > + return new_op; > + } > + > + /* Invoke vect_convert_input for N elements of UNPROM and store the > + result in the corresponding elements of RESULT. */ > + > + static void > + vect_convert_inputs (stmt_vec_info stmt_info, unsigned int n, > + tree *result, tree type, vect_unpromoted_value *unprom, > + tree vectype) > + { > + for (unsigned int i = 0; i < n; ++i) > + { > + unsigned int j; > + for (j = 0; j < i; ++j) > + if (unprom[j].op == unprom[i].op) > + break; > + if (j < i) > + result[i] = result[j]; > + else > + result[i] = vect_convert_input (stmt_info, type, &unprom[i], vectype); > + } > + } > + > + /* The caller has created a (possibly empty) sequence of pattern definition > + statements followed by a single statement PATTERN_STMT. Cast the result > + of this final statement to TYPE. If a new statement is needed, add > + PATTERN_STMT to the end of STMT_INFO's pattern definition statements > + and return the new statement, otherwise return PATTERN_STMT as-is. > + VECITYPE is the vector form of PATTERN_STMT's result type. */ > + > + static gimple * > + vect_convert_output (stmt_vec_info stmt_info, tree type, gimple > *pattern_stmt, > + tree vecitype) > + { > + tree lhs = gimple_get_lhs (pattern_stmt); > + if (!types_compatible_p (type, TREE_TYPE (lhs))) > + { > + append_pattern_def_seq (stmt_info, pattern_stmt, vecitype); > + tree cast_var = vect_recog_temp_ssa_var (type, NULL); > + pattern_stmt = gimple_build_assign (cast_var, NOP_EXPR, lhs); > + } > + return pattern_stmt; > + } > + > /* Return true if STMT_VINFO describes a reduction for which reassociation > is allowed. If STMT_INFO is part of a group, assume that it's part of > a reduction chain and optimistically assume that all statements > *************** vect_reassociating_reduction_p (stmt_vec > *** 351,366 **** > static gimple * > vect_recog_dot_prod_pattern (vec<gimple *> *stmts, tree *type_out) > { > ! gimple *stmt, *last_stmt = (*stmts)[0]; > tree oprnd0, oprnd1; > - tree oprnd00, oprnd01; > stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); > vec_info *vinfo = stmt_vinfo->vinfo; > tree type, half_type; > gimple *pattern_stmt; > - tree prod_type; > tree var; > - bool promotion; > > /* Look for the following pattern > DX = (TYPE1) X; > --- 764,776 ---- > static gimple * > vect_recog_dot_prod_pattern (vec<gimple *> *stmts, tree *type_out) > { > ! gimple *last_stmt = (*stmts)[0]; > tree oprnd0, oprnd1; > stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); > vec_info *vinfo = stmt_vinfo->vinfo; > tree type, half_type; > gimple *pattern_stmt; > tree var; > > /* Look for the following pattern > DX = (TYPE1) X; > *************** vect_recog_dot_prod_pattern (vec<gimple > *** 391,469 **** > return NULL; > > type = gimple_expr_type (last_stmt); > - stmt = last_stmt; > > ! gimple *def_stmt; > ! if (type_conversion_p (oprnd0, stmt, true, &half_type, &def_stmt, > ! &promotion) > ! && promotion) > ! { > ! stmt = def_stmt; > ! oprnd0 = gimple_assign_rhs1 (stmt); > ! } > ! else > ! half_type = type; > > /* So far so good. 
Since last_stmt was detected as a (summation) > reduction, > we know that oprnd1 is the reduction variable (defined by a loop-header > phi), and oprnd0 is an ssa-name defined by a stmt in the loop body. > Left to check that oprnd0 is defined by a (widen_)mult_expr */ > ! if (TREE_CODE (oprnd0) != SSA_NAME) > return NULL; > > - prod_type = half_type; > stmt_vec_info mult_vinfo = vect_get_internal_def (vinfo, oprnd0); > if (!mult_vinfo) > return NULL; > > /* FORNOW. Can continue analyzing the def-use chain when this stmt in a > phi > inside the loop (in case we are analyzing an outer-loop). */ > ! gassign *mult = dyn_cast <gassign *> (mult_vinfo->stmt); > ! if (!mult) > return NULL; > - if (gimple_assign_rhs_code (mult) == WIDEN_MULT_EXPR) > - { > - /* Has been detected as a widening multiplication? */ > - oprnd00 = gimple_assign_rhs1 (mult); > - oprnd01 = gimple_assign_rhs2 (mult); > - } > - else > - { > - tree half_type0, half_type1; > - gimple *def_stmt; > - tree oprnd0, oprnd1; > > ! if (gimple_assign_rhs_code (mult) != MULT_EXPR) > ! return NULL; > ! > ! oprnd0 = gimple_assign_rhs1 (mult); > ! oprnd1 = gimple_assign_rhs2 (mult); > ! if (!type_conversion_p (oprnd0, mult, true, &half_type0, &def_stmt, > ! &promotion) > ! || !promotion) > ! return NULL; > ! oprnd00 = gimple_assign_rhs1 (def_stmt); > ! if (!type_conversion_p (oprnd1, mult, true, &half_type1, &def_stmt, > ! &promotion) > ! || !promotion) > ! return NULL; > ! oprnd01 = gimple_assign_rhs1 (def_stmt); > ! if (!types_compatible_p (half_type0, half_type1)) > ! return NULL; > ! if (TYPE_PRECISION (prod_type) != TYPE_PRECISION (half_type0) * 2) > ! return NULL; > ! } > > vect_pattern_detected ("vect_recog_dot_prod_pattern", last_stmt); > > ! half_type = TREE_TYPE (oprnd00); > if (!vect_supportable_direct_optab_p (type, DOT_PROD_EXPR, half_type, > ! type_out)) > return NULL; > > var = vect_recog_temp_ssa_var (type, NULL); > pattern_stmt = gimple_build_assign (var, DOT_PROD_EXPR, > ! oprnd00, oprnd01, oprnd1); > > return pattern_stmt; > } > --- 801,850 ---- > return NULL; > > type = gimple_expr_type (last_stmt); > > ! vect_unpromoted_value unprom_mult; > ! oprnd0 = vect_look_through_possible_promotion (vinfo, oprnd0, > &unprom_mult); > > /* So far so good. Since last_stmt was detected as a (summation) > reduction, > we know that oprnd1 is the reduction variable (defined by a loop-header > phi), and oprnd0 is an ssa-name defined by a stmt in the loop body. > Left to check that oprnd0 is defined by a (widen_)mult_expr */ > ! if (!oprnd0) > return NULL; > > stmt_vec_info mult_vinfo = vect_get_internal_def (vinfo, oprnd0); > if (!mult_vinfo) > return NULL; > > /* FORNOW. Can continue analyzing the def-use chain when this stmt in a > phi > inside the loop (in case we are analyzing an outer-loop). */ > ! vect_unpromoted_value unprom0[2]; > ! if (!vect_widened_op_tree (mult_vinfo, MULT_EXPR, WIDEN_MULT_EXPR, > ! false, 2, unprom0, &half_type)) > return NULL; > > ! /* If there are two widening operations, make sure they agree on > ! the sign of the extension. */ > ! if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type) > ! && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)) > ! return NULL; > > vect_pattern_detected ("vect_recog_dot_prod_pattern", last_stmt); > > ! tree half_vectype; > if (!vect_supportable_direct_optab_p (type, DOT_PROD_EXPR, half_type, > ! type_out, &half_vectype)) > return NULL; > > + /* Get the inputs in the appropriate types. 
*/ > + STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL; > + tree mult_oprnd[2]; > + vect_convert_inputs (stmt_vinfo, 2, mult_oprnd, half_type, > + unprom0, half_vectype); > + > var = vect_recog_temp_ssa_var (type, NULL); > pattern_stmt = gimple_build_assign (var, DOT_PROD_EXPR, > ! mult_oprnd[0], mult_oprnd[1], oprnd1); > > return pattern_stmt; > } > *************** vect_recog_dot_prod_pattern (vec<gimple > *** 510,520 **** > vect_recog_sad_pattern (vec<gimple *> *stmts, tree *type_out) > { > gimple *last_stmt = (*stmts)[0]; > - tree sad_oprnd0, sad_oprnd1; > stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); > vec_info *vinfo = stmt_vinfo->vinfo; > tree half_type; > - bool promotion; > > /* Look for the following pattern > DX = (TYPE1) X; > --- 891,899 ---- > *************** vect_recog_sad_pattern (vec<gimple *> *s > *** 550,573 **** > > tree sum_type = gimple_expr_type (last_stmt); > > ! /* The type conversion could be promotion, demotion, > ! or just signed -> unsigned. */ > ! gimple *def_stmt; > ! if (type_conversion_p (plus_oprnd0, last_stmt, false, > ! &half_type, &def_stmt, &promotion)) > ! plus_oprnd0 = gimple_assign_rhs1 (def_stmt); > ! else > ! half_type = sum_type; > > /* So far so good. Since last_stmt was detected as a (summation) > reduction, > we know that plus_oprnd1 is the reduction variable (defined by a > loop-header > phi), and plus_oprnd0 is an ssa-name defined by a stmt in the loop > body. > Then check that plus_oprnd0 is defined by an abs_expr. */ > > ! if (TREE_CODE (plus_oprnd0) != SSA_NAME) > return NULL; > > - tree abs_type = half_type; > stmt_vec_info abs_stmt_vinfo = vect_get_internal_def (vinfo, plus_oprnd0); > if (!abs_stmt_vinfo) > return NULL; > --- 929,951 ---- > > tree sum_type = gimple_expr_type (last_stmt); > > ! /* Any non-truncating sequence of conversions is OK here, since > ! with a successful match, the result of the ABS(U) is known to fit > ! within the nonnegative range of the result type. (It cannot be the > ! negative of the minimum signed value due to the range of the widening > ! MINUS_EXPR.) */ > ! vect_unpromoted_value unprom_abs; > ! plus_oprnd0 = vect_look_through_possible_promotion (vinfo, plus_oprnd0, > ! &unprom_abs); > > /* So far so good. Since last_stmt was detected as a (summation) > reduction, > we know that plus_oprnd1 is the reduction variable (defined by a > loop-header > phi), and plus_oprnd0 is an ssa-name defined by a stmt in the loop > body. > Then check that plus_oprnd0 is defined by an abs_expr. */ > > ! if (!plus_oprnd0) > return NULL; > > stmt_vec_info abs_stmt_vinfo = vect_get_internal_def (vinfo, plus_oprnd0); > if (!abs_stmt_vinfo) > return NULL; > *************** vect_recog_sad_pattern (vec<gimple *> *s > *** 581,914 **** > return NULL; > > tree abs_oprnd = gimple_assign_rhs1 (abs_stmt); > if (TYPE_UNSIGNED (abs_type)) > return NULL; > > ! /* We then detect if the operand of abs_expr is defined by a minus_expr. > */ > ! > ! if (TREE_CODE (abs_oprnd) != SSA_NAME) > return NULL; > > stmt_vec_info diff_stmt_vinfo = vect_get_internal_def (vinfo, abs_oprnd); > if (!diff_stmt_vinfo) > return NULL; > > /* FORNOW. Can continue analyzing the def-use chain when this stmt in a > phi > inside the loop (in case we are analyzing an outer-loop). */ > ! gassign *diff_stmt = dyn_cast <gassign *> (diff_stmt_vinfo->stmt); > ! if (!diff_stmt || gimple_assign_rhs_code (diff_stmt) != MINUS_EXPR) > ! return NULL; > ! > ! tree half_type0, half_type1; > ! > ! tree minus_oprnd0 = gimple_assign_rhs1 (diff_stmt); > ! 
tree minus_oprnd1 = gimple_assign_rhs2 (diff_stmt); > ! > ! if (!type_conversion_p (minus_oprnd0, diff_stmt, false, > ! &half_type0, &def_stmt, &promotion) > ! || !promotion) > ! return NULL; > ! sad_oprnd0 = gimple_assign_rhs1 (def_stmt); > ! > ! if (!type_conversion_p (minus_oprnd1, diff_stmt, false, > ! &half_type1, &def_stmt, &promotion) > ! || !promotion) > ! return NULL; > ! sad_oprnd1 = gimple_assign_rhs1 (def_stmt); > ! > ! if (!types_compatible_p (half_type0, half_type1)) > ! return NULL; > ! if (TYPE_PRECISION (abs_type) < TYPE_PRECISION (half_type0) * 2 > ! || TYPE_PRECISION (sum_type) < TYPE_PRECISION (half_type0) * 2) > return NULL; > > vect_pattern_detected ("vect_recog_sad_pattern", last_stmt); > > ! if (!vect_supportable_direct_optab_p (sum_type, SAD_EXPR, > ! TREE_TYPE (sad_oprnd0), type_out)) > return NULL; > > tree var = vect_recog_temp_ssa_var (sum_type, NULL); > ! gimple *pattern_stmt = gimple_build_assign (var, SAD_EXPR, sad_oprnd0, > ! sad_oprnd1, plus_oprnd1); > > return pattern_stmt; > } > > > ! /* Handle widening operation by a constant. At the moment we support > MULT_EXPR > ! and LSHIFT_EXPR. > ! > ! For MULT_EXPR we check that CONST_OPRND fits HALF_TYPE, and for > LSHIFT_EXPR > ! we check that CONST_OPRND is less or equal to the size of HALF_TYPE. > ! > ! Otherwise, if the type of the result (TYPE) is at least 4 times bigger > than > ! HALF_TYPE, and there is an intermediate type (2 times smaller than TYPE) > ! that satisfies the above restrictions, we can perform a widening > opeartion > ! from the intermediate type to TYPE and replace a_T = (TYPE) a_t; > ! with a_it = (interm_type) a_t; Store such operation in *WSTMT. */ > > ! static bool > ! vect_handle_widen_op_by_const (gimple *stmt, enum tree_code code, > ! tree const_oprnd, tree *oprnd, > ! gimple **wstmt, tree type, > ! tree *half_type, gimple *def_stmt) > ! { > ! tree new_type, new_oprnd; > > ! if (code != MULT_EXPR && code != LSHIFT_EXPR) > ! return false; > ! > ! if (((code == MULT_EXPR && int_fits_type_p (const_oprnd, *half_type)) > ! || (code == LSHIFT_EXPR > ! && compare_tree_int (const_oprnd, TYPE_PRECISION (*half_type)) > ! != 1)) > ! && TYPE_PRECISION (type) == (TYPE_PRECISION (*half_type) * 2)) > ! { > ! /* CONST_OPRND is a constant of HALF_TYPE. */ > ! *oprnd = gimple_assign_rhs1 (def_stmt); > ! return true; > ! } > ! > ! if (TYPE_PRECISION (type) < (TYPE_PRECISION (*half_type) * 4)) > ! return false; > ! > ! if (!vect_same_loop_or_bb_p (stmt, def_stmt)) > ! return false; > ! > ! /* TYPE is 4 times bigger than HALF_TYPE, try widening operation for > ! a type 2 times bigger than HALF_TYPE. */ > ! new_type = build_nonstandard_integer_type (TYPE_PRECISION (type) / 2, > ! TYPE_UNSIGNED (type)); > ! if ((code == MULT_EXPR && !int_fits_type_p (const_oprnd, new_type)) > ! || (code == LSHIFT_EXPR > ! && compare_tree_int (const_oprnd, TYPE_PRECISION (new_type)) == > 1)) > ! return false; > ! > ! /* Use NEW_TYPE for widening operation and create a_T = (NEW_TYPE) a_t; > */ > ! *oprnd = gimple_assign_rhs1 (def_stmt); > ! new_oprnd = make_ssa_name (new_type); > ! *wstmt = gimple_build_assign (new_oprnd, NOP_EXPR, *oprnd); > ! *oprnd = new_oprnd; > ! > ! *half_type = new_type; > ! return true; > ! } > ! > ! > ! /* Function vect_recog_widen_mult_pattern > ! > ! Try to find the following pattern: > ! > ! type1 a_t; > ! type2 b_t; > ! TYPE a_T, b_T, prod_T; > ! > ! S1 a_t = ; > ! S2 b_t = ; > ! S3 a_T = (TYPE) a_t; > ! S4 b_T = (TYPE) b_t; > ! S5 prod_T = a_T * b_T; > ! > ! 
where type 'TYPE' is at least double the size of type 'type1' and > 'type2'. > ! > ! Also detect unsigned cases: > ! > ! unsigned type1 a_t; > ! unsigned type2 b_t; > ! unsigned TYPE u_prod_T; > ! TYPE a_T, b_T, prod_T; > ! > ! S1 a_t = ; > ! S2 b_t = ; > ! S3 a_T = (TYPE) a_t; > ! S4 b_T = (TYPE) b_t; > ! S5 prod_T = a_T * b_T; > ! S6 u_prod_T = (unsigned TYPE) prod_T; > ! > ! and multiplication by constants: > ! > ! type a_t; > ! TYPE a_T, prod_T; > ! > ! S1 a_t = ; > ! S3 a_T = (TYPE) a_t; > ! S5 prod_T = a_T * CONST; > ! > ! A special case of multiplication by constants is when 'TYPE' is 4 times > ! bigger than 'type', but CONST fits an intermediate type 2 times smaller > ! than 'TYPE'. In that case we create an additional pattern stmt for S3 > ! to create a variable of the intermediate type, and perform widen-mult > ! on the intermediate type as well: > ! > ! type a_t; > ! interm_type a_it; > ! TYPE a_T, prod_T, prod_T'; > ! > ! S1 a_t = ; > ! S3 a_T = (TYPE) a_t; > ! '--> a_it = (interm_type) a_t; > ! S5 prod_T = a_T * CONST; > ! '--> prod_T' = a_it w* CONST; > ! > ! Input/Output: > ! > ! * STMTS: Contains a stmt from which the pattern search begins. In the > ! example, when this function is called with S5, the pattern > {S3,S4,S5,(S6)} > ! is detected. In case of unsigned widen-mult, the original stmt (S5) is > ! replaced with S6 in STMTS. In case of multiplication by a constant > ! of an intermediate type (the last case above), STMTS also contains S3 > ! (inserted before S5). > ! > ! Output: > ! > ! * TYPE_OUT: The type of the output of this pattern. > ! > ! * Return value: A new stmt that will be used to replace the sequence of > ! stmts that constitute the pattern. In this case it will be: > ! WIDEN_MULT <a_t, b_t> > ! If the result of WIDEN_MULT needs to be converted to a larger type, the > ! returned stmt will be this type conversion stmt. > ! */ > > static gimple * > ! vect_recog_widen_mult_pattern (vec<gimple *> *stmts, tree *type_out) > { > gimple *last_stmt = stmts->pop (); > ! gimple *def_stmt0, *def_stmt1; > ! tree oprnd0, oprnd1; > ! tree type, half_type0, half_type1; > ! gimple *new_stmt = NULL, *pattern_stmt = NULL; > ! tree vectype, vecitype; > ! tree var; > ! enum tree_code dummy_code; > ! int dummy_int; > ! vec<tree> dummy_vec; > ! bool op1_ok; > ! bool promotion; > ! > ! if (!is_gimple_assign (last_stmt)) > ! return NULL; > ! > ! type = gimple_expr_type (last_stmt); > > ! /* Starting from LAST_STMT, follow the defs of its uses in search > ! of the above pattern. */ > ! > ! if (gimple_assign_rhs_code (last_stmt) != MULT_EXPR) > return NULL; > > ! oprnd0 = gimple_assign_rhs1 (last_stmt); > ! oprnd1 = gimple_assign_rhs2 (last_stmt); > ! > ! /* Check argument 0. */ > ! if (!type_conversion_p (oprnd0, last_stmt, false, &half_type0, &def_stmt0, > ! &promotion) > ! || !promotion) > ! return NULL; > ! /* Check argument 1. */ > ! op1_ok = type_conversion_p (oprnd1, last_stmt, false, &half_type1, > ! &def_stmt1, &promotion); > ! > ! if (op1_ok && promotion) > ! { > ! oprnd0 = gimple_assign_rhs1 (def_stmt0); > ! oprnd1 = gimple_assign_rhs1 (def_stmt1); > ! } > ! else > ! { > ! if (TREE_CODE (oprnd1) == INTEGER_CST > ! && TREE_CODE (half_type0) == INTEGER_TYPE > ! && vect_handle_widen_op_by_const (last_stmt, MULT_EXPR, oprnd1, > ! &oprnd0, &new_stmt, type, > ! &half_type0, def_stmt0)) > ! { > ! half_type1 = half_type0; > ! oprnd1 = fold_convert (half_type1, oprnd1); > ! } > ! else > ! return NULL; > ! } > ! > ! 
/* If the two arguments have different sizes, convert the one with > ! the smaller type into the larger type. */ > ! if (TYPE_PRECISION (half_type0) != TYPE_PRECISION (half_type1)) > ! { > ! /* If we already used up the single-stmt slot give up. */ > ! if (new_stmt) > ! return NULL; > ! > ! tree* oprnd = NULL; > ! gimple *def_stmt = NULL; > ! > ! if (TYPE_PRECISION (half_type0) < TYPE_PRECISION (half_type1)) > ! { > ! def_stmt = def_stmt0; > ! half_type0 = half_type1; > ! oprnd = &oprnd0; > ! } > ! else > ! { > ! def_stmt = def_stmt1; > ! half_type1 = half_type0; > ! oprnd = &oprnd1; > ! } > ! > ! tree old_oprnd = gimple_assign_rhs1 (def_stmt); > ! tree new_oprnd = make_ssa_name (half_type0); > ! new_stmt = gimple_build_assign (new_oprnd, NOP_EXPR, old_oprnd); > ! *oprnd = new_oprnd; > ! } > ! > ! /* Handle unsigned case. Look for > ! S6 u_prod_T = (unsigned TYPE) prod_T; > ! Use unsigned TYPE as the type for WIDEN_MULT_EXPR. */ > ! if (TYPE_UNSIGNED (type) != TYPE_UNSIGNED (half_type0)) > ! { > ! gimple *use_stmt; > ! tree use_lhs; > ! tree use_type; > ! > ! if (TYPE_UNSIGNED (type) == TYPE_UNSIGNED (half_type1)) > ! return NULL; > ! > ! use_stmt = vect_single_imm_use (last_stmt); > ! if (!use_stmt || !is_gimple_assign (use_stmt) > ! || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (use_stmt))) > ! return NULL; > ! > ! use_lhs = gimple_assign_lhs (use_stmt); > ! use_type = TREE_TYPE (use_lhs); > ! if (!INTEGRAL_TYPE_P (use_type) > ! || (TYPE_UNSIGNED (type) == TYPE_UNSIGNED (use_type)) > ! || (TYPE_PRECISION (type) != TYPE_PRECISION (use_type))) > ! return NULL; > ! > ! type = use_type; > ! last_stmt = use_stmt; > ! } > ! > ! if (!types_compatible_p (half_type0, half_type1)) > ! return NULL; > > ! /* If TYPE is more than twice larger than HALF_TYPE, we use WIDEN_MULT > ! to get an intermediate result of type ITYPE. In this case we need > ! to build a statement to convert this intermediate result to type TYPE. > */ > tree itype = type; > ! if (TYPE_PRECISION (type) > TYPE_PRECISION (half_type0) * 2) > ! itype = build_nonstandard_integer_type > ! (GET_MODE_BITSIZE (SCALAR_TYPE_MODE (half_type0)) * 2, > ! TYPE_UNSIGNED (type)); > ! > ! /* Pattern detected. */ > ! vect_pattern_detected ("vect_recog_widen_mult_pattern", last_stmt); > > /* Check target support */ > ! vectype = get_vectype_for_scalar_type (half_type0); > ! vecitype = get_vectype_for_scalar_type (itype); > if (!vectype > || !vecitype > ! || !supportable_widening_operation (WIDEN_MULT_EXPR, last_stmt, > vecitype, vectype, > &dummy_code, &dummy_code, > &dummy_int, &dummy_vec)) > --- 959,1072 ---- > return NULL; > > tree abs_oprnd = gimple_assign_rhs1 (abs_stmt); > + tree abs_type = TREE_TYPE (abs_oprnd); > if (TYPE_UNSIGNED (abs_type)) > return NULL; > > ! /* Peel off conversions from the ABS input. This can involve sign > ! changes (e.g. from an unsigned subtraction to a signed ABS input) > ! or signed promotion, but it can't include unsigned promotion. > ! (Note that ABS of an unsigned promotion should have been folded > ! away before now anyway.) */ > ! vect_unpromoted_value unprom_diff; > ! abs_oprnd = vect_look_through_possible_promotion (vinfo, abs_oprnd, > ! &unprom_diff); > ! if (!abs_oprnd) > ! return NULL; > ! if (TYPE_PRECISION (unprom_diff.type) != TYPE_PRECISION (abs_type) > ! && TYPE_UNSIGNED (unprom_diff.type)) > return NULL; > > + /* We then detect if the operand of abs_expr is defined by a minus_expr. 
> */ > stmt_vec_info diff_stmt_vinfo = vect_get_internal_def (vinfo, abs_oprnd); > if (!diff_stmt_vinfo) > return NULL; > > /* FORNOW. Can continue analyzing the def-use chain when this stmt in a > phi > inside the loop (in case we are analyzing an outer-loop). */ > ! vect_unpromoted_value unprom[2]; > ! if (!vect_widened_op_tree (diff_stmt_vinfo, MINUS_EXPR, MINUS_EXPR, > ! false, 2, unprom, &half_type)) > return NULL; > > vect_pattern_detected ("vect_recog_sad_pattern", last_stmt); > > ! tree half_vectype; > ! if (!vect_supportable_direct_optab_p (sum_type, SAD_EXPR, half_type, > ! type_out, &half_vectype)) > return NULL; > > + /* Get the inputs to the SAD_EXPR in the appropriate types. */ > + STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL; > + tree sad_oprnd[2]; > + vect_convert_inputs (stmt_vinfo, 2, sad_oprnd, half_type, > + unprom, half_vectype); > + > tree var = vect_recog_temp_ssa_var (sum_type, NULL); > ! gimple *pattern_stmt = gimple_build_assign (var, SAD_EXPR, sad_oprnd[0], > ! sad_oprnd[1], plus_oprnd1); > > return pattern_stmt; > } > > + /* Recognize an operation that performs ORIG_CODE on widened inputs, > + so that it can be treated as though it had the form: > > ! A_TYPE a; > ! B_TYPE b; > ! HALF_TYPE a_cast = (HALF_TYPE) a; // possible no-op > ! HALF_TYPE b_cast = (HALF_TYPE) b; // possible no-op > ! | RES_TYPE a_extend = (RES_TYPE) a_cast; // promotion from HALF_TYPE > ! | RES_TYPE b_extend = (RES_TYPE) b_cast; // promotion from HALF_TYPE > ! | RES_TYPE res = a_extend ORIG_CODE b_extend; > ! > ! Try to replace the pattern with: > ! > ! A_TYPE a; > ! B_TYPE b; > ! HALF_TYPE a_cast = (HALF_TYPE) a; // possible no-op > ! HALF_TYPE b_cast = (HALF_TYPE) b; // possible no-op > ! | EXT_TYPE ext = a_cast WIDE_CODE b_cast; > ! | RES_TYPE res = (EXT_TYPE) ext; // possible no-op > > ! where EXT_TYPE is wider than HALF_TYPE but has the same signedness. > > ! SHIFT_P is true if ORIG_CODE and WIDE_CODE are shifts. NAME is the > ! name of the pattern being matched, for dump purposes. */ > > static gimple * > ! vect_recog_widen_op_pattern (vec<gimple *> *stmts, tree *type_out, > ! tree_code orig_code, tree_code wide_code, > ! bool shift_p, const char *name) > { > gimple *last_stmt = stmts->pop (); > ! stmt_vec_info last_stmt_info = vinfo_for_stmt (last_stmt); > > ! vect_unpromoted_value unprom[2]; > ! tree half_type; > ! if (!vect_widened_op_tree (last_stmt_info, orig_code, orig_code, > ! shift_p, 2, unprom, &half_type)) > return NULL; > > ! /* Pattern detected. */ > ! vect_pattern_detected (name, last_stmt); > > ! tree type = gimple_expr_type (last_stmt); > tree itype = type; > ! if (TYPE_PRECISION (type) != TYPE_PRECISION (half_type) * 2 > ! || TYPE_UNSIGNED (type) != TYPE_UNSIGNED (half_type)) > ! itype = build_nonstandard_integer_type (TYPE_PRECISION (half_type) * 2, > ! TYPE_UNSIGNED (half_type)); > > /* Check target support */ > ! tree vectype = get_vectype_for_scalar_type (half_type); > ! tree vecitype = get_vectype_for_scalar_type (itype); > ! enum tree_code dummy_code; > ! int dummy_int; > ! auto_vec<tree> dummy_vec; > if (!vectype > || !vecitype > ! || !supportable_widening_operation (wide_code, last_stmt, > vecitype, vectype, > &dummy_code, &dummy_code, > &dummy_int, &dummy_vec)) > *************** vect_recog_widen_mult_pattern (vec<gimpl > *** 918,960 **** > if (!*type_out) > return NULL; > > ! /* Pattern supported. Create a stmt to be used to replace the pattern: */ > ! var = vect_recog_temp_ssa_var (itype, NULL); > ! 
pattern_stmt = gimple_build_assign (var, WIDEN_MULT_EXPR, oprnd0, oprnd1); > ! > ! stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); > ! STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL; > ! > ! /* If the original two operands have different sizes, we may need to > convert > ! the smaller one into the larget type. If this is the case, at this > point > ! the new stmt is already built. */ > ! if (new_stmt) > ! { > ! append_pattern_def_seq (stmt_vinfo, new_stmt); > ! stmt_vec_info new_stmt_info > ! = new_stmt_vec_info (new_stmt, stmt_vinfo->vinfo); > ! set_vinfo_for_stmt (new_stmt, new_stmt_info); > ! STMT_VINFO_VECTYPE (new_stmt_info) = vectype; > ! } > ! > ! /* If ITYPE is not TYPE, we need to build a type convertion stmt to > convert > ! the result of the widen-mult operation into type TYPE. */ > ! if (itype != type) > ! { > ! append_pattern_def_seq (stmt_vinfo, pattern_stmt); > ! stmt_vec_info pattern_stmt_info > ! = new_stmt_vec_info (pattern_stmt, stmt_vinfo->vinfo); > ! set_vinfo_for_stmt (pattern_stmt, pattern_stmt_info); > ! STMT_VINFO_VECTYPE (pattern_stmt_info) = vecitype; > ! pattern_stmt = gimple_build_assign (vect_recog_temp_ssa_var (type, > NULL), > ! NOP_EXPR, > ! gimple_assign_lhs (pattern_stmt)); > ! } > > stmts->safe_push (last_stmt); > ! return pattern_stmt; > } > > > /* Function vect_recog_pow_pattern > > --- 1076,1103 ---- > if (!*type_out) > return NULL; > > ! STMT_VINFO_PATTERN_DEF_SEQ (last_stmt_info) = NULL; > ! tree oprnd[2]; > ! vect_convert_inputs (last_stmt_info, 2, oprnd, half_type, unprom, > vectype); > ! > ! tree var = vect_recog_temp_ssa_var (itype, NULL); > ! gimple *pattern_stmt = gimple_build_assign (var, wide_code, > ! oprnd[0], oprnd[1]); > > stmts->safe_push (last_stmt); > ! return vect_convert_output (last_stmt_info, type, pattern_stmt, vecitype); > } > > + /* Try to detect multiplication on widened inputs, converting MULT_EXPR > + to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */ > + > + static gimple * > + vect_recog_widen_mult_pattern (vec<gimple *> *stmts, tree *type_out) > + { > + return vect_recog_widen_op_pattern (stmts, type_out, MULT_EXPR, > + WIDEN_MULT_EXPR, false, > + "vect_recog_widen_mult_pattern"); > + } > > /* Function vect_recog_pow_pattern > > *************** vect_recog_pow_pattern (vec<gimple *> *s > *** 1147,1159 **** > static gimple * > vect_recog_widen_sum_pattern (vec<gimple *> *stmts, tree *type_out) > { > ! gimple *stmt, *last_stmt = (*stmts)[0]; > tree oprnd0, oprnd1; > stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); > ! tree type, half_type; > gimple *pattern_stmt; > tree var; > - bool promotion; > > /* Look for the following pattern > DX = (TYPE) X; > --- 1290,1302 ---- > static gimple * > vect_recog_widen_sum_pattern (vec<gimple *> *stmts, tree *type_out) > { > ! gimple *last_stmt = (*stmts)[0]; > tree oprnd0, oprnd1; > stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); > ! vec_info *vinfo = stmt_vinfo->vinfo; > ! tree type; > gimple *pattern_stmt; > tree var; > > /* Look for the following pattern > DX = (TYPE) X; > *************** vect_recog_widen_sum_pattern (vec<gimple > *** 1177,1197 **** > Left to check that oprnd0 is defined by a cast from type 'type' to type > 'TYPE'. */ > > ! if (!type_conversion_p (oprnd0, last_stmt, true, &half_type, &stmt, > ! &promotion) > ! || !promotion) > return NULL; > > - oprnd0 = gimple_assign_rhs1 (stmt); > - > vect_pattern_detected ("vect_recog_widen_sum_pattern", last_stmt); > > ! 
if (!vect_supportable_direct_optab_p (type, WIDEN_SUM_EXPR, half_type, > type_out)) > return NULL; > > var = vect_recog_temp_ssa_var (type, NULL); > ! pattern_stmt = gimple_build_assign (var, WIDEN_SUM_EXPR, oprnd0, oprnd1); > > return pattern_stmt; > } > --- 1320,1338 ---- > Left to check that oprnd0 is defined by a cast from type 'type' to type > 'TYPE'. */ > > ! vect_unpromoted_value unprom0; > ! if (!vect_look_through_possible_promotion (vinfo, oprnd0, &unprom0) > ! || TYPE_PRECISION (unprom0.type) * 2 > TYPE_PRECISION (type)) > return NULL; > > vect_pattern_detected ("vect_recog_widen_sum_pattern", last_stmt); > > ! if (!vect_supportable_direct_optab_p (type, WIDEN_SUM_EXPR, unprom0.type, > type_out)) > return NULL; > > var = vect_recog_temp_ssa_var (type, NULL); > ! pattern_stmt = gimple_build_assign (var, WIDEN_SUM_EXPR, unprom0.op, > oprnd1); > > return pattern_stmt; > } > *************** vect_recog_over_widening_pattern (vec<gi > *** 1556,1714 **** > return pattern_stmt; > } > > ! /* Detect widening shift pattern: > ! > ! type a_t; > ! TYPE a_T, res_T; > ! > ! S1 a_t = ; > ! S2 a_T = (TYPE) a_t; > ! S3 res_T = a_T << CONST; > ! > ! where type 'TYPE' is at least double the size of type 'type'. > ! > ! Also detect cases where the shift result is immediately converted > ! to another type 'result_type' that is no larger in size than 'TYPE'. > ! In those cases we perform a widen-shift that directly results in > ! 'result_type', to avoid a possible over-widening situation: > ! > ! type a_t; > ! TYPE a_T, res_T; > ! result_type res_result; > ! > ! S1 a_t = ; > ! S2 a_T = (TYPE) a_t; > ! S3 res_T = a_T << CONST; > ! S4 res_result = (result_type) res_T; > ! '--> res_result' = a_t w<< CONST; > ! > ! And a case when 'TYPE' is 4 times bigger than 'type'. In that case we > ! create an additional pattern stmt for S2 to create a variable of an > ! intermediate type, and perform widen-shift on the intermediate type: > ! > ! type a_t; > ! interm_type a_it; > ! TYPE a_T, res_T, res_T'; > ! > ! S1 a_t = ; > ! S2 a_T = (TYPE) a_t; > ! '--> a_it = (interm_type) a_t; > ! S3 res_T = a_T << CONST; > ! '--> res_T' = a_it <<* CONST; > ! > ! Input/Output: > ! > ! * STMTS: Contains a stmt from which the pattern search begins. > ! In case of unsigned widen-shift, the original stmt (S3) is replaced > with S4 > ! in STMTS. When an intermediate type is used and a pattern statement is > ! created for S2, we also put S2 here (before S3). > ! > ! Output: > ! > ! * TYPE_OUT: The type of the output of this pattern. > ! > ! * Return value: A new stmt that will be used to replace the sequence of > ! stmts that constitute the pattern. In this case it will be: > ! WIDEN_LSHIFT_EXPR <a_t, CONST>. */ > > static gimple * > vect_recog_widen_shift_pattern (vec<gimple *> *stmts, tree *type_out) > { > ! gimple *last_stmt = stmts->pop (); > ! gimple *def_stmt0; > ! tree oprnd0, oprnd1; > ! tree type, half_type0; > ! gimple *pattern_stmt; > ! tree vectype, vectype_out = NULL_TREE; > ! tree var; > ! enum tree_code dummy_code; > ! int dummy_int; > ! vec<tree> dummy_vec; > ! gimple *use_stmt; > ! bool promotion; > ! > ! if (!is_gimple_assign (last_stmt) || !vinfo_for_stmt (last_stmt)) > ! return NULL; > ! > ! if (gimple_assign_rhs_code (last_stmt) != LSHIFT_EXPR) > ! return NULL; > ! > ! oprnd0 = gimple_assign_rhs1 (last_stmt); > ! oprnd1 = gimple_assign_rhs2 (last_stmt); > ! if (TREE_CODE (oprnd0) != SSA_NAME || TREE_CODE (oprnd1) != INTEGER_CST) > ! return NULL; > ! > ! /* Check operand 0: it has to be defined by a type promotion. 
*/ > ! if (!type_conversion_p (oprnd0, last_stmt, false, &half_type0, &def_stmt0, > ! &promotion) > ! || !promotion) > ! return NULL; > ! > ! /* Check operand 1: has to be positive. We check that it fits the type > ! in vect_handle_widen_op_by_const (). */ > ! if (tree_int_cst_compare (oprnd1, size_zero_node) <= 0) > ! return NULL; > ! > ! oprnd0 = gimple_assign_rhs1 (def_stmt0); > ! type = gimple_expr_type (last_stmt); > ! > ! /* Check for subsequent conversion to another type. */ > ! use_stmt = vect_single_imm_use (last_stmt); > ! if (use_stmt && is_gimple_assign (use_stmt) > ! && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (use_stmt)) > ! && !STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (use_stmt))) > ! { > ! tree use_lhs = gimple_assign_lhs (use_stmt); > ! tree use_type = TREE_TYPE (use_lhs); > ! > ! if (INTEGRAL_TYPE_P (use_type) > ! && TYPE_PRECISION (use_type) <= TYPE_PRECISION (type)) > ! { > ! last_stmt = use_stmt; > ! type = use_type; > ! } > ! } > ! > ! /* Check if this a widening operation. */ > ! gimple *wstmt = NULL; > ! if (!vect_handle_widen_op_by_const (last_stmt, LSHIFT_EXPR, oprnd1, > ! &oprnd0, &wstmt, > ! type, &half_type0, def_stmt0)) > ! return NULL; > ! > ! /* Pattern detected. */ > ! vect_pattern_detected ("vect_recog_widen_shift_pattern", last_stmt); > ! > ! /* Check target support. */ > ! vectype = get_vectype_for_scalar_type (half_type0); > ! vectype_out = get_vectype_for_scalar_type (type); > ! > ! if (!vectype > ! || !vectype_out > ! || !supportable_widening_operation (WIDEN_LSHIFT_EXPR, last_stmt, > ! vectype_out, vectype, > ! &dummy_code, &dummy_code, > ! &dummy_int, &dummy_vec)) > ! return NULL; > ! > ! *type_out = vectype_out; > ! > ! /* Pattern supported. Create a stmt to be used to replace the pattern. > */ > ! var = vect_recog_temp_ssa_var (type, NULL); > ! pattern_stmt > ! = gimple_build_assign (var, WIDEN_LSHIFT_EXPR, oprnd0, oprnd1); > ! if (wstmt) > ! { > ! stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); > ! new_pattern_def_seq (stmt_vinfo, wstmt); > ! stmt_vec_info new_stmt_info > ! = new_stmt_vec_info (wstmt, stmt_vinfo->vinfo); > ! set_vinfo_for_stmt (wstmt, new_stmt_info); > ! STMT_VINFO_VECTYPE (new_stmt_info) = vectype; > ! } > ! > ! stmts->safe_push (last_stmt); > ! return pattern_stmt; > } > > /* Detect a rotate pattern wouldn't be otherwise vectorized: > --- 1697,1711 ---- > return pattern_stmt; > } > > ! /* Try to detect a shift left of a widened input, converting LSHIFT_EXPR > ! to WIDEN_LSHIFT_EXPR. See vect_recog_widen_op_pattern for details. */ > > static gimple * > vect_recog_widen_shift_pattern (vec<gimple *> *stmts, tree *type_out) > { > ! return vect_recog_widen_op_pattern (stmts, type_out, LSHIFT_EXPR, > ! WIDEN_LSHIFT_EXPR, true, > ! "vect_recog_widen_shift_pattern"); > } > > /* Detect a rotate pattern wouldn't be otherwise vectorized: > Index: gcc/testsuite/gcc.dg/vect/vect-widen-mult-sum.c > =================================================================== > *** gcc/testsuite/gcc.dg/vect/vect-widen-mult-sum.c 2018-06-29 > 10:15:12.000000000 +0100 > --- gcc/testsuite/gcc.dg/vect/vect-widen-mult-sum.c 2018-06-29 > 10:15:12.553142197 +0100 > *************** int main (void) > *** 43,46 **** > > > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target > vect_widen_mult_hi_to_si } } } */ > !
/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: > detected" 1 "vect" } } */ > --- 43,46 ---- > > > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target > vect_widen_mult_hi_to_si } } } */ > ! /* { dg-final { scan-tree-dump "vect_recog_widen_mult_pattern: detected" > "vect" } } */ > Index: gcc/testsuite/gcc.dg/vect/no-scevccp-outer-6.c > =================================================================== > *** gcc/testsuite/gcc.dg/vect/no-scevccp-outer-6.c 2018-06-29 > 10:15:12.000000000 +0100 > --- gcc/testsuite/gcc.dg/vect/no-scevccp-outer-6.c 2018-06-29 > 10:15:12.553142197 +0100 > *************** int main (void) > *** 52,55 **** > } > > /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { > xfail { unaligned_stack || { vect_no_align && { ! vect_hw_misalign } } } } } > } */ > ! /* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: > detected" 1 "vect" { xfail *-*-* } } } */ > --- 52,56 ---- > } > > /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { > xfail { unaligned_stack || { vect_no_align && { ! vect_hw_misalign } } } } } > } */ > ! /* Exclude widening of i * 4 in the calculation of a[i]. */ > ! /* { dg-final { scan-tree-dump-times {vect_recog_widen_mult_pattern: > detected: [^\n]* \* ([^4]|4[^;])} 1 "vect" { xfail *-*-* } } } */ > Index: gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-1.c > =================================================================== > *** gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-1.c 2018-06-29 > 10:15:12.000000000 +0100 > --- gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-1.c 2018-06-29 > 10:15:12.553142197 +0100 > *************** main (void) > *** 37,41 **** > return 0; > } > > ! /* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 > "vect" { target aarch64*-*-* xfail aarch64_sve } } } */ > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { > target aarch64*-*-* } } } */ > --- 37,41 ---- > return 0; > } > > ! /* { dg-final { scan-tree-dump {vect_recog_mult_pattern: detected:[^\n]* \* > 123} "vect" { target aarch64*-*-* xfail aarch64_sve } } } */ > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { > target aarch64*-*-* } } } */ > Index: gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-2.c > =================================================================== > *** gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-2.c 2018-06-29 > 10:15:12.000000000 +0100 > --- gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-2.c 2018-06-29 > 10:15:12.553142197 +0100 > *************** main (void) > *** 36,40 **** > return 0; > } > > ! /* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 > "vect" { target aarch64*-*-* xfail aarch64_sve } } } */ > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { > target aarch64*-*-* } } } */ > --- 36,40 ---- > return 0; > } > > ! 
/* { dg-final { scan-tree-dump {vect_recog_mult_pattern: detected:[^\n]* \* > -19594} "vect" { target aarch64*-*-* xfail aarch64_sve } } } */ > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { > target aarch64*-*-* } } } */ > Index: gcc/testsuite/gcc.dg/vect/vect-widen-mult-const-s16.c > =================================================================== > *** gcc/testsuite/gcc.dg/vect/vect-widen-mult-const-s16.c 2018-06-29 > 10:15:12.000000000 +0100 > --- gcc/testsuite/gcc.dg/vect/vect-widen-mult-const-s16.c 2018-06-29 > 10:15:12.553142197 +0100 > *************** int main (void) > *** 56,61 **** > } > > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target > vect_widen_mult_hi_to_si } } } */ > ! /* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: > detected" 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */ > ! /* { dg-final { scan-tree-dump-times "widen_mult pattern recognized" 2 > "vect" { target vect_widen_mult_hi_to_si_pattern } } } */ > > --- 56,61 ---- > } > > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target > vect_widen_mult_hi_to_si } } } */ > ! /* { dg-final { scan-tree-dump-times {vect_recog_widen_mult_pattern: > detected:[^\n]* 2333} 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } > } */ > ! /* { dg-final { scan-tree-dump-times {widen_mult pattern recognized:[^\n]* > 2333} 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */ > > Index: gcc/testsuite/gcc.dg/vect/vect-widen-mult-const-u16.c > =================================================================== > *** gcc/testsuite/gcc.dg/vect/vect-widen-mult-const-u16.c 2018-06-29 > 10:15:12.000000000 +0100 > --- gcc/testsuite/gcc.dg/vect/vect-widen-mult-const-u16.c 2018-06-29 > 10:15:12.553142197 +0100 > *************** int main (void) > *** 73,77 **** > } > > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { target > vect_widen_mult_hi_to_si } } } */ > ! /* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: > detected" 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */ > ! /* { dg-final { scan-tree-dump-times "widen_mult pattern recognized" 2 > "vect" { target vect_widen_mult_hi_to_si_pattern } } } */ > --- 73,77 ---- > } > > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { target > vect_widen_mult_hi_to_si } } } */ > ! /* { dg-final { scan-tree-dump-times {vect_recog_widen_mult_pattern: > detected:[^\n]* 2333} 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } > } */ > ! /* { dg-final { scan-tree-dump-times {widen_mult pattern recognized:[^\n]* > = \(int\)} 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */ > Index: gcc/testsuite/gcc.dg/vect/vect-reduc-dot-1.c > =================================================================== > *** /dev/null 2018-06-13 14:36:57.192460992 +0100 > --- gcc/testsuite/gcc.dg/vect/vect-reduc-dot-1.c 2018-06-29 > 10:15:12.553142197 +0100 > *************** > *** 0 **** > --- 1,49 ---- > + /* { dg-require-effective-target vect_int } */ > + > + #include "tree-vect.h" > + > + #define N 50 > + > + #ifndef SIGNEDNESS_1 > + #define SIGNEDNESS_1 unsigned > + #define