Hi Jakub,

Thank you for reviewing the patch.
I have incorporated your comments in the attached patch.

> -----Original Message-----
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, July 29, 2015 1:24 AM
> To: Kumar, Venkataramanan
> Cc: Richard Biener (richard.guent...@gmail.com); gcc-patches@gcc.gnu.org
> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult expr with
> power 2 integer constant.
>
> Hi!
>
> > This is just an initial patch and tries to optimize integer type power
> > 2 constants.  I wanted to get feedback on this.  I bootstrapped and
> > reg tested on aarch64-none-linux-gnu.
>
> Thanks for working on it.
> ChangeLog entry for the patch is missing, probably also some testcases.
>
> > @@ -90,6 +94,7 @@ static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
> >      vect_recog_rotate_pattern,
> >      vect_recog_vector_vector_shift_pattern,
> >      vect_recog_divmod_pattern,
> > +    vect_recog_multconst_pattern,
> >      vect_recog_mixed_size_cond_pattern,
> >      vect_recog_bool_pattern};
>
> Please watch formatting, the other lines are tab indented, so please use a
> tab rather than 8 spaces.
>
> > @@ -2147,6 +2152,87 @@ vect_recog_vector_vector_shift_pattern (vec<gimple> *stmts,
> >    return pattern_stmt;
> >  }
> >
>
> Function comment is missing here.
>
> > +static gimple
> > +vect_recog_multconst_pattern (vec<gimple> *stmts,
> > +                              tree *type_in, tree *type_out)
>
> About the function name, wonder if just vect_recog_mult_pattern wouldn't
> be enough.
>
> > +  rhs_code = gimple_assign_rhs_code (last_stmt);
> > +  switch (rhs_code)
> > +    {
> > +    case MULT_EXPR:
> > +      break;
> > +    default:
> > +      return NULL;
> > +    }
>
> This looks too weird, I'd just do
>   if (gimple_assign_rhs_code (last_stmt) != MULT_EXPR)
>     return NULL;
> (you handle just one pattern).
>
> > +  /* If the target can handle vectorized multiplication natively,
> > +     don't attempt to optimize this.  */
> > +  optab = optab_for_tree_code (rhs_code, vectype, optab_default);
>
> Supposedly you can use MULT_EXPR directly here.
>
> > +  /* If target cannot handle vector left shift then we cannot
> > +     optimize and bail out.  */
> > +  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
> > +  if (!optab
> > +      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
> > +    return NULL;
> > +
> > +  if (integer_pow2p (oprnd1))
> > +    {
> > +      /* Pattern detected.  */
> > +      if (dump_enabled_p ())
> > +        dump_printf_loc (MSG_NOTE, vect_location,
> > +                         "vect_recog_multconst_pattern: detected:\n");
> > +
> > +      tree shift;
> > +      shift = build_int_cst (itype, tree_log2 (oprnd1));
> > +      pattern_stmt = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
> > +                                          LSHIFT_EXPR, oprnd0, shift);
> > +      if (dump_enabled_p ())
> > +        dump_gimple_stmt_loc (MSG_NOTE, vect_location, TDF_SLIM, pattern_stmt,
> > +                              0);
> > +      stmts->safe_push (last_stmt);
> > +      *type_in = vectype;
> > +      *type_out = vectype;
> > +      return pattern_stmt;
> > +    }
>
> Trailing whitespace.
> The integer_pow2p case (have you checked signed multiply by INT_MIN?) is
> only one of the cases you can actually handle, you can look at expand_mult
> for many other cases - e.g. multiplication by negated powers of 2, or call
> choose_mult_variant and handle whatever it returns.

I have added patterns that detect mults with power of 2 constants and
convert them to shifts in this patch.  I will do a follow-up patch for the
other variants handled in "expand_mult" and "choose_mult_variant".

I have not yet checked the INT_MIN case.  I thought of adding a check like
this:

  if (wi::eq_p (oprnd1, HOST_WIDE_INT_MIN))
    return NULL;

But I wanted to confirm with you, since I don't understand this clearly:
is this because Var * INT_MIN != Var << 31?
I tried this case for an "int" or "long int" type array:

void test_vector_shifts ()
{
  for (i = 0; i <= 99; i++)
    arr[i] = arr[i] * INT_MIN;
}

For the "int" type, without my patch aarch64 already vectorizes it using
shifts:

	ldr	q0, [x0]
	shl	v0.4s, v0.4s, 31
	str	q0, [x0], 16

For the "long int" type, without my patch the multiplication is converted
to a shift, but vectorization did not happen:

	ldr	x1, [x0]
	neg	x1, x1, lsl 31
	str	x1, [x0], 8

> 	Jakub

gcc/ChangeLog

2015-08-02  Venkataramanan Kumar  <venkataramanan.ku...@amd.com>

	* tree-vect-patterns.c (vect_recog_mult_pattern): New function
	for vectorizing multiplication patterns.
	* tree-vectorizer.h: Adjust the number of patterns.

gcc/testsuite/ChangeLog

2015-08-02  Venkataramanan Kumar  <venkataramanan.ku...@amd.com>

	* gcc.dg/vect/vect-mult-patterns.c: New test.

Regards,
Venkat.
diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c b/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
new file mode 100644
index 0000000..ad1760c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */
+
+unsigned long int __attribute__ ((aligned (64))) arr[100];
+int i;
+
+void test_vector_shifts ()
+{
+  for (i = 0; i <= 99; i++)
+    arr[i] = arr[i] * 4;
+}
+
+void test_vectorshift_via_mul ()
+{
+  for (i = 0; i <= 99; i++)
+    arr[i] = arr[i] * (-4);
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! vect_int_mult } } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 "vect" { target { ! vect_int_mult } } } } */
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index f034635..5cbb49e 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -76,6 +76,10 @@ static gimple vect_recog_vector_vector_shift_pattern (vec<gimple> *,
 						      tree *, tree *);
 static gimple vect_recog_divmod_pattern (vec<gimple> *,
 					 tree *, tree *);
+
+static gimple vect_recog_mult_pattern (vec<gimple> *,
+				       tree *, tree *);
+
 static gimple vect_recog_mixed_size_cond_pattern (vec<gimple> *,
 						  tree *, tree *);
 static gimple vect_recog_bool_pattern (vec<gimple> *, tree *, tree *);
@@ -90,6 +94,7 @@ static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
 	vect_recog_rotate_pattern,
 	vect_recog_vector_vector_shift_pattern,
 	vect_recog_divmod_pattern,
+	vect_recog_mult_pattern,
 	vect_recog_mixed_size_cond_pattern,
 	vect_recog_bool_pattern};
@@ -2147,6 +2152,140 @@ vect_recog_vector_vector_shift_pattern (vec<gimple> *stmts,
   return pattern_stmt;
 }
 
+/* Detect multiplication by constants which are positive or negative powers
+   of 2, and convert them to shift patterns.
+
+   Mult with a constant that is a positive power of two:
+     type a_t;
+     type b_t;
+     S1: b_t = a_t * n
+
+   or
+
+   Mult with a constant that is a negative power of two:
+     S2: b_t = a_t * -n
+
+   Input/Output:
+
+   STMTS: Contains a stmt from which the pattern search begins,
+   i.e. the mult stmt.  Convert the mult operation to LSHIFT if the
+   constant operand is a power of 2:
+     type a_t, b_t;
+     S1': b_t = a_t << log2 (n)
+
+   Convert the mult operation to LSHIFT followed by a NEGATE if the
+   constant operand is a negative power of 2:
+     type a_t, b_t, res_T;
+     S2': b_t = a_t << log2 (n)
+     S3': res_T = - (b_t)
+
+   Output:
+
+   * TYPE_IN: The type of the input arguments to the pattern.
+
+   * TYPE_OUT: The type of the output of this pattern.
+
+   * Return value: A new stmt that will be used to replace the multiplication
+     S1 or S2 stmt.  */
+
+static gimple
+vect_recog_mult_pattern (vec<gimple> *stmts,
+			 tree *type_in, tree *type_out)
+{
+  gimple last_stmt = stmts->pop ();
+  tree oprnd0, oprnd1, vectype, itype;
+  gimple pattern_stmt, def_stmt;
+  optab optab;
+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+
+  if (!is_gimple_assign (last_stmt))
+    return NULL;
+
+  if (gimple_assign_rhs_code (last_stmt) != MULT_EXPR)
+    return NULL;
+
+  oprnd0 = gimple_assign_rhs1 (last_stmt);
+  oprnd1 = gimple_assign_rhs2 (last_stmt);
+  itype = TREE_TYPE (oprnd0);
+
+  if (TREE_CODE (oprnd0) != SSA_NAME
+      || TREE_CODE (oprnd1) != INTEGER_CST
+      || TREE_CODE (itype) != INTEGER_TYPE
+      || TYPE_PRECISION (itype) != GET_MODE_PRECISION (TYPE_MODE (itype)))
+    return NULL;
+
+  vectype = get_vectype_for_scalar_type (itype);
+  if (vectype == NULL_TREE)
+    return NULL;
+
+  /* If the target can handle vectorized multiplication natively,
+     don't attempt to optimize this.  */
+  optab = optab_for_tree_code (MULT_EXPR, vectype, optab_default);
+  if (optab != unknown_optab)
+    {
+      machine_mode vec_mode = TYPE_MODE (vectype);
+      int icode = (int) optab_handler (optab, vec_mode);
+      if (icode != CODE_FOR_nothing)
+	return NULL;
+    }
+
+  /* If the target cannot handle vector left shift then we cannot
+     optimize and bail out.  */
+  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
+  if (!optab
+      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
+    return NULL;
+
+  /* Handle constant operands that are positive or negative powers of 2.  */
+  if (wi::exact_log2 (oprnd1) != -1
+      || wi::exact_log2 (wi::neg (oprnd1)) != -1)
+    {
+      tree shift;
+
+      if (wi::exact_log2 (oprnd1) != -1)
+	{
+	  shift = build_int_cst (itype, wi::exact_log2 (oprnd1));
+	  pattern_stmt
+	    = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+				   LSHIFT_EXPR, oprnd0, shift);
+	}
+      else
+	{
+	  /* If the target cannot handle vector NEGATE then we cannot
+	     do the optimization.  */
+	  optab = optab_for_tree_code (NEGATE_EXPR, vectype, optab_vector);
+	  if (!optab
+	      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
+	    return NULL;
+
+	  shift = build_int_cst (itype, wi::exact_log2 (wi::neg (oprnd1)));
+	  def_stmt
+	    = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+				   LSHIFT_EXPR, oprnd0, shift);
+	  new_pattern_def_seq (stmt_vinfo, def_stmt);
+	  pattern_stmt
+	    = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+				   NEGATE_EXPR, gimple_assign_lhs (def_stmt));
+	}
+
+      /* Pattern detected.  */
+      if (dump_enabled_p ())
+	{
+	  dump_printf_loc (MSG_NOTE, vect_location,
+			   "vect_recog_mult_pattern: detected:\n");
+	  dump_gimple_stmt_loc (MSG_NOTE, vect_location, TDF_SLIM,
+				pattern_stmt, 0);
+	}
+
+      stmts->safe_push (last_stmt);
+      *type_in = vectype;
+      *type_out = vectype;
+      return pattern_stmt;
+    }
+
+  return NULL;
+}
+
 /* Detect a signed division by a constant that wouldn't be
    otherwise vectorized:
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index dfa8795..b490af4 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1132,7 +1132,7 @@ extern void vect_slp_transform_bb (basic_block);
    Additional pattern recognition functions can (and will) be added
    in the future.  */
 typedef gimple (* vect_recog_func_ptr) (vec<gimple> *, tree *, tree *);
-#define NUM_PATTERNS 12
+#define NUM_PATTERNS 13
 void vect_pattern_recog (loop_vec_info, bb_vec_info);
 
 /* In tree-vectorizer.c.  */