https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95199
--- Comment #7 from bin cheng <amker at gcc dot gnu.org> ---
(In reply to rguent...@suse.de from comment #6)
> On Thu, 21 May 2020, zhoukaipeng3 at huawei dot com wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95199
> >
> > --- Comment #4 from Kaipeng Zhou <zhoukaipeng3 at huawei dot com> ---
> > Sorry for not expressing clearly.
> >
> > I have debugged the testcase you provided. Not eliminating them is not
> > caused by IFN. The relevant code is in the "get_computation_aff_1"
> > function.
> >
> > In IVOPTs the IV_STEPs must be checked by function "constant_multiple_of"
> > before using an IV variable to eliminate the other. But if the tree_code of
> > input IV_STEP is SSA_NAME, the function will return false. In your
> > testcase, the tree_code of IV_STEP is MULT_EXPR, so it return true.
> >
> > Gimple for my testcase:
> > <bb 12> [local count: 8589933]:
> > _83 = (sizetype) inc_y_22(D);
> > _84 = _83 * POLY_INT_CST [16, 16];
> > _85 = (long unsigned int) inc_y_22(D);
> > _86 = _85 * 8;
> > _87 = (ssizetype) _86;
> > _88 = _87 /[ex] 8;
> > _89 = (long unsigned int) _88;
> > _90 = VEC_SERIES_EXPR <0, _89>;
> > vect_cst__95 = [vec_duplicate_expr] m_17(D);
> > _97 = (sizetype) inc_x_20(D);
> > _98 = _97 * POLY_INT_CST [16, 16];
> > _99 = (long unsigned int) inc_x_20(D);
> > _100 = _99 * 8;
> > _101 = (ssizetype) _100;
> > _102 = _101 /[ex] 8;
> > _103 = (long unsigned int) _102;
> > _104 = VEC_SERIES_EXPR <0, _103>;
> > _109 = (sizetype) inc_x_20(D);
> > _110 = _109 * POLY_INT_CST [16, 16];
> > _111 = (long unsigned int) inc_x_20(D);
>
> The issue is you have two copies of
> (sizetype) inc_x_20(D) * POLY_INT_CST [16, 16];
> and IVOPTs does not perform CSE. vinfo->ivexpr_map is supposed to
> catch those "IV base and/or step expressions". So look where
> they are inserted and check the CSE map is used. Alternatively
> fixup hashing/comparing to handle POLY_INT_CST [16, 16] if that
> is the reason for the missed CSE.

Yes, it's because cse_and_gimplify_to_preheader is not called for
gather/scatter accesses. It should be easily fixed by the following patch:

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index e7822c44951..ba9ee5c4996 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -2961,6 +2961,7 @@ vect_get_strided_load_store_ops (stmt_vec_info stmt_info,
   tree bump = size_binop (MULT_EXPR,
 			  fold_convert (sizetype, unshare_expr (DR_STEP (dr))),
 			  size_int (TYPE_VECTOR_SUBPARTS (vectype)));
+  bump = cse_and_gimplify_to_preheader (loop_vinfo, bump);
   *dataref_bump = force_gimple_operand (bump, &stmts, true, NULL_TREE);
   if (stmts)
     gsi_insert_seq_on_edge_immediate (loop_preheader_edge (loop), stmts);