On Fri, Jan 22, 2016 at 3:29 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: > Richard, > > I fixed all remarks pointed by you in vectorizer part of patch. Could > you take a look on modified patch.
+ if (is_gimple_call (stmt1)) + lhs = gimple_call_lhs (stmt1); + else + if (gimple_code (stmt1) == GIMPLE_ASSIGN) + lhs = gimple_assign_lhs (stmt1); + else + break; use lhs = gimple_get_lhs (stmt1); if (!lhs) break; you forget to free bbs, I suggest to do it after seeding the worklist. Ok with those changes if the backend changes are approved. Sorry for the delay in reviewing this. Thanks, Richard. > Uros, > > Could you please review i386 part of patch related to support of > conditional branches with vector comparison. > > Bootstrap and regression testing did not show any new failures. > Is it OK for trunk? > > Thanks. > Yuri. > > ChangeLog: > > 2016-01-22 Yuri Rumyantsev <ysrum...@gmail.com> > > PR middle-end/68542 > * config/i386/i386.c (ix86_expand_branch): Add support for conditional > brnach with vector comparison. > * config/i386/sse.md (define_expand "cbranch<mode>4): Add define-expand > for vector comparion with eq/ne only. > (optimize_mask_stores): New function. > * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize > has_mask_store field of vect_info. > * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for > vectorized loops having masked stores after vec_info destroy. > * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and > correspondent macros. > (optimize_mask_stores): Add prototype. > > gcc/testsuite/ChangeLog: > * gcc.dg/vect/vect-mask-store-move-1.c: New test. > * testsuite/gcc.target/i386/avx2-vect-mask-store-move1.c: New test. > > 2016-01-20 15:24 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >> On Mon, Jan 18, 2016 at 3:50 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>> Richard, >>> >>> Here is the second part of patch which really preforms mask stores and >>> all statements related to it to new basic block guarded by test on >>> zero mask. Hew test is also added. >>> >>> Is it OK for trunk? >> >> + /* Pick up all masked stores in loop if any. */ >> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) >> + { >> + stmt = gsi_stmt (gsi); >> >> you fail to iterate over all BBs of the loop here. Please follow >> other uses in the >> vectorizer. >> >> + while (!worklist.is_empty ()) >> + { >> + gimple *last, *last_store; >> + edge e, efalse; >> + tree mask; >> + basic_block store_bb, join_bb; >> + gimple_stmt_iterator gsi_to; >> + /* tree arg3; */ >> >> remove >> >> + tree vdef, new_vdef; >> + gphi *phi; >> + bool first_dump; >> + tree vectype; >> + tree zero; >> + >> + last = worklist.pop (); >> + mask = gimple_call_arg (last, 2); >> + /* Create new bb. */ >> >> bb should be initialized from gimple_bb (last), not loop->header >> >> + e = split_block (bb, last); >> >> + gsi_from = gsi_for_stmt (stmt1); >> + gsi_to = gsi_start_bb (store_bb); >> + gsi_move_before (&gsi_from, &gsi_to); >> + update_stmt (stmt1); >> >> I think the update_stmt is redundant and you should be able to >> keep two gsis throughout the loop, from and to, no? >> >> + /* Put other masked stores with the same mask to STORE_BB. */ >> + if (worklist.is_empty () >> + || gimple_call_arg (worklist.last (), 2) != mask >> + || !is_valid_sink (worklist.last (), last_store)) >> >> as I understand the code the last check is redundant with the invariant >> you track if you verify the stmt you breaked from the inner loop is >> actually equal to worklist.last () and you add a flag to track whether >> you did visit a load (vuse) in the sinking loop you didn't end up sinking. >> >> + /* Issue different messages depending on FIRST_DUMP. */ >> + if (first_dump) >> + { >> + dump_printf_loc (MSG_NOTE, vect_location, >> + "Move MASK_STORE to new bb#%d\n", >> + store_bb->index); >> + first_dump = false; >> + } >> + else >> + dump_printf_loc (MSG_NOTE, vect_location, >> + "Move MASK_STORE to created bb\n"); >> >> just add a separate dump when you create the BB, "Created new bb#%d for ..." >> to avoid this. >> >> Note that I can't comment on the x86 backend part so that will need to >> be reviewed by somebody >> else. >> >> Thanks, >> Richard. >> >>> Thanks. >>> Yuri. >>> >>> 2016-01-18 Yuri Rumyantsev <ysrum...@gmail.com> >>> >>> PR middle-end/68542 >>> * config/i386/i386.c (ix86_expand_branch): Implement integral vector >>> comparison with boolean result. >>> * config/i386/sse.md (define_expand "cbranch<mode>4): Add define-expand >>> for vector comparion with eq/ne only. >>> * tree-vect-loop.c (is_valid_sink): New function. >>> (optimize_mask_stores): Likewise. >>> * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize >>> has_mask_store field of vect_info. >>> * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for >>> vectorized loops having masked stores. >>> * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and >>> correspondent macros. >>> (optimize_mask_stores): Add prototype. >>> >>> gcc/testsuite/ChangeLog: >>> * gcc.dg/vect/vect-mask-store-move-1.c: New test. >>> >>> 2016-01-18 17:07 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>> On Mon, Jan 18, 2016 at 3:02 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>> wrote: >>>>> Thanks Richard. >>>>> >>>>> I changed the check on type as you proposed. >>>>> >>>>> What about the second back-end part of patch (it has been sent 08.12.15). >>>> >>>> Can't see it in my inbox - can you reply to the mail with a ping? >>>> >>>> Thanks, >>>> Richard. >>>> >>>>> Thanks. >>>>> Yuri. >>>>> >>>>> 2016-01-18 15:44 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>>> On Mon, Jan 11, 2016 at 11:06 AM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>> wrote: >>>>>>> Hi Richard, >>>>>>> >>>>>>> Did you have anu chance to look at updated patch? >>>>>> >>>>>> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c >>>>>> index acbb70b..208a752 100644 >>>>>> --- a/gcc/tree-vrp.c >>>>>> +++ b/gcc/tree-vrp.c >>>>>> @@ -5771,6 +5771,10 @@ register_edge_assert_for (tree name, edge e, >>>>>> gimple_stmt_iterator si, >>>>>> &comp_code, &val)) >>>>>> return; >>>>>> >>>>>> + /* VRP doesn't track ranges for vector types. */ >>>>>> + if (TREE_CODE (TREE_TYPE (name)) == VECTOR_TYPE) >>>>>> + return; >>>>>> + >>>>>> >>>>>> please instead fix extract_code_and_val_from_cond_with_ops with >>>>>> >>>>>> Index: gcc/tree-vrp.c >>>>>> =================================================================== >>>>>> --- gcc/tree-vrp.c (revision 232506) >>>>>> +++ gcc/tree-vrp.c (working copy) >>>>>> @@ -5067,8 +5067,9 @@ extract_code_and_val_from_cond_with_ops >>>>>> if (invert) >>>>>> comp_code = invert_tree_comparison (comp_code, 0); >>>>>> >>>>>> - /* VRP does not handle float types. */ >>>>>> - if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (val))) >>>>>> + /* VRP only handles integral and pointer types. */ >>>>>> + if (! INTEGRAL_TYPE_P (TREE_TYPE (val)) >>>>>> + && ! POINTER_TYPE_P (TREE_TYPE (val))) >>>>>> return false; >>>>>> >>>>>> /* Do not register always-false predicates. >>>>>> >>>>>> Ok with that change. >>>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>>> Thanks. >>>>>>> Yuri. >>>>>>> >>>>>>> 2015-12-18 13:20 GMT+03:00 Yuri Rumyantsev <ysrum...@gmail.com>: >>>>>>>> Hi Richard, >>>>>>>> >>>>>>>> Here is updated patch for middle-end part of the whole patch which >>>>>>>> fixes all your remarks I hope. >>>>>>>> >>>>>>>> Regression testing and bootstrapping did not show any new failures. >>>>>>>> Is it OK for trunk? >>>>>>>> >>>>>>>> Yuri. >>>>>>>> >>>>>>>> ChangeLog: >>>>>>>> 2015-12-18 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>> >>>>>>>> PR middle-end/68542 >>>>>>>> * fold-const.c (fold_binary_op_with_conditional_arg): Bail out for case >>>>>>>> of mixind vector and scalar types. >>>>>>>> (fold_relational_const): Add handling of vector >>>>>>>> comparison with boolean result. >>>>>>>> * tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow >>>>>>>> comparison of vector operands with boolean result for EQ/NE only. >>>>>>>> (verify_gimple_assign_binary): Adjust call for >>>>>>>> verify_gimple_comparison. >>>>>>>> (verify_gimple_cond): Likewise. >>>>>>>> * tree-ssa-forwprop.c (combine_cond_expr_cond): Do not perform