On Wed, Jun 17, 2020 at 10:50 AM Richard Biener <richard.guent...@gmail.com> wrote: > > On Mon, Jun 15, 2020 at 2:20 PM Martin Liška <mli...@suse.cz> wrote: > > > > On 6/15/20 1:59 PM, Richard Biener wrote: > > > On Mon, Jun 15, 2020 at 1:19 PM Martin Liška <mli...@suse.cz> wrote: > > >> > > >> On 6/15/20 9:14 AM, Richard Biener wrote: > > >>> On Fri, Jun 12, 2020 at 3:24 PM Martin Liška <mli...@suse.cz> wrote: > > >>>> > > >>>> On 6/12/20 11:43 AM, Richard Biener wrote: > > >>>>> So ... how far are you with enforcing a split VEC_COND_EXPR? > > >>>>> Thus can we avoid the above completely (even as intermediate > > >>>>> state)? > > >>>> > > >>>> Apparently, I'm quite close. Using the attached patch I see only 2 > > >>>> testsuite > > >>>> failures: > > >>>> > > >>>> FAIL: gcc.dg/tree-ssa/pr68714.c scan-tree-dump-times reassoc1 " <= " 1 > > >>>> FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3 > > >>>> > > >>>> The first one is about teaching reassoc about the SSA_NAMEs in > > >>>> VEC_COND_EXPR. I haven't > > >>>> analyze the second failure. > > >>>> > > >>>> I'm also not sure about the gimlification change, I see a superfluous > > >>>> assignments: > > >>>> vec_cond_cmp.5 = _1 == _2; > > >>>> vec_cond_cmp.6 = vec_cond_cmp.5; > > >>>> vec_cond_cmp.7 = vec_cond_cmp.6; > > >>>> _3 = VEC_COND_EXPR <vec_cond_cmp.7, { -1, -1, -1, -1, -1, -1, -1, > > >>>> -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>; > > >>>> ? > > >>>> > > >>>> So with the suggested patch, the EH should be gone as you suggested. > > >>>> Right? > > >>> > > >>> Right, it should be on the comparison already from the start. > > >>> > > >>> @@ -14221,9 +14221,13 @@ gimplify_expr (tree *expr_p, gimple_seq > > >>> *pre_p, gimple_seq *post_p, > > >>> case VEC_COND_EXPR: > > >>> { > > >>> enum gimplify_status r0, r1, r2; > > >>> - > > >>> r0 = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, > > >>> post_p, is_gimple_condexpr, > > >>> fb_rvalue); > > >>> + tree xop0 = TREE_OPERAND (*expr_p, 0); > > >>> + tmp = create_tmp_var_raw (TREE_TYPE (xop0), "vec_cond_cmp"); > > >>> + gimple_add_tmp_var (tmp); > > >>> + gimplify_assign (tmp, xop0, pre_p); > > >>> + TREE_OPERAND (*expr_p, 0) = tmp; > > >>> r1 = gimplify_expr (&TREE_OPERAND (*expr_p, 1), pre_p, > > >>> post_p, is_gimple_val, fb_rvalue); > > >>> > > >>> all of VEC_COND_EXPR can now be a simple goto expr_3; > > >> > > >> Works for me, thanks! > > >> > > >>> > > >>> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c > > >>> index 494c9e9c20b..090fb52a2f1 100644 > > >>> --- a/gcc/tree-ssa-forwprop.c > > >>> +++ b/gcc/tree-ssa-forwprop.c > > >>> @@ -3136,6 +3136,10 @@ pass_forwprop::execute (function *fun) > > >>> if (code == COND_EXPR > > >>> || code == VEC_COND_EXPR) > > >>> { > > >>> + /* Do not propagate into VEC_COND_EXPRs. */ > > >>> + if (code == VEC_COND_EXPR) > > >>> + break; > > >>> + > > >>> > > >>> err - remove the || code == VEC_COND_EXPR instead? > > >> > > >> Yep. > > >> > > >>> > > >>> @@ -2221,24 +2226,12 @@ expand_vector_operations (void) > > >>> { > > >>> gimple_stmt_iterator gsi; > > >>> basic_block bb; > > >>> - bool cfg_changed = false; > > >>> > > >>> FOR_EACH_BB_FN (bb, cfun) > > >>> - { > > >>> - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > > >>> - { > > >>> - expand_vector_operations_1 (&gsi); > > >>> - /* ??? If we do not cleanup EH then we will ICE in > > >>> - verification. But in reality we have created wrong-code > > >>> - as we did not properly transition EH info and edges to > > >>> - the piecewise computations. */ > > >>> - if (maybe_clean_eh_stmt (gsi_stmt (gsi)) > > >>> - && gimple_purge_dead_eh_edges (bb)) > > >>> - cfg_changed = true; > > >>> - } > > >>> - } > > >>> > > >>> I'm not sure about this. Consider the C++ testcase where > > >>> the ?: is replaced by a division. If veclower needs to replace > > >>> that with four scalrar division statements then the above > > >>> still applies - veclower does not correctly duplicate EH info > > >>> and EH edges to the individual divisions (and we do not know > > >>> which component might trap). > > >>> > > >>> So please leave the above in. You can try if using integer > > >>> division makes it break and add such a testcase if there's > > >>> no coverage for this in the testsuite. > > >> > > >> I'm leaving that above. Can you please explain how can a division > > >> test-case > > >> be created? > > > > > > typedef long v2di __attribute__((vector_size(16))); > > > > > > v2di foo (v2di a, v2di b) > > > { > > > try > > > { > > > v2di res = a / b; > > > return res; > > > } > > > catch (...) > > > { > > > return (v2di){}; > > > } > > > } > > > > > > with -fnon-call-exceptions I see in t.ii.090t.ehdisp (correctly): > > > > > > ;; basic block 2, loop depth 0 > > > ;; pred: ENTRY > > > [LP 1] _6 = a_4(D) / b_5(D); > > > ;; succ: 5 > > > ;; 3 > > > > > > while after t.ii.226t.veclower we have > > > > > > ;; basic block 2, loop depth 0 > > > ;; pred: ENTRY > > > _13 = BIT_FIELD_REF <a_4(D), 64, 0>; > > > _14 = BIT_FIELD_REF <b_5(D), 64, 0>; > > > _15 = _13 / _14; > > > _16 = BIT_FIELD_REF <a_4(D), 64, 64>; > > > _17 = BIT_FIELD_REF <b_5(D), 64, 64>; > > > _18 = _16 / _17; > > > _6 = {_15, _18}; > > > res_7 = _6; > > > _8 = res_7; > > > ;; succ: 3 > > > > > > and all EH is gone and we'd ICE if you remove the above hunk. Hopefully. > > > > Yes, it ICEs then: > > > > > > ./xg++ -B. ~/Programming/testcases/ice.c -c -fnon-call-exceptions -O3 > > /home/marxin/Programming/testcases/ice.c: In function ‘v2di foo(v2di, > > v2di)’: > > /home/marxin/Programming/testcases/ice.c:3:6: error: statement marked for > > throw, but doesn’t > > 3 | v2di foo (v2di a, v2di b) > > | ^~~ > > _6 = {_12, _15}; > > during GIMPLE pass: veclower2 > > /home/marxin/Programming/testcases/ice.c:3:6: internal compiler error: > > verify_gimple failed > > 0x10e308a verify_gimple_in_cfg(function*, bool) > > /home/marxin/Programming/gcc/gcc/tree-cfg.c:5461 > > 0xfc9caf execute_function_todo > > /home/marxin/Programming/gcc/gcc/passes.c:1985 > > 0xfcaafc do_per_function > > /home/marxin/Programming/gcc/gcc/passes.c:1640 > > 0xfcaafc execute_todo > > /home/marxin/Programming/gcc/gcc/passes.c:2039 > > Please submit a full bug report, > > with preprocessed source if appropriate. > > Please include the complete backtrace with any bug report. > > See <https://gcc.gnu.org/bugs/> for instructions. > > > > > > > > We still generate wrong-code obviously as we'd need to duplicate the > > > EH info on each component division (and split blocks and generate > > > extra EH edges). That's a pre-existing bug of course. I just wanted > > > to avoid to create a new instance just because of the early instruction > > > selection for VEC_COND_EXPR. > > > > Fine! > > > > > > > >>> > > >>> What's missing from the patch is adjusting > > >>> verify_gimple_assign_ternary from > > >>> > > >>> if (((rhs_code == VEC_COND_EXPR || rhs_code == COND_EXPR) > > >>> ? !is_gimple_condexpr (rhs1) : !is_gimple_val (rhs1)) > > >>> || !is_gimple_val (rhs2) > > >>> || !is_gimple_val (rhs3)) > > >>> { > > >>> error ("invalid operands in ternary operation"); > > >>> return true; > > >>> > > >>> to the same with the rhs_code == VEC_COND_EXPR case removed. > > >> > > >> Hmm. I'm not sure I've got this comment. Why do we want to change it > > >> and is it done wright in the patch? > > > > > > Ah, I missed the hunk you added. > > > > That explains the confusion I got. > > > > > But the check should be an inclusive > > > one, not an exclusive one and earlier accepting a is_gimple_condexpr > > > is superfluous when you later reject the tcc_comparison part. Just > > > testing is_gimple_val is better. So yes, remove your tree-cfg.c hunk > > > and just adjust the above test. > > > > I simplified that. > > > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Please double-check the changelog > > (do_store_flag): > > + tree-vect-isel.o \ > > IMHO we want to move more of the pattern matching magic of RTL > expansion here to obsolete TER. So please name it gimple-isel.cc > (.cc!, not .c) > > + gassign *assign = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (cond)); > + if (stmt != NULL > + && TREE_CODE_CLASS (gimple_assign_rhs_code (assign)) != tcc_comparison) > + return ERROR_MARK; > > you want stmt == NULL || TREE_CODE_CLASS (...) > > in case the def stmt is a call. > > + gimple_seq seq; > + tree exp = force_gimple_operand (comb, &seq, true, NULL_TREE); > + if (seq) > + { > + gimple_stmt_iterator gsi = gsi_for_stmt (vcond0); > + gsi_insert_before (&gsi, seq, GSI_SAME_STMT); > + } > > use force_gimple_operand_gsi that makes the above simpler. > > if (invert) > - std::swap (*gimple_assign_rhs2_ptr (stmt0), > - *gimple_assign_rhs3_ptr (stmt0)); > - update_stmt (stmt0); > + std::swap (*gimple_assign_rhs2_ptr (vcond0), > + *gimple_assign_rhs3_ptr (vcond0)); > > use swap_ssa_operands. > > + gimple_assign_set_rhs1 (vcond0, exp); > + update_stmt (vcond0); > > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c > index cf2d979fea1..710b17a7c5c 100644 > --- a/gcc/tree-vect-stmts.c > +++ b/gcc/tree-vect-stmts.c > @@ -9937,8 +9937,8 @@ vectorizable_condition (vec_info *vinfo, > { > vec_cond_rhs = vec_oprnds1[i]; > if (bitop1 == NOP_EXPR) > - vec_compare = build2 (cond_code, vec_cmp_type, > - vec_cond_lhs, vec_cond_rhs); > + vec_compare = gimplify_build2 (gsi, cond_code, vec_cmp_type, > + vec_cond_lhs, vec_cond_rhs); > else > { > > please don't introduce more uses of gimplify_buildN - I'd like to > get rid of those. You can use > > gimple_seq stmts = NULL; > vec_compare = gimple_build (&stmts, cond_code, ...); > gsi_insert_seq_before/after (...); > > OK with those changes.
Applying the patch caused Running target unix//-m32 FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3 -g and FAIL: ext/random/simd_fast_mersenne_twister_engine/operators/inequal.cc (test for excess errors) UNRESOLVED: ext/random/simd_fast_mersenne_twister_engine/operators/inequal.cc compilation failed to produce executable Richard. > Thanks, > Richard. > > > > Thanks, > > Martin > > > > > > > >>> > > >>> You'll likely figure the vectorizer still creates some VEC_COND_EXPRs > > >>> with embedded comparisons. > > >> > > >> I've fixed 2 failing test-cases I mentioned in the previous email. > > >> > > >> Martin > > >> > > >>> > > >>> Thanks, > > >>> Richard. > > >>> > > >>> > > >>>> Martin > > >> > >