On Wed, Feb 10, 2016 at 11:26 AM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: > Thanks Richard for your comments. > I changes algorithm to remove dead scalar statements as you proposed. > > Bootstrap and regression testing did not show any new failures on x86-64. > Is it OK for trunk?
Ok. Thanks, Richard. > Changelog: > 2016-02-10 Yuri Rumyantsev <ysrum...@gmail.com> > > PR tree-optimization/69652 > * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 > to nested loop, did source re-formatting, skip debug statements, > add check on statement with volatile operand, remove dead scalar > statements. > > gcc/testsuite/ChangeLog: > * gcc.dg/torture/pr69652.c: New test. > > > 2016-02-09 15:33 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >> On Fri, Feb 5, 2016 at 3:54 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>> Hi All, >>> >>> Here is updated patch - I came back to move call statements also since >>> masked loads are presented by internal call. I also assume that for >>> the following simple loop >>> for (i = 0; i < n; i++) >>> if (b1[i]) >>> a1[i] = sqrtf(a2[i] * a2[i] + a3[i] * a3[i]); >>> motion must be done for all vector statements in semi-hammock including >>> SQRT. >>> >>> Bootstrap and regression testing did not show any new failures. >>> Is it OK for trunk? >> >> The patch is incredibly hard to parse due to the re-indenting. Please >> consider sending >> diffs with -b. >> >> This issue exposes that you are moving (masked) stores across loads without >> checking aliasing. In the specific case those loads are dead and thus >> this is safe >> but in general I thought we were checking that we are using the same VUSE >> during the sinking operation. >> >> Thus, I'd rather have >> >> + /* Check that LHS does not have uses outside of STORE_BB. */ >> + res = true; >> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) >> + { >> + gimple *use_stmt; >> + use_stmt = USE_STMT (use_p); >> + if (is_gimple_debug (use_stmt)) >> + continue; >> + if (gimple_bb (use_stmt) != store_bb) >> + { >> + res = false; >> + break; >> + } >> + } >> >> also check for the dead code case and DCE those stmts here. Like so: >> >> if (has_zero_uses (lhs)) >> { >> gsi_remove (&gsi_from, true); >> continue; >> } >> >> before the above loop. >> >> Richard. >> >>> ChangeLog: >>> >>> 2016-02-05 Yuri Rumyantsev <ysrum...@gmail.com> >>> >>> PR tree-optimization/69652 >>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >>> skipped scalar statements, introduce variable LAST_VUSE to keep >>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >>> begining of current masked store processing, did source re-formatting, >>> skip parsing of debug gimples, stop processing if a gimple with >>> volatile operand has been encountered, save scalar statement >>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >>> iterator, change vuse of all saved scalar statements to LAST_VUSE if >>> it makes sence. >>> >>> gcc/testsuite/ChangeLog: >>> * gcc.dg/torture/pr69652.c: New test. >>> >>> 2016-02-04 19:40 GMT+03:00 Jakub Jelinek <ja...@redhat.com>: >>>> On Thu, Feb 04, 2016 at 05:46:27PM +0300, Yuri Rumyantsev wrote: >>>>> Here is a patch that cures the issues with non-correct vuse for scalar >>>>> statements during code motion, i.e. if vuse of scalar statement is >>>>> vdef of masked store which has been sunk to new basic block, we must >>>>> fix it up. The patch also fixed almost all remarks pointed out by >>>>> Jacub. >>>>> >>>>> Bootstrapping and regression testing on v86-64 did not show any new >>>>> failures. >>>>> Is it OK for trunk? >>>>> >>>>> ChangeLog: >>>>> 2016-02-04 Yuri Rumyantsev <ysrum...@gmail.com> >>>>> >>>>> PR tree-optimization/69652 >>>>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >>>>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >>>>> skipped scalar statements, introduce variable LAST_VUSE that has >>>>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >>>>> begining of current masked store processing, did source re-formatting, >>>>> skip parsing of debug gimples, stop processing when call or gimple >>>>> with volatile operand habe been encountered, save scalar statement >>>>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >>>>> iterator, change vuse of all saved scalar statements to LAST_VUSE if >>>>> it makes sence. >>>>> >>>>> gcc/testsuite/ChangeLog: >>>>> * gcc.dg/torture/pr69652.c: New test. >>>> >>>> Your mailer breaks ChangeLog formatting, so it is hard to check the >>>> formatting of the ChangeLog entry. >>>> >>>> diff --git a/gcc/testsuite/gcc.dg/torture/pr69652.c >>>> b/gcc/testsuite/gcc.dg/torture/pr69652.c >>>> new file mode 100644 >>>> index 0000000..91f30cf >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.dg/torture/pr69652.c >>>> @@ -0,0 +1,14 @@ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ >>>> +/* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ >>>> + >>>> +void fn1(double **matrix, int column, int row, int n) >>>> +{ >>>> + int k; >>>> + for (k = 0; k < n; k++) >>>> + if (matrix[row][k] != matrix[column][k]) >>>> + { >>>> + matrix[column][k] = -matrix[column][k]; >>>> + matrix[row][k] = matrix[row][k] - matrix[column][k]; >>>> + } >>>> +} >>>> \ No newline at end of file >>>> >>>> Please make sure the last line of the test is a new-line. >>>> >>>> @@ -6971,6 +6972,8 @@ optimize_mask_stores (struct loop *loop) >>>> gsi_next (&gsi)) >>>> { >>>> stmt = gsi_stmt (gsi); >>>> + if (is_gimple_debug (stmt)) >>>> + continue; >>>> if (is_gimple_call (stmt) >>>> && gimple_call_internal_p (stmt) >>>> && gimple_call_internal_fn (stmt) == IFN_MASK_STORE) >>>> >>>> This is not needed, you do something only for is_gimple_call, >>>> which is never true if is_gimple_debug, so the code used to be fine as is. >>>> >>>> + /* Skip debug sstatements. */ >>>> >>>> s/ss/s/ >>>> >>>> + if (is_gimple_debug (gsi_stmt (gsi))) >>>> + continue; >>>> + stmt1 = gsi_stmt (gsi); >>>> + /* Do not consider writing to memory,volatile and call >>>> >>>> Missing space after , >>>> >>>> + /* Skip scalar statements. */ >>>> + if (!VECTOR_TYPE_P (TREE_TYPE (lhs))) >>>> + { >>>> + /* If scalar statement has vuse we need to modify it >>>> + when another masked store will be sunk. */ >>>> + if (gimple_vuse (stmt1)) >>>> + scalar_vuse.safe_push (stmt1); >>>> continue; >>>> + } >>>> >>>> I don't think it is safe to take for granted that the scalar stmts are all >>>> going to be DCEd, but I could be wrong. >>>> >>>> + /* Check that LHS does not have uses outside of STORE_BB. */ >>>> + res = true; >>>> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) >>>> + { >>>> + gimple *use_stmt; >>>> + use_stmt = USE_STMT (use_p); >>>> + if (is_gimple_debug (use_stmt)) >>>> + continue; >>>> >>>> Ignoring debug stmts to make decision whether you move or not is >>>> of course the right thing to do. But IMHO you should remember if >>>> you saw any is_gimple_debug stmts in some bool var. >>>> >>>> + if (gimple_bb (use_stmt) != store_bb) >>>> + { >>>> + res = false; >>>> + break; >>>> + } >>>> + } >>>> + if (!res) >>>> + break; >>>> >>>> - if (gimple_vuse (stmt1) >>>> - && gimple_vuse (stmt1) != gimple_vuse (last_store)) >>>> - break; >>>> + if (gimple_vuse (stmt1) >>>> + && gimple_vuse (stmt1) != gimple_vuse (last_store)) >>>> + break; >>>> >>>> + /* Can move STMT1 to STORE_BB. */ >>>> + if (dump_enabled_p ()) >>>> + { >>>> + dump_printf_loc (MSG_NOTE, vect_location, >>>> + "Move stmt to created bb\n"); >>>> + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt1, 0); >>>> + } >>>> >>>> And if yes, invalidate them here, because the move would otherwise >>>> generate invalid IL. >>>> >>>> + gsi_move_before (&gsi_from, &gsi_to); >>>> + /* Shift GSI_TO for further insertion. */ >>>> + gsi_prev (&gsi_to); >>>> + } >>>> + /* Put other masked stores with the same mask to STORE_BB. */ >>>> + if (worklist.is_empty () >>>> + || gimple_call_arg (worklist.last (), 2) != mask >>>> + || worklist.last () != stmt1) >>>> + break; >>>> + last = worklist.pop (); >>>> } >>>> add_phi_arg (phi, gimple_vuse (last_store), e, UNKNOWN_LOCATION); >>>> + /* Mask stores motion could crossing scalar statements with vuse >>>> + which should be corrected. */ >>>> >>>> s/crossing/cross/ >>>> That said, I'm not really sure if without some verification if such >>>> reads are really dead it is safe to skip them and update this way. >>>> Richard? >>>> >>>> + last_vuse = gimple_vuse (last_store); >>>> + while (!scalar_vuse.is_empty ()) >>>> + { >>>> + stmt = scalar_vuse.pop (); >>>> + if (gimple_vuse (stmt) != last_vuse) >>>> + { >>>> + gimple_set_vuse (stmt, last_vuse); >>>> + update_stmt (stmt); >>>> + } >>>> + } >>>> } >>>> } >>>> >>>> Jakub